Hypothesis Testing

The purpose of hypothesis testing is to check claims about a population distribution and to confirm whether a proposed change makes a real difference. When new techniques or practices are introduced in everyday activities such as commerce, health and education, we need to confirm that they actually bring an improvement.
Hypothesis testing is a statistical tool that uses sample data to evaluate a population parameter and so arrive at a decision to reject or not reject the claim under study. The inference about the claim is reached by a systematic investigation of the population parameters concerned.
This write-up provides an overview of the terms used in hypothesis testing, the steps involved and the different types of tests used.

What is Hypothesis Testing?

Statistical hypothesis testing is a decision-making process based on the systematic evaluation of a population parameter. The statistical hypotheses that are tested are conjectures about population parameters.

These conjectures may or may not be true. There are two types of hypotheses.
  1. The null hypothesis is a statement that there is no difference between a parameter and a specific value, or that there is no difference between two parameters. It is represented by the symbol $H_{o}$.
  2. The alternate hypothesis is a statement that there is a difference between a parameter and a specific value, or between two parameters. It is denoted by the symbol $H_{a}$.

Three primary approaches are employed in doing a hypothesis test.

  1. The traditional method of using the test statistic.
  2. The p-value method
  3. The confidence interval method.

Hypothesis Testing Examples   

Let us consider a t-test for a small sample when the population standard deviation is not known.

It was generally believed that the mean weight of a fish caught in a lake is 2.5 Kg, but Pat claimed that the mean weight is less than this value. A sample of five fish caught in the lake had a mean weight of 2.2 Kg with a standard deviation of 0.08 Kg. Test Pat's claim using a significance level α = 0.1.

Step 1: State the null and alternate hypotheses.
$H_{o}:\mu \geqslant 2.5$
$H_{a}: \mu < 2.5$ (Claim)

Step 2: Find the critical value.
This is a left-tailed test, as the alternate hypothesis has the < sign.
We need to conduct a t-test, as the sample size n = 5 is less than 30 and the population standard deviation is not known.
Using the t-table, the critical value at α = 0.1 with degrees of freedom = n - 1 = 4 is
$t_{\alpha =0.1, df =4}$ = -1.533

Step 3: Calculate the test statistic.
$t_{test}=\frac{\overline{x}-\mu }{\frac{s}{\sqrt{n}}}$ where $\overline{x}$ = 2.2, μ = 2.5, s = 0.08 and n = 5
$t_{test}=\frac{2.2-2.5}{\frac{0.08}{\sqrt{5}}}$ = -8.385

Step 4: Make a decision.

The calculated test statistic, -8.385, is less than the critical value, -1.533.
The test statistic therefore falls in the critical region.

Hence the null hypothesis is rejected.



Step 5: Conclusion.
At α = 0.1, there is sufficient evidence to support the claim that the average weight of fish caught in the lake is less than 2.5 Kg.
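
The arithmetic in the fish example can be checked with a short script. This is a minimal sketch that uses only the Python standard library; the variable names are chosen here for illustration, and the critical value is copied from the t-table as in Step 2 rather than computed.

```python
import math

# Values taken from the fish example above
x_bar = 2.2   # sample mean (Kg)
mu_0 = 2.5    # mean stated in the null hypothesis (Kg)
s = 0.08      # sample standard deviation (Kg)
n = 5         # sample size

# Critical value read from a t-table (alpha = 0.1, df = 4); not computed here
t_critical = -1.533

# Test statistic: (sample mean - hypothesized mean) / standard error
standard_error = s / math.sqrt(n)
t_test = (x_bar - mu_0) / standard_error

# Left-tailed test: reject the null hypothesis if t_test is below the critical value
reject_null = t_test < t_critical

print(f"t = {t_test:.3f}, reject null hypothesis: {reject_null}")
```

The script reproduces the test statistic of about -8.385 and the decision to reject the null hypothesis.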


Hypothesis Testing Steps   

The exact test and the approach to be used are chosen based on the sample data. Each of the three approaches generally follows 5 steps, as listed below:

Traditional Method
  1. State the hypotheses and identify the claim.
  2. Find the critical value.
  3. Compute the test statistic.
  4. Make the decision to reject or not to reject the null hypothesis.
  5. Summarize the results.

p-value Method
  1. State the hypotheses and identify the claim.
  2. Compute the test value.
  3. Find the p-value.
  4. Make the decision.
  5. Summarize the results.

Confidence Interval Method
  1. State the hypotheses and identify the claim.
  2. Compute the maximum error.
  3. Find the confidence interval.
  4. Make the decision.
  5. Summarize the results.

This procedure will be elaborated using examples during the course of discussion.

Null Hypothesis Testing

When a hypothesis test is conducted, we can expect four possible outcomes related to the true state of the null hypothesis.

  1. A true null hypothesis is rejected. Here we commit a Type I error.
  2. A true null hypothesis is not rejected. No error occurs; the decision is correct.
  3. A false null hypothesis is not rejected. Here we commit a Type II error.
  4. A false null hypothesis is rejected. No error occurs; the decision is correct.



How confident are we of making the correct decision when we reject or do not reject a null hypothesis? The level of significance, denoted by α, is used to answer this question. The level of significance is the probability of committing a Type I error.

If α is equal to 0.05, then there is a 5% chance of rejecting a true null hypothesis.
The probability of a Type II error is denoted by β. It cannot be computed easily in many situations, whereas the level of significance α is chosen by the researcher.
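
The meaning of α as the probability of a Type I error can be illustrated by simulation. The sketch below is only an illustration under assumed settings: the population (mean 0, standard deviation 1), the sample size, the number of trials and the function name are all hypothetical choices, not part of the discussion above. It repeatedly samples from a population for which the null hypothesis is true and counts how often a two-tailed z-test rejects it.

```python
import math
import random
from statistics import NormalDist

def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu_0, sigma = 0.0, 1.0  # hypothetical population; the null hypothesis is true
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu_0) / (sigma / math.sqrt(n))
        if abs(z) > z_critical:
            rejections += 1  # a true null hypothesis was rejected: Type I error
    return rejections / trials

rate = type_one_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With α = 0.05 the estimated rate should come out close to 0.05, though the exact figure varies with the seed and the number of trials.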

Multiple Hypothesis Testing   

A hypothetical population is claimed to have a mean of 27.5. To test this claim, a sample of 100 elements is collected and found to have a mean of 26. The standard deviation of the population is known to be 6.5. Draw an inference on the claim using an appropriate hypothesis test at the significance level α = 0.05.

Step 1: State the Hypotheses.

$H_{o}: \mu =27.5$ (Claim)
$H_{a}: \mu \neq 27.5$

Step 2: Find the critical value.
Since the alternate hypothesis has the ≠ sign, this is a two-tailed test.
Since the population standard deviation is known and the sample is large (n = 100), a z-test is appropriate.
The critical values for the level of significance α = 0.05 are $z=\pm 1.96$

Step 3: Calculate the test statistic.

$z_{t}=\frac{\overline{x}-\mu }{\frac{\sigma }{\sqrt{n}}}$ where $\overline{x}$ = 26, μ = 27.5, σ = 6.5 and n = 100.

$z_{t}=\frac{26-27.5}{\frac{6.5}{\sqrt{100}}}$ = -2.31

Step 4: Make a decision.
A sketch shows the critical region and the position of the test statistic.

The calculated test statistic, -2.31, is less than the critical value, -1.96.
The test statistic falls in the critical region, as seen in the diagram.

Hence the null hypothesis is rejected.




Step 5: Conclusion

At α = 0.05, there is sufficient evidence to reject the claim that the population mean = 27.5.
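
The same steps can be sketched in Python. This is a minimal illustration using `statistics.NormalDist` from the standard library to obtain the critical value instead of a table; the variable names are assumptions made for this sketch.

```python
import math
from statistics import NormalDist

# Values taken from the example above
x_bar = 26      # sample mean
mu_0 = 27.5     # claimed population mean
sigma = 6.5     # known population standard deviation
n = 100         # sample size
alpha = 0.05    # level of significance

# Two-tailed critical value: alpha/2 of the area lies in each tail (about 1.96)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Test statistic (about -2.31)
z_test = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Two-tailed test: reject if the statistic lies beyond either critical value
reject_null = abs(z_test) > z_critical

print(f"z = {z_test:.2f}, critical values = ±{z_critical:.2f}, reject: {reject_null}")
```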

One Sample Hypothesis Testing

The z-test is used to test a claim about the population mean, using the mean of a sample from the population,

  • when the population standard deviation σ is known,
  • or when the sample size is large, that is, n ≥ 30.

The t-test is used when the sample size is small and the population standard deviation is not known.

The following flow chart shows when to use the z-test or the t-test, and the formulas used for computing the test statistic.

The formula used for testing population proportions with a large sample is $z=\frac{\widehat{p}-p}{\sqrt{\frac{pq}{n}}}$ where

$\widehat{p}=\frac{x}{n}$ (sample proportion),
p = population proportion, q = 1 - p and n = sample size.

Hypothesis Testing with Two Samples   

Inferences about two population means, variances or proportions can be made using a sample from each population. The testing is done for two types of situations.
  1. Independent inference: The samples are not related.
  2. Dependent inference: The samples are related. The sample from the second population is determined by the sample from the first, for example as matched pairs.

The formulas used to compute the test statistic for independent inferences are shown below:

Formulas for Hypothesis testing (Independent inferences)

  1. Comparing the means of two large samples (both $n_{1}\geq 30$ and $n_{2}\geq 30$), population variances known:
$Z_{t}=\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu _{1}-\mu _{2})}{\sqrt{\frac{\sigma _{1}^{2}}{n_{1}}+\frac{\sigma _{2}^{2}}{n_{2}}}}$

  2. Comparing the means of two large samples (both $n_{1}\geq 30$ and $n_{2}\geq 30$), population variances not known:
$Z_{t}=\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu _{1}-\mu _{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}$

  3. Comparing the means of small samples (either or both of $n_{1}$ and $n_{2}$ < 30), population variances assumed to be unequal:
$t_{0}=\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu _{1}-\mu _{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}$

  4. Comparing the means of small samples (either or both of $n_{1}$ and $n_{2}$ < 30), population variances assumed to be equal:
$t_{0}=\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu _{1}-\mu _{2})}{\sqrt{\frac{(n_{1}-1)s_{1}^{2}+(n_{2}-1)s_{2}^{2}}{n_{1}+n_{2}-2}}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}}$

The t-test formula for testing two small dependent samples is given by
$t_{0}= \frac{\overline{d}}{\frac{s_{d}}{\sqrt{n}}}$ where $\overline{d}$ and $s_{d}$ are the mean and the standard deviation of the paired differences, and n is the number of pairs.
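
The small-sample formulas above translate directly into code. The following is a sketch only: the function names, the parameter `diff_0` (the hypothesized value of $\mu _{1}-\mu _{2}$, usually 0) and the sample figures in the usage lines are all hypothetical and introduced here for illustration.

```python
import math

def unequal_variance_t(mean1, mean2, s1, s2, n1, n2, diff_0=0.0):
    """t statistic for two independent samples, variances assumed unequal."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((mean1 - mean2) - diff_0) / standard_error

def pooled_t(mean1, mean2, s1, s2, n1, n2, diff_0=0.0):
    """t statistic for two independent samples, variances assumed equal."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance) * math.sqrt(1 / n1 + 1 / n2)
    return ((mean1 - mean2) - diff_0) / standard_error

def paired_t(differences):
    """t statistic for two dependent samples, given the paired differences."""
    n = len(differences)
    d_bar = sum(differences) / n
    # Sample standard deviation of the differences (divisor n - 1)
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in differences) / (n - 1))
    return d_bar / (s_d / math.sqrt(n))

# Hypothetical summary statistics, for illustration only
print(unequal_variance_t(10, 8, 2, 2, 16, 16))
print(pooled_t(10, 8, 2, 2, 16, 16))
print(paired_t([1, 2, 3]))
```

When both samples have the same size and the same standard deviation, as in the hypothetical figures above, the two independent-sample statistics coincide; they differ once the sample sizes or variances differ.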

5 Step Hypothesis Testing

The third approach used in hypothesis testing makes use of the confidence interval for the true mean. The confidence interval is found from the sample data, and if it contains the hypothesized mean, the null hypothesis is not rejected.

Let us test the large-sample example given earlier using the confidence interval method.

A hypothetical population is claimed to have a mean of 27.5. To test this claim, a sample of 100 elements is collected and found to have a mean of 26. The standard deviation of the population is known to be 6.5. Draw an inference on the claim using an appropriate hypothesis test at the significance level α = 0.05.

Step 1: State the Hypotheses.

$H_{o}: \mu =27.5$ (Claim)
$H_{a}: \mu \neq 27.5$

Step 2: Compute the maximum error using the formula,
$E=Z_{\frac{\alpha }{2}}(\frac{\sigma }{\sqrt{n}})$ where $Z_{\frac{\alpha }{2}}$ = 1.96 is the critical value at α = 0.05
$E=1.96(\frac{6.5}{\sqrt{100}})$ = 1.274

Step 3: Find the confidence interval.
The confidence interval for the true mean is $\overline{x}-E < \mu < \overline{x}+E$
26 - 1.274 < μ < 26 + 1.274
24.726 < μ < 27.274

Step 4: Make a decision.
As the confidence interval does not include the hypothesized population mean 27.5, the null hypothesis is rejected.

Step 5: Conclusion.
At α = 0.05, there is sufficient evidence to reject the claim that the population mean = 27.5
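
The confidence interval method for this example can be sketched as follows. The variable names are illustrative assumptions, and the critical value is obtained from `statistics.NormalDist` in the Python standard library rather than from a table.

```python
import math
from statistics import NormalDist

# Values taken from the example above
x_bar = 26      # sample mean
mu_0 = 27.5     # claimed population mean
sigma = 6.5     # known population standard deviation
n = 100         # sample size
alpha = 0.05    # level of significance

# Critical value with alpha/2 in each tail (about 1.96)
z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)

# Maximum error E and the resulting confidence interval
max_error = z_half_alpha * sigma / math.sqrt(n)
lower, upper = x_bar - max_error, x_bar + max_error

# Reject the null hypothesis if the claimed mean lies outside the interval
reject_null = not (lower < mu_0 < upper)

print(f"E = {max_error:.3f}, interval = ({lower:.3f}, {upper:.3f}), reject: {reject_null}")
```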

Hypothesis Testing P Value   

Definition of p-value
The p-value is the probability of getting a sample statistic at least as extreme as the one observed, in the direction of the alternate hypothesis, when the null hypothesis is true.

Based on this definition, in the p-value method the null hypothesis is rejected when the computed p-value ≤ α.
Steps to find the p-value
  1. Find the test value.
  2. Use the appropriate table to find the area corresponding to the test value (here, the area between the mean and the test value).
  3. Subtract the area found from 0.5.
  4. If the test is one-sided, the p-value = the value found in step 3. For a two-sided test, the p-value = 2 x the value found in step 3.

Example:
The average production of peanuts is reported to be 3050 pounds per acre. After a new brand of fertilizer was used on 100 individual farm lands, the mean yield was found to be 3180 pounds per acre with a standard deviation of 564 pounds. At α = 0.05, can it be concluded that the use of the new fertilizer results in an increase in the production of peanuts?


Step 1: State the Hypotheses
$H_{o}: \mu \leqslant 3050$
$H_{a}:\mu > 3050$ (Claim)

Step 2: Compute the test value

This is a large sample and a right-tailed test, as the alternate hypothesis contains the symbol >. The z test value is calculated using the formula
$Z_{t}= \frac{\overline{x}-\mu }{\frac{S}{\sqrt{n}}}$ where $\overline{x}$ = 3180, μ = 3050, S = 564 and n = 100
$Z_{t}= \frac{3180-3050}{\frac{564}{\sqrt{100}}}$ = 2.30

Step 3: Find the p-value
Using the z-score table, the area corresponding to the test value 2.30 is 0.4893.
p-value for the right-tailed test = 0.5 - 0.4893 = 0.0107

Step 4: Make a decision
The computed p-value = 0.0107 < 0.05 = α. Hence the decision is to reject the null hypothesis.

Step 5: Conclusion
At α = 0.05, there is sufficient evidence to support the claim that the use of the new fertilizer results in an increase in the production of peanuts.
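
The p-value can also be computed directly from the normal distribution instead of a table. The sketch below assumes a helper named `p_value_from_z`, which is hypothetical and introduced here only for illustration; it uses `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value of a z test value; tail is 'left', 'right' or 'two'."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

# Values taken from the peanut example above
x_bar, mu_0, s, n, alpha = 3180, 3050, 564, 100, 0.05

z_test = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = p_value_from_z(z_test, "right")
reject_null = p_value <= alpha

print(f"z = {z_test:.3f}, p-value = {p_value:.4f}, reject: {reject_null}")
```

The computed p-value differs slightly in the fourth decimal place from the table-based 0.0107, because the table lookup rounds the test value to 2.30 first. The decision is the same.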


Hypothesis Testing for Proportions

Just as we test claims about population parameters like the mean and the standard deviation, hypothesis testing is also used to test claims about the proportion of a population having a certain characteristic. The z-test can be used to test proportions for large samples when np > 5 and nq > 5, where p is the population proportion and q = 1 - p.
The formula used for determining the test statistic is

$z=\frac{\widehat{p}-p}{\sqrt{\frac{pq}{n}}}$ where $\widehat{p}= \frac{x}{n}$ the proportion obtained from the sample.

Example:

Jerome claimed that 60% of students in US universities are male. You felt that this proportion is somewhat high. You found that out of 1732 students, 998 were male. Test Jerome's claim at a level of significance α = 0.05.

The claimed proportion of male students is p = 0.6. Hence q = 1 - p = 0.4. Both np = 1039.2 and nq = 692.8 are greater than 5. Hence the z-test for proportions can be done.

Step 1: State the Hypotheses.
$H_{o}: p\geq 0.6$ (Claim)
$H_{a}: p < 0.6$
The null hypothesis is tested at α = 0.05 with a left-tailed test.

Step 2: Calculate the test value
$Z_{t}=\frac{\widehat{p}-p}{\sqrt{\frac{pq}{n}}}$
$Z_{t}=\frac{\frac{998}{1732}-0.6}{\sqrt{\frac{(0.6)(0.4)}{1732}}} $ = -2.02

Step 3: Find the p-value
Using the z-score table, the area corresponding to the test value 2.02 is 0.4783.
For the left-tailed test, the p-value = 0.5 - 0.4783 = 0.0217

Step 4: Make a decision
The p value 0.0217 < 0.05. Hence the null hypothesis is rejected.

Step 5: Conclusion
At 0.05 significance level, there is sufficient evidence to reject the claim that 60% of university students are male.
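
The proportion test above can be reproduced with a few lines of Python. This is a minimal sketch with illustrative variable names; the left-tail area is taken from `statistics.NormalDist` rather than from a table.

```python
import math
from statistics import NormalDist

# Values taken from the example above
x = 998        # number of male students in the sample
n = 1732       # sample size
p_0 = 0.6      # claimed population proportion
alpha = 0.05   # level of significance

q_0 = 1 - p_0

# Condition for using the z-test for proportions
large_enough = n * p_0 > 5 and n * q_0 > 5

# Sample proportion and test value (about -2.02)
p_hat = x / n
z_test = (p_hat - p_0) / math.sqrt(p_0 * q_0 / n)

# Left-tailed test: the p-value is the area to the left of the test value
p_value = NormalDist().cdf(z_test)
reject_null = p_value <= alpha

print(f"z = {z_test:.2f}, p-value = {p_value:.4f}, reject: {reject_null}")
```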

Bayesian Hypothesis Testing

The Bayesian approach to hypothesis testing makes use of Bayes' theorem and is useful in situations where more than two hypotheses may be under consideration (multiple hypothesis testing). Hypotheses are accepted or rejected based on the posterior probabilities calculated.

Suppose there are only two hypotheses $H_{0}$ and $H_{1}$ to be tested. The posterior probability of each hypothesis given a sample X is found as

$P(H_{0}|X)= \frac{P(X|H_{0})P(H_{0})}{P(X)}$ and $P(H_{1}|X)= \frac{P(X|H_{1})P(H_{1})}{P(X)}$

In the above case $P(H_{1}|X)= 1 - P(H_{0}|X)$ and $P(X)= P(X|H_{0})P(H_{0})+P(X|H_{1})P(H_{1})$

In general if there are n hypotheses

$P(H_{i}|X)= \frac{P(X|H_{i})P(H_{i})}{\sum_{j=1}^{n}P(X|H_{j})P(H_{j})}$

Each of $P(H_{i}|X)$ is known as a posterior probability. The hypothesis with the largest posterior probability is accepted.
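
The posterior computation can be sketched in a few lines. The prior probabilities and likelihoods below are made-up numbers for three hypothetical hypotheses, used only to show the arithmetic; the function name `posteriors` is likewise an assumption of this sketch.

```python
def posteriors(priors, likelihoods):
    """Apply Bayes' theorem: posterior_i = P(X|H_i) P(H_i) / sum_j P(X|H_j) P(H_j)."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(X), the denominator of Bayes' theorem
    return [value / evidence for value in joint]

# Hypothetical inputs for three hypotheses (illustration only)
priors = [0.5, 0.3, 0.2]          # P(H_i); must sum to 1
likelihoods = [0.10, 0.40, 0.25]  # P(X | H_i)

post = posteriors(priors, likelihoods)

# Index of the hypothesis with the largest posterior probability
best = max(range(len(post)), key=post.__getitem__)

print([round(p, 4) for p in post], "accepted hypothesis index:", best)
```

In this made-up case the second hypothesis has the largest posterior probability even though it does not have the largest prior, because the sample is most likely under it.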

Critical Value

In the traditional approach to hypothesis testing, the critical value and the critical region play the decisive role. The critical value is determined by the level of significance, and it separates the critical region from the non-critical region.

The critical region is the range of test values that indicates a significant difference, so that the null hypothesis is to be rejected. The critical region is found on one tail or on both tails of the graph of the distribution, depending on whether the test is one-tailed or two-tailed.

In a one-tailed test, the critical region falls on one side of the mean. The null hypothesis is rejected when the calculated test value falls in this critical region. A one-tailed test is called either a right-tailed test or a left-tailed test, depending on the direction of the inequality in the alternate hypothesis.

In a two-tailed test, the critical region falls on both sides of the mean, and the null hypothesis is rejected when the test value falls in either of these regions. If the alternate hypothesis contains a ≠ sign, then we have a two-tailed test to conduct.

The one-tailed and two-tailed critical regions are shown diagrammatically for a significance level α = 0.01. This significance level means that 1% of the total area is to be marked as the critical region. For a right-tailed test, the critical region is shaded on the right tail, while it is shaded on the left tail for a left-tailed test. For a two-tailed test, this area is distributed over both tails, each region measuring 0.5% of the total area. In the example, the critical values which cut off the region or regions are shown for a z-test. The z-values found in the table corresponding to the area are taken as the critical values.

Critical regions for level of significance α = 0.01

For a one-tailed test, the proportionate area of the critical region = the level of significance = 0.01. Hence the area of the non-critical region on the same side of the mean = 0.5 - 0.01 = 0.49. The z-score corresponding to this area, as seen in the table, is 2.33. Hence the critical value for a right-tailed test = 2.33 and for a left-tailed test = -2.33.

For a two-tailed test, the critical region is distributed evenly on either side of the mean. Hence the area of the critical region on one tail = 0.01/2 = 0.005. This means the area of the non-critical region on the same side = 0.5 - 0.005 = 0.495. The z-score corresponding to this area is 2.575. Hence the two critical values for the two-tailed test are -2.575 and +2.575.

For the right-tailed test, the critical value = 2.33 and the critical region is shaded to its right. If the calculated test value is greater than 2.33, the null hypothesis is rejected.

For the left-tailed test, the critical value = -2.33 and the critical region is shaded to its left. If the calculated test value is less than -2.33, the null hypothesis is rejected.

For the two-tailed test, the two critical values are -2.575 and +2.575. If the test statistic is less than -2.575 or greater than +2.575, the null hypothesis is rejected.
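
As a cross-check on the table lookups, the critical values for a z-test can be computed from the inverse of the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.01
std_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tailed test: all of alpha lies in the right tail (about 2.326)
right_tailed = std_normal.inv_cdf(1 - alpha)

# Left-tailed test: all of alpha lies in the left tail (about -2.326)
left_tailed = std_normal.inv_cdf(alpha)

# Two-tailed test: alpha/2 lies in each tail (about 2.576, used with both signs)
two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"right: {right_tailed:.3f}, left: {left_tailed:.3f}, two-tailed: ±{two_tailed:.3f}")
```

The computed values agree with the table values 2.33 and 2.575 to within the rounding of the table.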

The different types of tests will be discussed with examples during the course of this discussion.