9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H 0 , the null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

H a , the alternative hypothesis: a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options: reject H 0 if the sample information favors the alternative hypothesis, or do not reject H 0 (also phrased as decline to reject H 0 ) if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
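The symbol rule above can be sketched as a small lookup. This is an illustration only: the dictionary `NULL_FOR_ALTERNATIVE` and the helper `state_hypotheses` are hypothetical names, and the sketch uses the strict convention in which the null symbol is the exact complement of the alternative symbol.

```python
# Each alternative-hypothesis symbol paired with the null symbol that
# complements it. Every null symbol contains an equality.
NULL_FOR_ALTERNATIVE = {
    "≠": "=",
    ">": "≤",
    "<": "≥",
}


def state_hypotheses(parameter, value, alternative_symbol):
    """Return (H0, Ha) as strings for a test about a single parameter."""
    null_symbol = NULL_FOR_ALTERNATIVE[alternative_symbol]
    h0 = f"H0: {parameter} {null_symbol} {value}"
    ha = f"Ha: {parameter} {alternative_symbol} {value}"
    return h0, ha


# "Is the mean different from 5?" leads to a two-sided pair.
h0, ha = state_hypotheses("μ", 5, "≠")
print(h0)
print(ha)
```

Researchers who always write = in the null hypothesis would replace the looked-up symbol with "=" and leave the alternative unchanged.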

Example 9.1

H 0 : No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

H a : More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following:

  • H 0 : μ = 2.0
  • H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 66
  • H a : μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on the average. The null and alternative hypotheses are the following:

  • H 0 : μ ≥ 5
  • H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : μ __ 45
  • H a : μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses.

  • H 0 : p ≤ 0.066
  • H a : p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H 0 : p __ 0.40
  • H a : p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.


This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

Statology

Statistics Made Easy

What is an Alternative Hypothesis in Statistics?

Often in statistics we want to test whether or not some assumption is true about a population parameter.

For example, we might assume that the mean weight of a certain population of turtle is 300 pounds.

To determine if this assumption is true, we’ll go out and collect a sample of turtles and weigh each of them. Using this sample data, we’ll conduct a hypothesis test.

The first step in a hypothesis test is to define the null and alternative hypotheses.

These two hypotheses need to be mutually exclusive, so if one is true then the other must be false.

These two hypotheses are defined as follows:

Null hypothesis (H 0 ): The sample data is consistent with the prevailing belief about the population parameter.

Alternative hypothesis (H A ): The sample data suggests that the assumption made in the null hypothesis is not true. In other words, there is some non-random cause influencing the data.

Types of Alternative Hypotheses

There are two types of alternative hypotheses:

A one-tailed hypothesis involves making a “greater than” or “less than” statement. For example, suppose we assume the mean height of a male in the U.S. is greater than or equal to 70 inches.

The null and alternative hypotheses in this case would be:

  • Null hypothesis: µ ≥ 70 inches
  • Alternative hypothesis: µ < 70 inches

A  two-tailed hypothesis involves making an “equal to” or “not equal to” statement. For example, suppose we assume the mean height of a male in the U.S. is equal to 70 inches.

  • Null hypothesis: µ = 70 inches
  • Alternative hypothesis: µ ≠ 70 inches

Note: The “equal” sign is always included in the null hypothesis, whether it is =, ≥, or ≤.

Examples of Alternative Hypotheses

The following examples illustrate how to define the null and alternative hypotheses for different research problems.

Example 1: A biologist wants to test if the mean weight of a certain population of turtle is different from the widely-accepted mean weight of 300 pounds.

The null and alternative hypothesis for this research study would be:

  • Null hypothesis: µ = 300 pounds
  • Alternative hypothesis: µ ≠ 300 pounds

If we reject the null hypothesis, this means we have sufficient evidence from the sample data to say that the true mean weight of this population of turtles is different from 300 pounds.

Example 2: An engineer wants to test whether a new battery can produce higher mean watts than the current industry standard of 50 watts.

  • Null hypothesis: µ ≤ 50 watts
  • Alternative hypothesis: µ > 50 watts

If we reject the null hypothesis, this means we have sufficient evidence from the sample data to say that the true mean watts produced by the new battery is greater than the current industry standard of 50 watts.

Example 3: A botanist wants to know if a new gardening method produces less waste than the standard gardening method that produces 20 pounds of waste.

  • Null hypothesis: µ ≥ 20 pounds
  • Alternative hypothesis: µ < 20 pounds

If we reject the null hypothesis, this means we have sufficient evidence from the sample data to say that the true mean weight produced by this new gardening method is less than 20 pounds.

When to Reject the Null Hypothesis

Whenever we conduct a hypothesis test, we use sample data to calculate a test-statistic and a corresponding p-value.

If the p-value is less than some significance level (common choices are 0.10, 0.05, and 0.01), then we reject the null hypothesis.

This means we have sufficient evidence from the sample data to say that the assumption made by the null hypothesis is not true.

If the p-value is not less than some significance level, then we fail to reject the null hypothesis.

This means our sample data did not provide us with evidence that the assumption made by the null hypothesis was not true.
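The decision rule described above can be written as a short function. This is a minimal sketch: the name `decide` and the default significance level of 0.05 are illustrative assumptions, and the comparison uses the strict "less than" wording from the text.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when the p-value is below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# The same p-value can lead to different decisions at different levels.
for level in (0.10, 0.05, 0.01):
    print(level, decide(0.03, alpha=level))
```

The significance level is chosen before looking at the data; only the p-value comes from the sample.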

Additional Resource:   An Explanation of P-Values and Statistical Significance


Published by Zach


Null and Alternative Hypotheses | Definitions & Examples

Published on 5 October 2022 by Shaun Turney. Revised on 6 December 2022.

The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test:

  • Null hypothesis (H 0 ): There’s no effect in the population.
  • Alternative hypothesis (H A ): There’s an effect in the population.

The effect is usually the effect of the independent variable on the dependent variable.

Table of contents

  • Answering your research question with hypotheses
  • What is a null hypothesis?
  • What is an alternative hypothesis?
  • Differences between null and alternative hypotheses
  • How to write null and alternative hypotheses
  • Frequently asked questions about null and alternative hypotheses

The null and alternative hypotheses offer competing answers to your research question. When the research question asks “Does the independent variable affect the dependent variable?”, the null hypothesis (H 0 ) answers “No, there’s no effect in the population.” On the other hand, the alternative hypothesis (H A ) answers “Yes, there is an effect in the population.”

The null and alternative are always claims about the population. That’s because the goal of hypothesis testing is to make inferences about a population based on a sample. Often, we infer whether there’s an effect in the population by looking at differences between groups or relationships between variables in the sample.

You can use a statistical test to decide whether the evidence favors the null or alternative hypothesis. Each type of statistical test comes with a specific way of phrasing the null and alternative hypothesis. However, the hypotheses can also be phrased in a general way that applies to any test.

The null hypothesis is the claim that there’s no effect in the population.

If the sample provides enough evidence against the claim that there’s no effect in the population (p ≤ α), then we can reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.

Although “fail to reject” may sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove” or “accept” the null hypothesis.

Null hypotheses often include phrases such as “no effect”, “no difference”, or “no relationship”. When written in mathematical terms, they always include an equality (usually =, but sometimes ≥ or ≤).

Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more than one way to answer a research question, but these null hypotheses can help you get started.

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be fine to say that daily meditation has no effect on the incidence of depression and p 1 = p 2 .

The alternative hypothesis (H A ) is the other answer to your research question. It claims that there’s an effect in the population.

Often, your alternative hypothesis is the same as your research hypothesis. In other words, it’s the claim that you expect or hope will be true.

The alternative hypothesis is the complement to the null hypothesis. Null and alternative hypotheses are exhaustive, meaning that together they cover every possible outcome. They are also mutually exclusive, meaning that only one can be true at a time.
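The two properties above can be checked mechanically for any pair of symbols. The following is a minimal sketch rather than part of the original article: the helper `is_valid_pair` is a hypothetical name, and the three candidate values are chosen to sit below, at, and above the reference value, which is enough to exercise every comparison symbol.

```python
import operator

# Map each hypothesis symbol to the comparison it stands for.
COMPARE = {
    "=": operator.eq,
    "≠": operator.ne,
    "≤": operator.le,
    "≥": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


def is_valid_pair(null_symbol, alt_symbol, reference, candidates):
    """True if exactly one hypothesis holds for every candidate value."""
    for value in candidates:
        null_true = COMPARE[null_symbol](value, reference)
        alt_true = COMPARE[alt_symbol](value, reference)
        # Both true means the pair overlaps; both false means a gap.
        if null_true == alt_true:
            return False
    return True


candidates = [49, 50, 51]  # below, at, and above the reference value 50
print(is_valid_pair("≤", ">", 50, candidates))
print(is_valid_pair("=", ">", 50, candidates))
```

The second call reports a gap: with = in the null and > in the alternative, a value below the reference satisfies neither statement. Some authors still write the pair that way, treating the equality as the boundary case of the null.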

Alternative hypotheses often include phrases such as “an effect”, “a difference”, or “a relationship”. When alternative hypotheses are written in mathematical terms, they always include an inequality (usually ≠, but sometimes > or <). As with null hypotheses, there are many acceptable ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get started with formulating your own.

Null and alternative hypotheses are similar in some ways:

  • They’re both answers to the research question
  • They both make claims about the population
  • They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the following table.

To help you write your hypotheses, you can use the template sentences below. If you know which statistical test you’re going to use, you can use the test-specific template sentences. Otherwise, you can use the general template sentences.

The only things you need to know to use these general template sentences are your dependent and independent variables. To write your research question, null hypothesis, and alternative hypothesis, fill in the following sentences with your variables:

Does [independent variable] affect [dependent variable]?

  • Null hypothesis (H 0 ): [Independent variable] does not affect [dependent variable].
  • Alternative hypothesis (H A ): [Independent variable] affects [dependent variable].
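Filling in the general templates is a simple string substitution. The sketch below is only an illustration; the function name `general_hypotheses` and the example variables are assumptions, not part of the article.

```python
def general_hypotheses(independent, dependent):
    """Fill the general template sentences with the two variables."""
    iv = independent.strip()
    dv = dependent.strip()
    # Capitalize only the first letter so the rest of the phrase is untouched.
    iv_sentence_start = iv[:1].upper() + iv[1:]
    return {
        "question": f"Does {iv} affect {dv}?",
        "null": f"H0: {iv_sentence_start} does not affect {dv}.",
        "alternative": f"HA: {iv_sentence_start} affects {dv}.",
    }


sentences = general_hypotheses("daily meditation", "the incidence of depression")
for label, text in sentences.items():
    print(f"{label}: {text}")
```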

Test-specific

Once you know the statistical test you’ll be using, you can write your hypotheses in a more precise and mathematical way specific to the test you chose. The table below provides template sentences for common statistical tests.

Note: The template sentences above assume that you’re performing one-tailed tests . One-tailed tests are appropriate for most studies.

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘ x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.


Turney, S. (2022, December 06). Null and Alternative Hypotheses | Definitions & Examples. Scribbr. Retrieved 20 March 2024, from https://www.scribbr.co.uk/stats/null-and-alternative-hypothesis/


Module 9: Hypothesis Testing With One Sample

Null and Alternative Hypotheses

Learning Outcomes

  • Describe hypothesis testing in general and in practice

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H 0 : The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

H a : The alternative hypothesis: It is a claim about the population that is contradictory to H 0 and what we conclude when we reject H 0 .

Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options for a decision. They are “reject H 0 ” if the sample information favors the alternative hypothesis or “do not reject H 0 ” or “decline to reject H 0 ” if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H 0 and H a :

H 0 always has a symbol with an equal in it. H a never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.

H 0 : No more than 30% of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

H a : More than 30% of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.

H 0 : The drug reduces cholesterol by 25%. p = 0.25

H a : The drug does not reduce cholesterol by 25%. p ≠ 0.25

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are:

H 0 : μ = 2.0

H a : μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 66 H a : μ __ 66

  • H 0 : μ = 66
  • H a : μ ≠ 66

We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are:

H 0 : μ ≥ 5

H a : μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : μ __ 45 H a : μ __ 45

  • H 0 : μ ≥ 45
  • H a : μ < 45

In an issue of U.S. News and World Report, an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses.

H 0 : p ≤ 0.066

H a : p > 0.066

On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses. H 0 : p __ 0.40 H a : p __ 0.40

  • H 0 : p = 0.40
  • H a : p > 0.40

Concept Review

In a hypothesis test, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. In a hypothesis test, we:

  • Evaluate the null hypothesis, typically denoted with H 0 . The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality (=, ≤, or ≥).
  • Always write the alternative hypothesis, typically denoted with H a or H 1 , using less than, greater than, or not equals symbols, i.e., (≠, >, or <).
  • If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.
  • Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.

Formula Review

H 0 and H a are contradictory.

  • OpenStax, Statistics, Null and Alternative Hypotheses. Provided by: OpenStax. Located at: http://cnx.org/contents/[email protected]:58/Introductory_Statistics . License: CC BY: Attribution
  • Introductory Statistics. Authored by: Barbara Illowsky, Susan Dean. Provided by: OpenStax. Located at: http://cnx.org/contents/[email protected] . License: CC BY: Attribution. License Terms: Download for free at http://cnx.org/contents/[email protected]
  • Simple hypothesis testing | Probability and Statistics | Khan Academy. Authored by: Khan Academy. Located at: https://youtu.be/5D1gV37bKXY . License: All Rights Reserved. License Terms: Standard YouTube License

Alternative hypothesis

by Marco Taboga, PhD

In a statistical test, observed data is used to decide whether or not to reject a restriction on the data-generating probability distribution.

The assumption that the restriction is true is called the null hypothesis, while the statement that the restriction is not true is called the alternative hypothesis.

A correct specification of the alternative hypothesis is essential to decide between one-tailed and two-tailed tests.

Table of contents

  • Mathematical setting
  • Choice between one-tailed and two-tailed tests
  • The critical region
  • The interpretation of the rejection
  • The interpretation must be coherent with the alternative hypothesis
  • Power function
  • Accepting the alternative
  • More details
  • Keep reading the glossary

In order to fully understand the concept of alternative hypothesis, we need to remember the essential elements of a statistical inference problem:

  • we observe a sample drawn from an unknown probability distribution;
  • in principle, any valid probability distribution could have generated the sample;
  • however, we usually place some a priori restrictions on the set of possible data-generating distributions.

A couple of simple examples follow.

When we conduct a statistical test, we formulate a null hypothesis as a restriction on the statistical model.

The alternative hypothesis is the statement that this restriction does not hold.

The alternative hypothesis is used to decide whether a test should be one-tailed or two-tailed.

The null hypothesis is rejected if the test statistic falls within a critical region that has been chosen by the statistician.

The critical region is a set of values that may comprise:

only the left tail of the distribution or only the right tail (one-tailed test);

both the left and the right tail (two-tailed test).

The choice of the critical region depends on the alternative hypothesis. Let us see why.

The interpretation is different depending on the tail of the distribution in which the test statistic falls.


The choice between a one-tailed or a two-tailed test needs to be done in such a way that the interpretation of a rejection is always coherent with the alternative hypothesis.
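To make the link between the alternative hypothesis and the critical region concrete, here is a minimal sketch. It assumes, purely for illustration, that the test statistic follows a standard normal distribution when the null hypothesis holds; the function names `critical_region` and `in_region` are hypothetical.

```python
from statistics import NormalDist


def critical_region(alpha, tail):
    """Return the critical region as a list of open intervals (low, high).

    tail is "left", "right", or "two", matching the direction of the
    alternative hypothesis. Assumes a standard normal test statistic.
    """
    z = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return [(float("-inf"), z.inv_cdf(alpha))]
    if tail == "right":
        return [(z.inv_cdf(1 - alpha), float("inf"))]
    if tail == "two":
        cut = z.inv_cdf(1 - alpha / 2)  # split alpha across both tails
        return [(float("-inf"), -cut), (cut, float("inf"))]
    raise ValueError("tail must be 'left', 'right', or 'two'")


def in_region(statistic, region):
    """True if the test statistic falls inside the critical region."""
    return any(low < statistic < high for low, high in region)


# The same statistic can be rejected by one test and not by another.
statistic = 1.8
for tail in ("right", "two", "left"):
    print(tail, in_region(statistic, critical_region(0.05, tail)))
```

With a statistic of 1.8 and a 5 percent level, only the right-tailed region contains the statistic, which is why the tail has to be fixed by the alternative hypothesis before the data are examined.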

When we deal with the power function of a test, the term "alternative hypothesis" has a special meaning.


We conclude with a caveat about the interpretation of the outcome of a test of hypothesis.

The interpretation of a rejection of the null is controversial.

According to some statisticians, rejecting the null is equivalent to accepting the alternative.

However, others deem that rejecting the null does not necessarily imply accepting the alternative. In fact, it is possible to think of situations in which both hypotheses can be rejected. Let us see why.

According to the conceptual framework illustrated by the images above, there are three possibilities:

1. the null is true;

2. the alternative is true;

3. neither the null nor the alternative is true because the true data-generating distribution has been excluded from the statistical model (we say that the model is mis-specified).

If we are in case 3, accepting the alternative after a rejection of the null is an incorrect decision. Moreover, a second test in which the alternative becomes the new null may lead us to another rejection.

There are three cases, including one case in which it is incorrect to accept the alternative hypothesis after a rejection of the null.

You can find more details about the alternative hypothesis in the lecture on Hypothesis testing.


How to cite

Please cite as:

Taboga, Marco (2021). "Alternative hypothesis", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/glossary/alternative-hypothesis.



Formula Review

H 0 and H a are contradictory.

If α ≤ p-value, then do not reject H 0 .

If α > p-value, then reject H 0 .

α is preconceived. Its value is set before the hypothesis test starts. The p-value is calculated from the data.

You are testing that the mean speed of your cable Internet connection is more than three Megabits per second. What is the random variable? Describe in words.

The random variable is the mean Internet speed in Megabits per second.

You are testing that the mean speed of your cable Internet connection is more than three Megabits per second. State the null and alternative hypotheses.

The American family has an average of two children. What is the random variable? Describe in words.

The random variable is the mean number of children an American family has.

The mean entry level salary of an employee at a company is $58,000. You believe it is higher for IT professionals in the company. State the null and alternative hypotheses.

A sociologist claims the probability that a person picked at random in Times Square in New York City is visiting the area is 0.83. You want to test to see if the proportion is actually less. What is the random variable? Describe in words.

The random variable is the proportion of people picked at random in Times Square visiting the city.

A sociologist claims the probability that a person picked at random in Times Square in New York City is visiting the area is 0.83. You want to test to see if the claim is correct. State the null and alternative hypotheses.

In a population of fish, approximately 42% are female. A test is conducted to see if, in fact, the proportion is less. State the null and alternative hypotheses.

Suppose that a recent article stated that the mean time spent in jail by a first-time convicted burglar is 2.5 years. A study was then done to see if the mean time has increased in the new century. A random sample of 26 first-time convicted burglars in a recent year was picked. The mean length of time in jail from the survey was 3 years with a standard deviation of 1.8 years. Suppose that it is somehow known that the population standard deviation is 1.5. If you were conducting a hypothesis test to determine if the mean length of jail time has increased, what would the null and alternative hypotheses be? The distribution of the population is normal.

A random survey of 75 death row inmates revealed that the mean length of time on death row is 17.4 years with a standard deviation of 6.3 years. If you were conducting a hypothesis test to determine if the population mean time on death row could likely be 15 years, what would the null and alternative hypotheses be?

  • H 0 : __________
  • H a : __________
  • H 0 : μ = 15
  • H a : μ ≠ 15

The National Institute of Mental Health published an article stating that in any one-year period, approximately 9.5 percent of American adults suffer from depression or a depressive illness. Suppose that in a survey of 100 people in a certain town, seven of them suffered from depression or a depressive illness. If you were conducting a hypothesis test to determine if the true proportion of people in that town suffering from depression or a depressive illness is lower than the percent in the general adult American population, what would the null and alternative hypotheses be?

Some of the following statements refer to the null hypothesis, some to the alternate hypothesis.

State the null hypothesis, H 0 , and the alternative hypothesis, H a , in terms of the appropriate parameter ( μ or p ).

  • The mean number of years Americans work before retiring is 34.
  • At most 60% of Americans vote in presidential elections.
  • The mean starting salary for San Jose State University graduates is at least $100,000 per year.
  • Twenty-nine percent of high school seniors get drunk each month.
  • Fewer than 5% of adults ride the bus to work in Los Angeles.
  • The mean number of cars a person owns in her lifetime is not more than ten.
  • About half of Americans prefer to live away from cities, given the choice.
  • Europeans have a mean paid vacation each year of six weeks.
  • The chance of developing breast cancer is under 11% for women.
  • Private universities’ mean tuition cost is more than $20,000 per year.
  • H 0 : μ = 34; H a : μ ≠ 34
  • H 0 : p ≤ 0.60; H a : p > 0.60
  • H 0 : μ ≥ 100,000; H a : μ < 100,000
  • H 0 : p = 0.29; H a : p ≠ 0.29
  • H 0 : p = 0.05; H a : p < 0.05
  • H 0 : μ ≤ 10; H a : μ > 10
  • H 0 : p = 0.50; H a : p ≠ 0.50
  • H 0 : μ = 6; H a : μ ≠ 6
  • H 0 : p ≥ 0.11; H a : p < 0.11
  • H 0 : μ ≤ 20,000; H a : μ > 20,000

Over the past few decades, public health officials have examined the link between weight concerns and teen girls’ smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin? The alternative hypothesis is:

  • p < 0.30
  • p > 0.30

A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 attended the midnight showing. An appropriate alternative hypothesis is:

  • p > 0.20
  • p < 0.20

Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. The null and alternative hypotheses are:

  • H 0 : \(\overline{x}\) = 4.5, H a : \(\overline{x}\) > 4.5
  • H 0 : μ ≥ 4.5, H a : μ < 4.5
  • H 0 : μ = 4.75, H a : μ > 4.75
  • H 0 : μ = 4.5, H a : μ > 4.5

Data from the National Institute of Mental Health. Available online at http://www.nimh.nih.gov/publicat/depression.cfm.

Null and Alternative Hypotheses Copyright © 2013 by OpenStaxCollege is licensed under a Creative Commons Attribution 4.0 International License , except where otherwise noted.

Statistics LibreTexts

8.4: The Alternative Hypothesis

  • Foster et al.
  • University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus via University of Missouri’s Affordable and Open Access Educational Resources Initiative

If the null hypothesis is rejected, then we will need some other explanation, which we call the alternative hypothesis, \(H_A\) or \(H_1\). The alternative hypothesis is simply the reverse of the null hypothesis, and there are three options, depending on where we expect the difference to lie. Thus, our alternative hypothesis is the mathematical way of stating our research question. If we expect our obtained sample mean to be above or below the null hypothesis value, which we call a directional hypothesis, then our alternative hypothesis takes the form:

\[\mathrm{H}_{\mathrm{A}}: \mu>7.47 \quad \text { or } \quad \mathrm{H}_{\mathrm{A}}: \mu<7.47 \nonumber \]

based on the research question itself. We should only use a directional hypothesis if we have good reason, based on prior observations or research, to suspect a particular direction. When we do not know the direction, such as when we are entering a new area of research, we use a non-directional alternative:

\[\mathrm{H}_{\mathrm{A}}: \mu \neq 7.47 \nonumber \]

We will set different criteria for rejecting the null hypothesis based on the directionality (greater than, less than, or not equal to) of the alternative. To understand why, we need to see where our criteria come from and how they relate to \(z\)-scores and distributions.
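As a preview of how the rejection criteria differ by direction, the sketch below computes critical \(z\) values for each form of the alternative. It is a hedged illustration: the function name `critical_z`, the labels "greater", "less" and "two-sided", and the significance level of 0.05 are assumptions chosen for the example, not values fixed by the text.

```python
from statistics import NormalDist


def critical_z(alpha: float, alternative: str) -> tuple:
    """Return the z cutoff(s) that bound the rejection region."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "greater":    # H_A: mu > null value
        return (z.inv_cdf(1 - alpha),)
    if alternative == "less":       # H_A: mu < null value
        return (z.inv_cdf(alpha),)
    if alternative == "two-sided":  # H_A: mu != null value
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(alt, [round(c, 3) for c in critical_z(0.05, alt)])
```

A directional alternative places all of α in one tail, so its cutoff sits closer to zero than the two cutoffs of the non-directional alternative, which split α between both tails.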

Writing hypotheses in words

As we alluded to in the null hypothesis section, we can write our hypotheses as word statements (in addition to the statements with symbols). These statements should be specific to the particular experiment or situation being referred to. That is, don't make them so generic that they would apply to any hypothesis test you might conduct.

Examples for how to write null and alternate hypotheses in words for directional and non-directional situations are given throughout the chapters. 

Contributors and Attributions

Foster et al.  (University of Missouri-St. Louis, Rice University, & University of Houston, Downtown Campus)

Null Hypothesis and Alternative Hypothesis

Hypothesis testing involves the careful construction of two statements: the null hypothesis and the alternative hypothesis. These hypotheses can look very similar but are actually different.

How do we know which hypothesis is the null and which one is the alternative? We will see that there are a few ways to tell the difference.

The Null Hypothesis

The null hypothesis reflects that there will be no observed effect in our experiment. In a mathematical formulation of the null hypothesis, there will typically be an equal sign. This hypothesis is denoted by H 0 .

The null hypothesis is what we attempt to find evidence against in our hypothesis test. We hope to obtain a small enough p-value that it is lower than our level of significance alpha and we are justified in rejecting the null hypothesis. If our p-value is greater than alpha, then we fail to reject the null hypothesis.

If the null hypothesis is not rejected, then we must be careful to say what this means. The thinking on this is similar to a legal verdict. Just because a person has been declared "not guilty", it does not mean that he is innocent. In the same way, just because we failed to reject a null hypothesis it does not mean that the statement is true.

For example, we may want to investigate the claim that despite what convention has told us, the mean adult body temperature is not the accepted value of 98.6 degrees Fahrenheit . The null hypothesis for an experiment to investigate this is “The mean adult body temperature for healthy individuals is 98.6 degrees Fahrenheit.” If we fail to reject the null hypothesis, then our working hypothesis remains that the average adult who is healthy has a temperature of 98.6 degrees. We do not prove that this is true.

If we are studying a new treatment, the null hypothesis is that our treatment will not change our subjects in any meaningful way. In other words, the treatment will not produce any effect in our subjects.

The Alternative Hypothesis

The alternative or experimental hypothesis reflects that there will be an observed effect for our experiment. In a mathematical formulation of the alternative hypothesis, there will typically be an inequality, or not equal to symbol. This hypothesis is denoted by either H a or by H 1 .

The alternative hypothesis is what we are attempting to demonstrate in an indirect way by the use of our hypothesis test. If the null hypothesis is rejected, then we accept the alternative hypothesis. If the null hypothesis is not rejected, then we do not accept the alternative hypothesis. Going back to the above example of mean human body temperature, the alternative hypothesis is “The average adult human body temperature is not 98.6 degrees Fahrenheit.”

If we are studying a new treatment, then the alternative hypothesis is that our treatment does, in fact, change our subjects in a meaningful and measurable way.

The following set of negations may help when you are forming your null and alternative hypotheses. Most technical papers rely on just the first formulation, even though you may see some of the others in a statistics textbook.

  • Null hypothesis: “ x is equal to y .” Alternative hypothesis “ x is not equal to y .”
  • Null hypothesis: “ x is at least y .” Alternative hypothesis “ x is less than y .”
  • Null hypothesis: “ x is at most y .” Alternative hypothesis “ x is greater than y .”

Frequently asked questions

What are null and alternative hypotheses?

Null and alternative hypotheses are used in statistical hypothesis testing . The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Frequently asked questions: Statistics

As the degrees of freedom increase, Student’s t distribution becomes less leptokurtic , meaning that the probability of extreme values decreases. The distribution becomes more and more similar to a standard normal distribution .

The three categories of kurtosis are:

  • Mesokurtosis : An excess kurtosis of 0. Normal distributions are mesokurtic.
  • Platykurtosis : A negative excess kurtosis. Platykurtic distributions are thin-tailed, meaning that they have few outliers .
  • Leptokurtosis : A positive excess kurtosis. Leptokurtic distributions are fat-tailed, meaning that they have many outliers.

Probability distributions belong to two broad categories: discrete probability distributions and continuous probability distributions . Within each category, there are many types of probability distributions.

Probability is the relative frequency over an infinite number of trials.

For example, the probability of a coin landing on heads is .5, meaning that if you flip the coin an infinite number of times, it will land on heads half the time.

Since doing something an infinite number of times is impossible, relative frequency is often used as an estimate of probability. If you flip a coin 1000 times and get 507 heads, the relative frequency, .507, is a good estimate of the probability.
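The idea that relative frequency approaches the probability as the number of trials grows can be illustrated with a small simulation. This is a sketch under stated assumptions: the function name `relative_frequency`, the fixed seed, and the trial counts are illustrative choices, and a pseudo-random generator stands in for a real coin.

```python
import random


def relative_frequency(n_flips: int, p_heads: float = 0.5, seed: int = 0) -> float:
    """Simulate n_flips coin flips and return the share that land on heads."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


# The estimate tends to settle near 0.5 as the number of flips increases.
for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```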

Categorical variables can be described by a frequency distribution. Quantitative variables can also be described by a frequency distribution, but first they need to be grouped into interval classes .

A histogram is an effective way to tell if a frequency distribution appears to have a normal distribution .

Plot a histogram and look at the shape of the bars. If the bars roughly follow a symmetrical bell or hill shape, like the example below, then the distribution is approximately normally distributed.

[Figure: frequency distribution with an approximately normal shape]

You can use the CHISQ.INV.RT() function to find a chi-square critical value in Excel.

For example, to calculate the chi-square critical value for a test with df = 22 and α = .05, click any blank cell and type:

=CHISQ.INV.RT(0.05,22)

You can use the qchisq() function to find a chi-square critical value in R.

For example, to calculate the chi-square critical value for a test with df = 22 and α = .05:

qchisq(p = .05, df = 22, lower.tail = FALSE)

You can use the chisq.test() function to perform a chi-square test of independence in R. Give the contingency table as a matrix for the “x” argument. For example:

m = matrix(data = c(89, 84, 86, 9, 8, 24), nrow = 3, ncol = 2)

chisq.test(x = m)

You can use the CHISQ.TEST() function to perform a chi-square test of independence in Excel. It takes two arguments, CHISQ.TEST(observed_range, expected_range), and returns the p value.

Chi-square goodness of fit tests are often used in genetics. One common application is to check if two genes are linked (i.e., if they do not assort independently). When genes are linked, the allele inherited for one gene affects the allele inherited for another gene.

Suppose that you want to know if the genes for pea texture (R = round, r = wrinkled) and color (Y = yellow, y = green) are linked. You perform a dihybrid cross between two heterozygous ( RY / ry ) pea plants. The hypotheses you’re testing with your experiment are:

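  • Null hypothesis ( H 0 ): The population of offspring have an equal probability of inheriting all possible genotypic combinations. This would suggest that the genes are unlinked.
  • Alternative hypothesis ( H a ): The population of offspring do not have an equal probability of inheriting all possible genotypic combinations. This would suggest that the genes are linked.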

You observe 100 peas:

  • 78 round and yellow peas
  • 6 round and green peas
  • 4 wrinkled and yellow peas
  • 12 wrinkled and green peas

Step 1: Calculate the expected frequencies

To calculate the expected values, you can make a Punnett square. If the two genes are unlinked, the probability of each genotypic combination is equal.

The expected phenotypic ratios are therefore 9 round and yellow: 3 round and green: 3 wrinkled and yellow: 1 wrinkled and green.

From this, you can calculate the expected phenotypic frequencies for 100 peas:

  • Round and yellow: 100 × 9/16 = 56.25
  • Round and green: 100 × 3/16 = 18.75
  • Wrinkled and yellow: 100 × 3/16 = 18.75
  • Wrinkled and green: 100 × 1/16 = 6.25

Step 2: Calculate chi-square

Each term is (observed − expected)² / expected for one phenotype:

Χ 2 = 8.41 + 8.67 + 11.60 + 5.29 = 33.97

Step 3: Find the critical chi-square value

Since there are four groups (round and yellow, round and green, wrinkled and yellow, wrinkled and green), there are three degrees of freedom .

For a test of significance at α = .05 and df = 3, the Χ 2 critical value is 7.82.

Step 4: Compare the chi-square value to the critical value

Χ 2 = 33.97

Critical value = 7.82

The Χ 2 value is greater than the critical value .

Step 5: Decide whether to reject the null hypothesis

The Χ 2 value is greater than the critical value, so we reject the null hypothesis that the population of offspring have an equal probability of inheriting all possible genotypic combinations. There is a significant difference between the observed and expected genotypic frequencies ( p < .05).

The data supports the alternative hypothesis that the offspring do not have an equal probability of inheriting all possible genotypic combinations, which suggests that the genes are linked.
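The five steps of this worked example can be retraced in code. The sketch below is illustrative: the dictionary keys and variable names are assumptions, the observed counts and the 9:3:3:1 ratio come from the example, and the critical value 7.82 is copied from the text rather than computed. Recomputing from the observed counts gives a statistic of roughly 34, far above the critical value.

```python
# Observed counts from the dihybrid cross (100 peas in total).
observed = {
    "round yellow": 78,
    "round green": 6,
    "wrinkled yellow": 4,
    "wrinkled green": 12,
}

# Expected 9:3:3:1 phenotypic ratio if the genes are unlinked.
ratio = {
    "round yellow": 9,
    "round green": 3,
    "wrinkled yellow": 3,
    "wrinkled green": 1,
}

# Step 1: expected frequencies for the same total number of peas.
total = sum(observed.values())
ratio_total = sum(ratio.values())
expected = {k: total * ratio[k] / ratio_total for k in observed}

# Step 2: chi-square = sum of (observed - expected)^2 / expected.
terms = {k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed}
chi_square = sum(terms.values())

# Steps 3 to 5: compare with the critical value for alpha = .05, df = 3.
CRITICAL_VALUE = 7.82  # taken from the text, not computed here
reject_null = chi_square > CRITICAL_VALUE

for phenotype, term in terms.items():
    print(f"{phenotype}: expected {expected[phenotype]}, term {term:.2f}")
print(f"chi-square = {chi_square:.2f}, reject H0: {reject_null}")
```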

You can use the chisq.test() function to perform a chi-square goodness of fit test in R. Give the observed values in the “x” argument, give the expected values in the “p” argument, and set “rescale.p” to true. For example:

chisq.test(x = c(22,30,23), p = c(25,25,25), rescale.p = TRUE)

You can use the CHISQ.TEST() function to perform a chi-square goodness of fit test in Excel. It takes two arguments, CHISQ.TEST(observed_range, expected_range), and returns the p value .

Both correlations and chi-square tests can test for relationships between two variables. However, a correlation is used when you have two quantitative variables and a chi-square test of independence is used when you have two categorical variables.

Both chi-square tests and t tests can test for differences between two groups. However, a t test is used when you have a dependent quantitative variable and an independent categorical variable (with two groups). A chi-square test of independence is used when you have two categorical variables.

The two main chi-square tests are the chi-square goodness of fit test and the chi-square test of independence .

A chi-square distribution is a continuous probability distribution . The shape of a chi-square distribution depends on its degrees of freedom , k . The mean of a chi-square distribution is equal to its degrees of freedom ( k ) and the variance is 2 k . The range is 0 to ∞.

As the degrees of freedom ( k ) increase, the chi-square distribution goes from a downward curve to a hump shape. As the degrees of freedom increase further, the hump goes from being strongly right-skewed to being approximately normal.

To find the quartiles of a probability distribution, you can use the distribution’s quantile function.

You can use the quantile() function to find quartiles in R. If your data is called “data”, then “quantile(data, probs=c(.25,.5,.75), type=1)” will return the three quartiles.

You can use the QUARTILE() function to find quartiles in Excel. If your data is in column A, then click any blank cell and type “=QUARTILE(A:A,1)” for the first quartile, “=QUARTILE(A:A,2)” for the second quartile, and “=QUARTILE(A:A,3)” for the third quartile.

You can use the PEARSON() function to calculate the Pearson correlation coefficient in Excel. If your variables are in columns A and B, then click any blank cell and type “=PEARSON(A:A,B:B)”.

Excel has no function that directly tests the significance of the correlation.

You can use the cor() function to calculate the Pearson correlation coefficient in R. To test the significance of the correlation, you can use the cor.test() function.

You should use the Pearson correlation coefficient when (1) the relationship is linear and (2) both variables are quantitative and (3) normally distributed and (4) have no outliers.

The Pearson correlation coefficient ( r ) is the most common way of measuring a linear correlation. It is a number between –1 and 1 that measures the strength and direction of the relationship between two variables.
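To make the definition concrete, here is a minimal sketch that computes r directly from paired values. The function name `pearson_r` and the sample data are assumptions for illustration; the computation is the standard ratio of the sum of cross-products of deviations to the square root of the product of the sums of squared deviations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: a perfect increasing and a perfect decreasing line.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The sign of the result gives the direction of the relationship and its absolute value gives the strength.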

This table summarizes the most important differences between normal distributions and Poisson distributions :

When the mean of a Poisson distribution is large (>10), it can be approximated by a normal distribution.

In the Poisson distribution formula, lambda (λ) is the mean number of events within a given interval of time or space. For example, λ = 0.748 floods per year.

The e in the Poisson distribution formula stands for the number 2.718 (rounded to three decimal places). This number is called Euler’s number. You can simply substitute e with 2.718 when you’re calculating a Poisson probability. Euler’s number is a very useful constant and is especially important in calculus.
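As a worked illustration of λ and e in use, the sketch below evaluates the standard Poisson probability formula, P(X = k) = e^(−λ) λ^k / k!, for the flood rate quoted above. The function name `poisson_pmf` is an assumption, and the code uses the full-precision value of e from Python's `math` module rather than the rounded 2.718.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean count is lam."""
    if k < 0 or lam <= 0:
        raise ValueError("k must be >= 0 and lam must be > 0")
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Using lambda = 0.748 floods per year from the text.
for k in range(4):
    print(k, round(poisson_pmf(k, 0.748), 4))
```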

The three types of skewness are:

  • Right skew (also called positive skew ) . A right-skewed distribution is longer on the right side of its peak than on its left.
  • Left skew (also called negative skew). A left-skewed distribution is longer on the left side of its peak than on its right.
  • Zero skew. It is symmetrical and its left and right sides are mirror images.

[Figure: Skewness of a distribution]

Skewness and kurtosis are both important measures of a distribution’s shape.

  • Skewness measures the asymmetry of a distribution.
  • Kurtosis measures the heaviness of a distribution’s tails relative to a normal distribution .

[Figure: Difference between skewness and kurtosis]

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (“ x affects y because …”).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses . In a well-designed study , the statistical hypotheses correspond logically to the research hypothesis.

The alternative hypothesis is often abbreviated as H a or H 1 . When the alternative hypothesis is written using mathematical symbols, it always includes an inequality symbol (usually ≠, but sometimes < or >).

The null hypothesis is often abbreviated as H 0 . When the null hypothesis is written using mathematical symbols, it always includes an equality symbol (usually =, but sometimes ≥ or ≤).

The t distribution was first described by statistician William Sealy Gosset under the pseudonym “Student.”

To calculate a confidence interval of a mean using the critical value of t , follow these four steps:

  • Choose the significance level based on your desired confidence level. The most common confidence level is 95%, which corresponds to α = .05 in the two-tailed t table .
  • Find the critical value of t in the two-tailed t table.
  • Multiply the critical value of t by s / √ n .
  • Add this value to the mean to calculate the upper limit of the confidence interval, and subtract this value from the mean to calculate the lower limit.
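The four steps above can be sketched as follows. This is a hedged example: the sample values are hypothetical, the function name `t_confidence_interval` is an assumption, and the critical value 2.262 (two-tailed, α = .05, 9 degrees of freedom) is looked up from a t table and hard-coded, because the Python standard library has no t distribution.

```python
import math
import statistics


def t_confidence_interval(sample, t_critical):
    """Return (lower, upper) limits: mean -/+ t_critical * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    margin = t_critical * s / math.sqrt(n)
    return mean - margin, mean + margin


# Hypothetical sample of n = 10 measurements, so df = n - 1 = 9.
sample = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 5.2, 4.8, 5.3]
T_CRITICAL_95_DF9 = 2.262  # from a two-tailed t table, alpha = .05, df = 9

lower, upper = t_confidence_interval(sample, T_CRITICAL_95_DF9)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

A different sample size or confidence level needs a different critical value from the table.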

To test a hypothesis using the critical value of t , follow these four steps:

  • Calculate the t value for your sample.
  • Find the critical value of t in the t table .
  • Determine if the (absolute) t value is greater than the critical value of t .
  • Reject the null hypothesis if the sample’s (absolute) t value is greater than the critical value of t . Otherwise, don’t reject the null hypothesis .
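A minimal sketch of these steps for a one-sample, two-tailed test is shown below. The function names, the sample data, and the null value are hypothetical; the critical value 2.776 (two-tailed, α = .05, 4 degrees of freedom) is taken from a t table and hard-coded as an assumption.

```python
import math
import statistics


def one_sample_t(sample, mu_0):
    """t value: (sample mean - null value) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu_0) / standard_error


def reject_null(t_value, t_critical):
    """Two-tailed rule: compare the absolute t value with the critical value."""
    return abs(t_value) > t_critical


# Hypothetical sample of n = 5, so df = 4.
sample = [5.6, 5.9, 6.1, 5.8, 6.0]
T_CRITICAL_DF4 = 2.776  # from a two-tailed t table, alpha = .05, df = 4

t_value = one_sample_t(sample, mu_0=5.0)
print(f"t = {t_value:.2f}, reject H0: {reject_null(t_value, T_CRITICAL_DF4)}")
```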

You can use the T.INV() function to find the critical value of t for one-tailed tests in Excel, and you can use the T.INV.2T() function for two-tailed tests.

You can use the qt() function to find the critical value of t in R. The function gives the critical value of t for the one-tailed test. If you want the critical value of t for a two-tailed test, divide the significance level by two.

You can use the RSQ() function to calculate R² in Excel. If your dependent variable is in column A and your independent variable is in column B, then click any blank cell and type “=RSQ(A:A,B:B)”.

You can use the summary() function to view the R²  of a linear model in R. You will see the “R-squared” near the bottom of the output.

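There are two formulas you can use to calculate the coefficient of determination (R²) of a simple linear regression .

Formula 1, using the Pearson correlation coefficient r :

\[R^2 = r^2\]

Formula 2, using the residual sum of squares (RSS) and the total sum of squares (TSS):

\[R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}\]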

The coefficient of determination (R²) is a number between 0 and 1 that measures how well a statistical model predicts an outcome. You can interpret the R² as the proportion of variation in the dependent variable that is predicted by the statistical model.
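For a simple linear regression, R² can be obtained either by squaring the Pearson correlation coefficient or as 1 − RSS/TSS, where RSS is the residual sum of squares and TSS is the total sum of squares. The sketch below computes both and shows that they agree for a least-squares line. The function name and the data are hypothetical.

```python
def r_squared_two_ways(x, y):
    """Return (r**2, 1 - RSS/TSS) for a least-squares line fitted to x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares (TSS)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Way 1: square of the Pearson correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)

    # Way 2: fit the line, then compare residual and total variation.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return r_squared, 1 - rss / syy


# Hypothetical data lying close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared_two_ways(x, y))
```

The equality of the two results holds for simple linear regression with an intercept; it does not carry over to arbitrary models.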

There are three main types of missing data .

Missing completely at random (MCAR) data are randomly distributed across the variable and unrelated to other variables .

Missing at random (MAR) data are not randomly distributed but they are accounted for by other observed variables.

Missing not at random (MNAR) data systematically differ from the observed values.

To tidy up your missing data , your options usually include accepting, removing, or recreating the missing data.

  • Acceptance: You leave your data as is
  • Listwise or pairwise deletion: You delete all cases (participants) with missing data from analyses
  • Imputation: You use other data to fill in the missing data

Missing data are important because, depending on the type, they can sometimes bias your results. This means your results may not be generalizable outside of your study because your data come from an unrepresentative sample .

Missing data , or missing values, occur when you don’t have data stored for certain variables or participants.

In any dataset, there’s usually some missing data. In quantitative research , missing values appear as blank cells in your spreadsheet.

There are two steps to calculating the geometric mean :

  • Multiply all values together to get their product.
  • Find the n th root of the product ( n is the number of values).

Before calculating the geometric mean, note that:

  • The geometric mean can only be found for positive values.
  • If any value in the data set is zero, the geometric mean is zero.
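The two steps and the notes above can be sketched as a short function. The name `geometric_mean` and the example growth factors are assumptions for illustration; negative inputs are rejected, and a zero anywhere yields a result of zero.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n non-negative values."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v < 0 for v in values):
        raise ValueError("the geometric mean is not defined for negative values")
    product = math.prod(values)           # step 1: multiply all values
    return product ** (1 / len(values))   # step 2: take the n-th root


# Hypothetical growth factors for three periods.
print(geometric_mean([1.10, 1.20, 0.95]))
print(geometric_mean([2, 8]))
```

For long lists the raw product can overflow or underflow; Python's standard library also provides `statistics.geometric_mean` for everyday use.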

The arithmetic mean is the most commonly used type of mean and is often referred to simply as “the mean.” While the arithmetic mean is based on adding and dividing values, the geometric mean multiplies and finds the root of values.

Even though the geometric mean is a less common measure of central tendency , it’s more accurate than the arithmetic mean for percentage change and positively skewed data. The geometric mean is often reported for financial indices and population growth rates.

The geometric mean is an average that multiplies all values and finds a root of the number. For a dataset with n numbers, you find the n th root of their product.

Outliers are extreme values that differ from most values in the dataset. You find outliers at the extreme ends of your dataset.

It’s best to remove outliers only when you have a sound reason for doing so.

Some outliers represent natural variations in the population , and they should be left as is in your dataset. These are called true outliers.

Other outliers are problematic and should be removed because they represent measurement errors , data entry or processing errors, or poor sampling.

You can choose from four main ways to detect outliers :

  • Sorting your values from low to high and checking minimum and maximum values
  • Visualizing your data with a box plot and looking for outliers
  • Using the interquartile range to create fences for your data
  • Using statistical procedures to identify extreme values
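The interquartile range approach can be sketched as follows. This is an illustration under assumptions: the multiplier 1.5 is a common convention rather than something stated in the text, the data are hypothetical, and `statistics.quantiles` uses its default "exclusive" method, so other quartile conventions may give slightly different fences.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical measurements with one suspicious value.
data = [10, 12, 12, 13, 12, 11, 14, 13, 15, 10, 102]
print(iqr_outliers(data))
```

Flagging a value this way only marks it for inspection; whether to remove it still depends on having a sound reason.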

Outliers can have a big impact on your statistical analyses and skew the results of any hypothesis test if they are inaccurate.

These extreme values can impact your statistical power as well, making it hard to detect a true effect if there is one.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data is from a random or representative sample
  • You expect a linear relationship between the two variables

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

There are various ways to improve power:

  • Increase the potential effect size by manipulating your independent variable more strongly,
  • Increase sample size,
  • Increase the significance level (alpha),
  • Reduce measurement error by increasing the precision and accuracy of your measurement devices and procedures,
  • Use a one-tailed test instead of a two-tailed test for t tests and z tests.

A power analysis is a calculation that helps you determine a minimum sample size for your study. It’s made up of four main components. If you know or have estimates for any three of these, you can calculate the fourth component.

  • Statistical power : the likelihood that a test will detect an effect of a certain size if there is one, usually set at 80% or higher.
  • Sample size : the minimum number of observations needed to observe an effect of a certain size with a given power level.
  • Significance level (alpha) : the maximum risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%.
  • Expected effect size : a standardized way of expressing the magnitude of the expected result of your study, usually based on similar studies or a pilot study.
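To show how three of the components determine the fourth, the sketch below solves for sample size in one simple case: a two-tailed, one-sample z test with a standardized effect size, using the normal-approximation formula n = ((z for 1 − α/2 + z for power) / effect size)². The function name and the example effect sizes are assumptions, and other test designs need different formulas.

```python
import math
from statistics import NormalDist


def sample_size_one_sample_z(effect_size, alpha=0.05, power=0.80):
    """Minimum n for a two-tailed one-sample z test (normal approximation)."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cutoff for the significance level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects require larger samples.
for d in (0.2, 0.5, 0.8):
    print(d, sample_size_one_sample_z(d))
```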

Statistical analysis is the main method for analyzing quantitative research data . It uses probabilities and models to test predictions about a population from sample data.

The risk of making a Type II error is inversely related to the statistical power of a test. Power is the extent to which a test can correctly detect a real effect when there is one.

To (indirectly) reduce the risk of a Type II error, you can increase the sample size or the significance level to increase statistical power.

The risk of making a Type I error is the significance level (or alpha) that you choose. That’s a threshold you set at the beginning of your study and later compare against the probability of obtaining your results (the p value).

The significance level is usually set at 0.05 or 5%. This means that you reject the null hypothesis only when results like yours would have a 5% chance of occurring, or less, if the null hypothesis were actually true.

To reduce the Type I error probability, you can set a lower significance level.

In statistics, a Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s actually false.

In statistics, power refers to the likelihood of a hypothesis test detecting a true effect if there is one. A statistically powerful test is less likely to produce a false negative (a Type II error).

If you don’t ensure enough power in your study, you may not be able to detect a statistically significant result even when it has practical significance. Your study might not have the ability to answer your research question.

While statistical significance shows that an effect exists in a study, practical significance shows that the effect is large enough to be meaningful in the real world.

Statistical significance is denoted by p -values whereas practical significance is represented by effect sizes .

There are dozens of measures of effect sizes . The most common effect sizes are Cohen’s d and Pearson’s r . Cohen’s d measures the size of the difference between two groups while Pearson’s r measures the strength of the relationship between two variables .
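As an illustration, Cohen’s d can be sketched as the difference between two group means divided by their pooled standard deviation. This assumes two independent groups with similar variances; the function name and the data are invented for the example.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

treatment = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]     # hypothetical scores
d = cohens_d(treatment, control)  # about 1.26 pooled standard deviations
```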

Effect size tells you how meaningful the relationship between variables or the difference between groups is.

A large effect size means that a research finding has practical significance, while a small effect size indicates limited practical applications.

Using descriptive and inferential statistics , you can make two types of estimates about the population : point estimates and interval estimates.

  • A point estimate is a single value estimate of a parameter . For instance, a sample mean is a point estimate of a population mean.
  • An interval estimate gives you a range of values where the parameter is expected to lie. A confidence interval is the most common type of interval estimate.

Both types of estimates are important for gathering a clear idea of where a parameter is likely to lie.

Standard error and standard deviation are both measures of variability . The standard deviation reflects variability within a sample, while the standard error estimates the variability across samples of a population.

The standard error of the mean , or simply standard error , indicates how different the population mean is likely to be from a sample mean. It tells you how much the sample mean would vary if you were to repeat a study using new samples from within a single population.
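A minimal sketch of the usual estimate, the sample standard deviation divided by the square root of the sample size, is shown below. The data are made up.

```python
import math
from statistics import stdev

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
se = standard_error(sample)        # about 0.756
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.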

To figure out whether a given number is a parameter or a statistic , ask yourself the following:

  • Does the number describe a whole, complete population where every member can be reached for data collection ?
  • Is it possible to collect data for this number from every member of the population in a reasonable time frame?

If the answer is yes to both questions, the number is likely to be a parameter. For small populations, data can be collected from the whole population and summarized in parameters.

If the answer is no to either of the questions, then the number is more likely to be a statistic.

The arithmetic mean is the most commonly used mean. It’s often simply called the mean or the average. But there are some other types of means you can calculate depending on your research purposes:

  • Weighted mean: some values contribute more to the mean than others.
  • Geometric mean : values are multiplied rather than summed up.
  • Harmonic mean: reciprocals of values are used instead of the values themselves.
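The sketch below computes each of these means with Python's standard `statistics` module. `weighted_mean` is a small helper written for this example, and the numbers are illustrative.

```python
from statistics import mean, geometric_mean, harmonic_mean

values = [2, 4, 8]

arithmetic = mean(values)           # (2 + 4 + 8) / 3
geometric = geometric_mean(values)  # cube root of 2 * 4 * 8
harmonic = harmonic_mean(values)    # 3 / (1/2 + 1/4 + 1/8)

def weighted_mean(values, weights):
    """Each value contributes in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# A final exam (90) that counts three times as much as a quiz (80)
weighted = weighted_mean([80, 90], [1, 3])
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean is the largest.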

You can find the mean , or average, of a data set in two simple steps:

  • Find the sum of the values by adding them all up.
  • Divide the sum by the number of values in the data set.

This method is the same whether you are dealing with sample or population data or positive or negative numbers.

The median is the most informative measure of central tendency for skewed distributions or distributions with outliers. For example, the median is often used as a measure of central tendency for income distributions, which are generally highly skewed.

Because the median depends only on the middle one or two values of the ordered data, it’s unaffected by extreme outliers or non-symmetric distributions of scores. In contrast, the mean and mode can vary in skewed distributions.

To find the median , first order your data. Then calculate the middle position based on n , the number of values in your data set.

Middle position = \(\dfrac{n+1}{2}\)
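A short sketch of this procedure follows. When n is even the middle position is not a whole number, so the example averages the two neighbouring values, which is the usual convention. The function name is hypothetical.

```python
def median_by_position(values):
    """Find the median from the middle position (n + 1) / 2 of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered list
    if position.is_integer():
        # Odd n: the middle position points at a single value
        return ordered[int(position) - 1]
    # Even n: the position falls between two values, so average them
    lower = ordered[int(position) - 1]  # int() truncates, e.g. 2.5 -> 2
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_case = median_by_position([5, 1, 3])      # position 2 -> value 3
even_case = median_by_position([1, 2, 3, 4])  # position 2.5 -> (2 + 3) / 2
```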

A data set can often have no mode, one mode or more than one mode – it all depends on how many different values repeat most frequently.

Your data can be:

  • without any mode
  • unimodal, with one mode,
  • bimodal, with two modes,
  • trimodal, with three modes, or
  • multimodal, with four or more modes.

To find the mode :

  • If your data is numerical or quantitative, order the values from low to high.
  • If it is categorical, sort the values by group, in any order.

Then you simply need to identify the most frequently occurring value.
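One way to sketch this is to count how often each value occurs and keep every value that reaches the highest count. The example assumes the convention that data in which no value repeats has no mode; the function name is made up.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or an empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumed convention: no repeated value means no mode
    return sorted(v for v, c in counts.items() if c == highest)

bimodal = modes([1, 2, 2, 3, 3])             # two values tie for most frequent
no_mode = modes([1, 2, 3])                   # every value occurs once
transport = modes(["bus", "car", "bus"])     # works for categorical data too
```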

The interquartile range is the best measure of variability for skewed distributions or data sets with outliers. Because it’s based on values that come from the middle half of the distribution, it’s unlikely to be influenced by outliers .

The two most common methods for calculating interquartile range are the exclusive and inclusive methods.

The exclusive method excludes the median when identifying Q1 and Q3, while the inclusive method includes the median as a value in the data set in identifying the quartiles.

For each of these methods, you’ll need different procedures for finding the median, Q1 and Q3 depending on whether your sample size is even- or odd-numbered. The exclusive method works best for even-numbered sample sizes, while the inclusive method is often used with odd-numbered sample sizes.
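The sketch below is one possible reading of the two methods: Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data, and the methods differ only in whether the overall median is kept in both halves when the sample size is odd. Statistical packages often use interpolation rules instead, so their quartiles may differ slightly. The function name is hypothetical.

```python
from statistics import median

def iqr(values, method="exclusive"):
    """Interquartile range as the difference between the medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 0 or method == "exclusive":
        # Even n splits cleanly; for odd n the exclusive method drops the median
        lower, upper = ordered[:half], ordered[n - half:]
    else:
        # Inclusive method with odd n: the median belongs to both halves
        lower, upper = ordered[:half + 1], ordered[half:]
    return median(upper) - median(lower)

odd_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
iqr_exclusive = iqr(odd_data, "exclusive")  # Q1 = 2.5, Q3 = 7.5
iqr_inclusive = iqr(odd_data, "inclusive")  # Q1 = 3, Q3 = 7
```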

While the range gives you the spread of the whole data set, the interquartile range gives you the spread of the middle half of a data set.

Homoscedasticity, or homogeneity of variances, is an assumption of equal or similar variances in different groups being compared.

This is an important assumption of parametric statistical tests because they are sensitive to any dissimilarities. Uneven variances in samples result in biased and skewed test results.

Statistical tests such as variance tests or the analysis of variance (ANOVA) use sample variance to assess group differences of populations. They use the variances of the samples to assess whether the populations they come from significantly differ from each other.

Variance is the average of the squared deviations from the mean, while standard deviation is the square root of this number. Both measures reflect variability in a distribution, but their units differ:

  • Standard deviation is expressed in the same units as the original values (e.g., minutes or meters).
  • Variance is expressed in squared units (e.g., meters squared).

Although the units of variance are harder to intuitively understand, variance is important in statistical tests .

The empirical rule, or the 68-95-99.7 rule, tells you where most of the values lie in a normal distribution :

  • Around 68% of values are within 1 standard deviation of the mean.
  • Around 95% of values are within 2 standard deviations of the mean.
  • Around 99.7% of values are within 3 standard deviations of the mean.

The empirical rule is a quick way to get an overview of your data and check for any outliers or extreme values that don’t follow this pattern.
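These percentages can be checked numerically. The sketch below uses the cumulative distribution function of the standard normal distribution from Python's standard library; `share_within` is a helper name chosen for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = share_within(1)  # about 0.683
within_2 = share_within(2)  # about 0.954
within_3 = share_within(3)  # about 0.997
```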

In a normal distribution , data are symmetrically distributed with no skew. Most values cluster around a central region, with values tapering off as they go further away from the center.

The measures of central tendency (mean, mode, and median) are exactly the same in a normal distribution.

The standard deviation is the average amount of variability in your data set. It tells you, on average, how far each score lies from the mean .

In normal distributions, a high standard deviation means that values are generally far from the mean, while a low standard deviation indicates that values are clustered close to the mean.

No. Because the range formula subtracts the lowest number from the highest number, the range is always zero or a positive number.

In statistics, the range is the spread of your data from the lowest to the highest value in the distribution. It is the simplest measure of variability .

While central tendency tells you where most of your data points lie, variability summarizes how far apart your points are from each other.

Data sets can have the same central tendency but different levels of variability or vice versa . Together, they give you a complete picture of your data.

Variability is most commonly measured with the following descriptive statistics :

  • Range : the difference between the highest and lowest values
  • Interquartile range : the range of the middle half of a distribution
  • Standard deviation : average distance from the mean
  • Variance : average of squared distances from the mean

Variability tells you how far apart points lie from each other and from the center of a distribution or a data set.

Variability is also referred to as spread, scatter or dispersion.

While interval and ratio data can both be categorized, ranked, and have equal spacing between adjacent values, only ratio scales have a true zero.

For example, temperature in Celsius or Fahrenheit is at an interval scale because zero is not the lowest possible temperature. In the Kelvin scale, a ratio scale, zero represents a total lack of thermal energy.

A critical value is the value of the test statistic which defines the upper and lower bounds of a confidence interval, or which defines the threshold of statistical significance in a statistical test. It describes how far from the mean of the distribution you have to go to cover a certain amount of the total variation in the data (e.g. 90%, 95%, 99%).

If you are constructing a 95% confidence interval and are using a threshold of statistical significance of p = 0.05, then your critical value will be identical in both cases.

The t -distribution gives more probability to observations in the tails of the distribution than the standard normal distribution (a.k.a. the z -distribution).

In this way, the t -distribution is more conservative than the standard normal distribution: to reach the same level of confidence or statistical significance , you will need to include a wider range of the data.

A t -score (a.k.a. a t -value) is equivalent to the number of standard deviations away from the mean of the t -distribution .

The t -score is the test statistic used in t -tests and regression tests. It can also be used to describe how far from the mean an observation is when the data follow a t -distribution.

The t-distribution is a way of describing a set of observations where most observations fall close to the mean, and the rest of the observations make up the tails on either side. It resembles the normal distribution but has heavier tails, and it is used for smaller sample sizes, where the variance in the data is unknown.

The t-distribution forms a bell curve when plotted on a graph. Its exact shape depends on the degrees of freedom, which are determined by the sample size.

In statistics, ordinal and nominal variables are both considered categorical variables .

Even though ordinal data can sometimes be numerical, not all mathematical operations can be performed on them.

Ordinal data has two characteristics:

  • The data can be classified into different categories within a variable.
  • The categories have a natural ranked order.

However, unlike with interval data, the distances between the categories are uneven or unknown.

Nominal and ordinal are two of the four levels of measurement . Nominal level data can only be classified, while ordinal level data can be classified and ordered.

Nominal data is data that can be labelled or classified into mutually exclusive categories within a variable. These categories cannot be ordered in a meaningful way.

For example, for the nominal variable of preferred mode of transportation, you may have the categories of car, bus, train, tram or bicycle.

If your confidence interval for a difference between groups includes zero, that means that if you run your experiment again you have a good chance of finding no difference between groups.

If your confidence interval for a correlation or regression includes zero, that means that if you run your experiment again there is a good chance of finding no correlation in your data.

In both of these cases, you will also find a high p -value when you run your statistical test, meaning that your results could have occurred under the null hypothesis of no relationship between variables or no difference between groups.

If you want to calculate a confidence interval around the mean of data that is not normally distributed , you have two choices:

  • Find a distribution that matches the shape of your data and use that distribution to calculate the confidence interval.
  • Perform a transformation on your data to make it fit a normal distribution, and then find the confidence interval for the transformed data.

The standard normal distribution , also called the z -distribution, is a special normal distribution where the mean is 0 and the standard deviation is 1.

Any normal distribution can be converted into the standard normal distribution by turning the individual values into z -scores. In a z -distribution, z -scores tell you how many standard deviations away from the mean each value lies.
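As a small illustration, each value is converted by subtracting the mean and dividing by the standard deviation. The sketch treats the data as a whole population, so it uses the population standard deviation; that choice, the function name, and the data are assumptions for this example.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert each value to its distance from the mean in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed appropriate here)
    return [(v - mu) / sigma for v in values]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population standard deviation 2
z = z_scores(data)               # e.g. the value 9 lies 2 standard deviations above the mean
```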

The z -score and t -score (aka z -value and t -value) show how many standard deviations away from the mean of the distribution you are, assuming your data follow a z -distribution or a t -distribution .

These scores are used in statistical tests to show how far from the mean of the predicted distribution your statistical estimate is. If your test produces a z -score of 2.5, this means that your estimate is 2.5 standard deviations from the predicted mean.

The predicted mean and distribution of your estimate are generated by the null hypothesis of the statistical test you are using. The more standard deviations away from the predicted mean your estimate is, the less likely it is that the estimate could have occurred under the null hypothesis .

To calculate the confidence interval , you need to know:

  • The point estimate you are constructing the confidence interval for
  • The critical values for the test statistic
  • The standard deviation of the sample
  • The sample size

Then you can plug these components into the confidence interval formula that corresponds to your data. The formula depends on the type of estimate (e.g. a mean or a proportion) and on the distribution of your data.
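For a mean, these components can be combined as point estimate ± critical value × (standard deviation / √n). The sketch below assumes a critical value from the z-distribution, which is a simplification; for small samples a critical value from the t-distribution would usually be more appropriate. The function name and data are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for a mean using a z critical value."""
    n = len(sample)
    point_estimate = mean(sample)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = critical * stdev(sample) / math.sqrt(n)
    return point_estimate - margin, point_estimate + margin

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
low, high = mean_confidence_interval(sample)              # roughly (3.52, 6.48)
low_99, high_99 = mean_confidence_interval(sample, 0.99)  # wider interval
```

Raising the confidence level widens the interval, because a larger critical value is needed to cover more of the distribution.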

The confidence level is the percentage of times you expect the confidence interval to contain the true value if you run your experiment again or resample the population in the same way.

The confidence interval consists of the upper and lower bounds of the estimate you expect to find at a given level of confidence.

For example, if you are estimating a 95% confidence interval around the mean proportion of female babies born every year based on a random sample of babies, you might find an upper bound of 0.56 and a lower bound of 0.48. These are the upper and lower bounds of the confidence interval. The confidence level is 95%.

The mean is the most frequently used measure of central tendency because it uses all values in the data set to give you an average.

For data from skewed distributions, the median is better than the mean because it isn’t influenced by extremely large values.

The mode is the only measure you can use for nominal or categorical data that can’t be ordered.

The measures of central tendency you can use depend on the level of measurement of your data.

  • For a nominal level, you can only use the mode to find the most frequent value.
  • For an ordinal level or ranked data, you can also use the median to find the value in the middle of your data set.
  • For interval or ratio levels, in addition to the mode and median, you can use the mean to find the average value.

Measures of central tendency help you find the middle, or the average, of a data set.

The 3 most common measures of central tendency are the mean, median and mode.

  • The mode is the most frequent value.
  • The median is the middle number in an ordered data set.
  • The mean is the sum of all values divided by the total number of values.

Some variables have fixed levels. For example, gender and ethnicity are always nominal level data because they cannot be ranked.

However, for other variables, you can choose the level of measurement . For example, income is a variable that can be recorded on an ordinal or a ratio scale:

  • At an ordinal level , you could create 5 income groupings and code the incomes that fall within them from 1–5.
  • At a ratio level , you would record exact numbers for income.

If you have a choice, the ratio level is always preferable because you can analyze data in more ways. The higher the level of measurement, the more precise your data is.

The level at which you measure a variable determines how you can analyze your data.

Depending on the level of measurement , you can perform different descriptive statistics to get an overall summary of your data and inferential statistics to see if your results support or refute your hypothesis .

Levels of measurement tell you how precisely variables are recorded. There are 4 levels of measurement, which can be ranked from low to high:

  • Nominal : the data can only be categorized.
  • Ordinal : the data can be categorized and ranked.
  • Interval : the data can be categorized and ranked, and evenly spaced.
  • Ratio : the data can be categorized, ranked, evenly spaced and has a natural zero.

No. The p -value only tells you how likely the data you have observed is to have occurred under the null hypothesis .

If the p -value is below your threshold of significance (typically p < 0.05), then you can reject the null hypothesis, but this does not necessarily mean that your alternative hypothesis is true.

The alpha value, or the threshold for statistical significance , is arbitrary – which value you use depends on your field of study.

In most cases, researchers use an alpha of 0.05, which means that there is a less than 5% chance that the data being tested could have occurred under the null hypothesis.

P -values are usually automatically calculated by the program you use to perform your statistical test. They can also be estimated using p -value tables for the relevant test statistic .

P -values are calculated from the null distribution of the test statistic. They tell you how often a test statistic is expected to occur under the null hypothesis of the statistical test, based on where it falls in the null distribution.

If the test statistic is far from the mean of the null distribution, then the p -value will be small, showing that the test statistic is not likely to have occurred under the null hypothesis.
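As a sketch of this relationship, the example below assumes the null distribution of the test statistic is the standard normal (z) distribution and computes a two-tailed p-value. Other tests use other null distributions, so this only illustrates the idea. The function name is hypothetical.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a z statistic at least this far from 0 under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_near = two_tailed_p(1.0)   # close to the null mean: large p-value (about 0.32)
p_edge = two_tailed_p(1.96)  # about 0.05
p_far = two_tailed_p(2.5)    # far from the null mean: small p-value (about 0.012)
```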

A p -value , or probability value, is a number describing how likely it is that your data would have occurred under the null hypothesis of your statistical test .

The test statistic you use will be determined by the statistical test.

You can choose the right statistical test by looking at what type of data you have collected and what type of relationship you want to test.

The test statistic will change based on the number of observations in your data, how variable your observations are, and how strong the underlying patterns in the data are.

For example, if one data set has higher variability while another has lower variability, the first data set will produce a test statistic closer to the null hypothesis , even if the true correlation between two variables is the same in either data set.

The formula for the test statistic depends on the statistical test being used.

Generally, the test statistic is calculated as the pattern in your data (i.e. the correlation between variables or difference between groups) divided by the variance in the data (i.e. the standard deviation ).

  • Univariate statistics summarize only one variable at a time.
  • Bivariate statistics compare two variables.
  • Multivariate statistics compare more than two variables.

The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a dataset.

  • Distribution refers to the frequencies of different responses.
  • Measures of central tendency give you the average for each response.
  • Measures of variability show you the spread or dispersion of your dataset.

Descriptive statistics summarize the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalizable to the broader population.

In statistics, model selection is a process researchers use to compare the relative value of different statistical models and determine which one is the best fit for the observed data.

The Akaike information criterion is one of the most common methods of model selection. AIC weights the ability of the model to predict the observed data against the number of parameters the model requires to reach that level of precision.

AIC model selection can help researchers find a model that explains the observed variation in their data while avoiding overfitting.

In statistics, a model is the collection of one or more independent variables and their predicted interactions that researchers use to try to explain variation in their dependent variable.

You can test a model using a statistical test . To compare how well different models fit your data, you can use Akaike’s information criterion for model selection.

The Akaike information criterion is calculated from the maximum log-likelihood of the model and the number of parameters (K) used to reach that likelihood. The AIC function is 2K – 2(log-likelihood) .
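The formula translates directly into code. In the sketch below, the log-likelihoods and parameter counts are invented numbers, not output from real fitted models.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: 2K - 2(log-likelihood)."""
    return 2 * k - 2 * log_likelihood

# Hypothetical fitted models: (maximum log-likelihood, number of parameters K)
aic_simple = aic(log_likelihood=-120.5, k=3)   # 2*3 + 241 = 247
aic_complex = aic(log_likelihood=-118.0, k=5)  # 2*5 + 236 = 246

delta_aic = aic_simple - aic_complex  # difference between the two AIC values
```

Here the more complex model fits slightly better, but the penalty for its two extra parameters cancels most of that gain.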

Lower AIC values indicate a better-fit model, and a model with a delta-AIC (the difference between the two AIC values being compared) of more than 2 is considered significantly better than the model it is being compared to.

The Akaike information criterion is a mathematical test used to evaluate how well a model fits the data it is meant to describe. It penalizes models which use more independent variables (parameters) as a way to avoid over-fitting.

AIC is most often used to compare the relative goodness-of-fit among different models under consideration and to then choose the model that best fits the data.

A factorial ANOVA is any ANOVA that uses more than one categorical independent variable . A two-way ANOVA is a type of factorial ANOVA.

Some examples of factorial ANOVAs include:

  • Testing the combined effects of vaccination (vaccinated or not vaccinated) and health status (healthy or pre-existing condition) on the rate of flu infection in a population.
  • Testing the effects of marital status (married, single, divorced, widowed), job status (employed, self-employed, unemployed, retired), and family history (no family history, some family history) on the incidence of depression in a population.
  • Testing the effects of feed type (type A, B, or C) and barn crowding (not crowded, somewhat crowded, very crowded) on the final weight of chickens in a commercial farming operation.

In ANOVA, the null hypothesis is that there is no difference among group means. If any group differs significantly from the overall group mean, then the ANOVA will report a statistically significant result.

Significant differences among group means are calculated using the F statistic, which is the ratio of the mean sum of squares (the variance explained by the independent variable) to the mean square error (the variance left over).

If the F statistic is higher than the critical value (the value of F that corresponds with your alpha value, usually 0.05), then the difference among groups is deemed statistically significant.
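To show where the F statistic comes from, the sketch below computes it for a one-way ANOVA from the between-group and within-group sums of squares. The function name and data are made up, and the critical value itself would still have to come from an F table or statistical software.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: mean square between groups / mean square error."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical measurements
f_value = f_statistic(groups)               # 13 / 1 = 13
```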

The only difference between one-way and two-way ANOVA is the number of independent variables . A one-way ANOVA has one independent variable, while a two-way ANOVA has two.

  • One-way ANOVA : Testing the relationship between shoe brand (Nike, Adidas, Saucony, Hoka) and race finish times in a marathon.
  • Two-way ANOVA : Testing the relationship between shoe brand (Nike, Adidas, Saucony, Hoka), runner age group (junior, senior, master’s), and race finishing times in a marathon.

All ANOVAs are designed to test for differences among three or more groups. If you are only testing for a difference between two groups, use a t-test instead.

Multiple linear regression is a regression model that estimates the relationship between a quantitative dependent variable and two or more independent variables using a straight line.

Linear regression most often uses mean-square error (MSE) to calculate the error of the model. MSE is calculated by:

  • measuring the distance of the observed y-values from the predicted y-values at each value of x;
  • squaring each of these distances;
  • calculating the mean of the squared distances.

Linear regression fits a line to the data by finding the regression coefficient that results in the smallest MSE.
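The sketch below illustrates this for simple linear regression with one independent variable. It uses the standard closed-form least-squares solution for the slope and intercept, then computes the MSE following the steps listed above. Function names and data are invented for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for a single independent variable."""
    mean_x, mean_y = mean(x), mean(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mse(x, y, slope, intercept):
    """Mean of the squared distances between observed and predicted y-values."""
    squared = [(b - (slope * a + intercept)) ** 2 for a, b in zip(x, y)]
    return mean(squared)

x = [1, 2, 3, 4]  # hypothetical predictor values
y = [2, 4, 5, 8]  # hypothetical responses
slope, intercept = fit_line(x, y)       # slope 1.9, intercept 0
best_mse = mse(x, y, slope, intercept)  # 0.175
other_mse = mse(x, y, 2.0, 0.0)         # a different line gives a larger error
```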

Simple linear regression is a regression model that estimates the relationship between one independent variable and one dependent variable using a straight line. Both variables should be quantitative.

For example, the relationship between temperature and the expansion of mercury in a thermometer can be modeled using a straight line: as temperature increases, the mercury expands. This linear relationship is so certain that we can use mercury thermometers to measure temperature.

A regression model is a statistical model that estimates the relationship between one dependent variable and one or more independent variables using a line (or a plane in the case of two or more independent variables).

A regression model can be used when the dependent variable is quantitative, except in the case of logistic regression, where the dependent variable is binary.

A t-test should not be used to measure differences among more than two groups, because the error structure for a t-test will underestimate the actual error when many groups are being compared.

If you want to compare the means of several groups at once, it’s best to use another statistical test such as ANOVA or a post-hoc test.

A one-sample t-test is used to compare a single population to a standard value (for example, to determine whether the average lifespan of a specific town is different from the country average).

A paired t-test is used to compare a single population before and after some experimental intervention or at two different points in time (for example, measuring student performance on a test before and after being taught the material).

A t-test measures the difference in group means divided by the pooled standard error of the two group means.

In this way, it calculates a number (the t-value) illustrating the magnitude of the difference between the two group means being compared, and estimates the likelihood that this difference exists purely by chance (p-value).
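A minimal sketch of the t-value for two independent groups is shown below. It assumes the groups have similar variances, so their variances can be pooled. Converting the t-value to a p-value requires the t-distribution, which is left to statistical software here. The function name and data are made up.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t-value: difference in means / pooled standard error."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance, weighting each sample variance by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error

treatment = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]     # hypothetical scores
t_value = pooled_t(treatment, control)  # difference of 2 / standard error of 1
degrees_of_freedom = len(treatment) + len(control) - 2
```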

Your choice of t-test depends on whether you are studying one group or two groups, and whether you care about the direction of the difference in group means.

If you are studying one group, use a paired t-test to compare the group mean over time or after an intervention, or use a one-sample t-test to compare the group mean to a standard value. If you are studying two groups, use a two-sample t-test .

If you want to know only whether a difference exists, use a two-tailed test . If you want to know if one group mean is greater or less than the other, use a left-tailed or right-tailed one-tailed test .

A t-test is a statistical test that compares the means of two samples . It is used in hypothesis testing , with a null hypothesis that the difference in group means is zero and an alternate hypothesis that the difference in group means is different from zero.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test . Significance is usually denoted by a p -value , or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that the data is likely to occur less than 5% of the time under the null hypothesis .

When the p -value falls below the chosen alpha value, then we say the result of the test is statistically significant.

A test statistic is a number calculated by a statistical test. It describes how far your observed data is from the null hypothesis of no relationship between variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean , or how different a linear slope is from the slope predicted by a null hypothesis . Different test statistics are used in different statistical tests.

Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data does not meet these assumptions you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.

Alternative Hypothesis

Alternative hypothesis defines there is a statistically important relationship between two variables. Whereas null hypothesis states there is no statistical relationship between the two variables. In statistics, we usually come across various kinds of hypotheses. A statistical hypothesis is supposed to be a working statement which is assumed to be logical with given data. It should be noticed that a hypothesis is neither considered true nor false.

The alternative hypothesis is a statement used in statistical inference experiment. It is contradictory to the null hypothesis and denoted by H a or H 1 . We can also say that it is simply an alternative to the null. In hypothesis testing, an alternative theory is a statement which a researcher is testing. This statement is true from the researcher’s point of view and ultimately proves to reject the null to replace it with an alternative assumption. In this hypothesis, the difference between two or more variables is predicted by the researchers, such that the pattern of data observed in the test is not due to chance.

For example, suppose researchers monitor the water quality of a river for one year. Under the null hypothesis, there is no change in water quality between the first half of the year and the second half. Under the alternative hypothesis, the water quality is poorer in the second half.

Difference Between Null and Alternative Hypothesis

Basically, there are three types of alternative hypothesis:

Left-Tailed: Here, the population proportion (π) is expected to be less than a specified value, denoted by π 0 , such that

H 1 : π < π 0

Right-Tailed: Here, the population proportion (π) is expected to be greater than some value, denoted by π 0 .

H 1 : π > π 0

Two-Tailed: According to this hypothesis, the population proportion (π) is not equal to a specific value, represented by π 0 .

H 1 : π ≠ π 0

Note: The null hypothesis for all three alternative hypotheses would be H 0 : π = π 0 .
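
The three tail types can be illustrated with a small sketch of a one-sample z-test for a proportion. This is only one possible way to carry out such a test: it assumes a sample large enough for the normal approximation to hold, and the function name `proportion_z_test`, its argument names, and the sample counts are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, pi0, alternative="two-sided"):
    """One-sample z-test for a proportion (normal approximation).

    alternative: "less" (left-tailed), "greater" (right-tailed),
    or "two-sided" (two-tailed). Returns (z, p_value).
    """
    p_hat = successes / n                    # sample proportion
    se = sqrt(pi0 * (1 - pi0) / n)           # standard error under H0: pi = pi0
    z = (p_hat - pi0) / se
    std_normal = NormalDist()                # standard normal distribution
    if alternative == "less":                # H1: pi < pi0
        p_value = std_normal.cdf(z)
    elif alternative == "greater":           # H1: pi > pi0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":         # H1: pi != pi0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, tested against pi0 = 0.5.
for alt in ("less", "greater", "two-sided"):
    z, p = proportion_z_test(60, 100, 0.5, alternative=alt)
    print(f"{alt:>9}: z = {z:.2f}, p-value = {p:.4f}")
```

The same data and the same null value give a different p-value for each alternative, which is why the tail type should be chosen from the research question before the data are examined.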


5.2 - Writing Hypotheses

The first step in conducting a hypothesis test is to write the hypothesis statements that are going to be tested. For each test you will have a null hypothesis (\(H_0\)) and an alternative hypothesis (\(H_a\)).

When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing, (2) the direction of the test (non-directional, right-tailed or left-tailed), and (3) the value of the hypothesized parameter.

  • At this point we can write hypotheses for a single mean (\(\mu\)), paired means (\(\mu_d\)), a single proportion (\(p\)), the difference between two independent means (\(\mu_1-\mu_2\)), the difference between two proportions (\(p_1-p_2\)), a simple linear regression slope (\(\beta\)), and a correlation (\(\rho\)).
  • The research question will give us the information necessary to determine if the test is two-tailed (e.g., "different from," "not equal to"), right-tailed (e.g., "greater than," "more than"), or left-tailed (e.g., "less than," "fewer than").
  • The research question will also give us the hypothesized parameter value. This is the number that goes in the hypothesis statements (i.e., \(\mu_0\) and \(p_0\)). For the difference between two groups, regression, and correlation, this value is typically 0.

Hypotheses are always written in terms of population parameters (e.g., \(p\) and \(\mu\)).  The tables below display all of the possible hypotheses for the parameters that we have learned thus far. Note that the null hypothesis always includes the equality (i.e., =).
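
The three ingredients listed above (parameter, direction, hypothesized value) are enough to write both statements mechanically. The sketch below shows this under simple assumptions: the helper name `write_hypotheses` and the direction labels are hypothetical, and the null is always written with an equals sign, in line with the note that the null hypothesis always includes the equality.

```python
def write_hypotheses(parameter, direction, value):
    """Build null and alternative hypothesis strings.

    parameter: symbol of the population parameter, e.g. "mu" or "p"
    direction: "two-tailed", "right-tailed" or "left-tailed"
    value:     the hypothesized parameter value
    """
    alt_symbols = {
        "two-tailed": "≠",    # "different from", "not equal to"
        "right-tailed": ">",  # "greater than", "more than"
        "left-tailed": "<",   # "less than", "fewer than"
    }
    if direction not in alt_symbols:
        raise ValueError(f"unknown direction: {direction!r}")
    h0 = f"H0: {parameter} = {value}"  # the null always uses equality
    ha = f"Ha: {parameter} {alt_symbols[direction]} {value}"
    return h0, ha


# Hypothetical research question: "Is the mean different from 2.0?"
print(*write_hypotheses("mu", "two-tailed", 2.0), sep="\n")
```

For a difference between two groups, the parameter could be passed as `"mu_1 - mu_2"` with a value of 0, matching the remark that the hypothesized value is typically 0 in that case.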

Hypothesis Testing (cont...)

The null and alternative hypothesis

In order to undertake hypothesis testing you need to express your research hypothesis as a null and alternative hypothesis. The null hypothesis and alternative hypothesis are statements regarding the differences or effects that occur in the population. You will use your sample to test which statement (i.e., the null hypothesis or alternative hypothesis) is most likely (although technically, you test the evidence against the null hypothesis). So, with respect to our teaching example, the null and alternative hypothesis will reflect statements about all statistics students on graduate management courses.

The null hypothesis is essentially the "devil's advocate" position. That is, it assumes that whatever you are trying to prove did not happen ( hint: it usually states that something equals zero). For example, the two different teaching methods did not result in different exam performances (i.e., zero difference). Another example might be that there is no relationship between anxiety and athletic performance (i.e., the slope is zero). The alternative hypothesis states the opposite and is usually the hypothesis you are trying to prove (e.g., the two different teaching methods did result in different exam performances). Initially, you can state these hypotheses in more general terms (e.g., using terms like "effect", "relationship", etc.), as shown below for the teaching methods example:

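How you want to "summarize" the exam performances will determine how you write a more specific null and alternative hypothesis. For example, you could compare the mean exam performance of each group (i.e., the "seminar" group and the "lectures-only" group). This is what we will demonstrate here, but other options include comparing the distributions or medians, amongst other things. As such, we can state:

Null Hypothesis (H 0 ): The mean exam mark for the "seminar" and "lecture-only" teaching methods is the same in the population.

Alternative Hypothesis (H A ): The mean exam mark for the "seminar" and "lecture-only" teaching methods is not the same in the population.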

Now that you have identified the null and alternative hypotheses, you need to find evidence and develop a strategy for declaring your "support" for either the null or alternative hypothesis. We can do this using some statistical theory and some arbitrary cut-off points. Both these issues are dealt with next.

Significance levels

The level of statistical significance is often expressed as the so-called p-value. Depending on the statistical test you have chosen, you will calculate a probability (i.e., the p-value) of observing your sample results (or more extreme) given that the null hypothesis is true. Another way of phrasing this is to consider the probability that a difference in a mean score (or other statistic) could have arisen based on the assumption that there really is no difference. Let us consider this statement with respect to our example where we are interested in the difference in mean exam performance between two different teaching methods. If there really is no difference between the two teaching methods in the population (i.e., given that the null hypothesis is true), how likely would it be to see a difference in the mean exam performance between the two teaching methods as large as (or larger than) that which has been observed in your sample?

So, you might get a p-value such as 0.03 (i.e., p = .03). This means that there is a 3% chance of finding a difference as large as (or larger than) the one in your study given that the null hypothesis is true. However, you want to know whether this is "statistically significant". Typically, if there was a 5% or less chance (5 times in 100 or less) of seeing a difference in the mean exam performance between the two teaching methods (or whatever statistic you are using) as large as the one observed, given the null hypothesis is true, you would reject the null hypothesis and accept the alternative hypothesis. Alternatively, if the chance was greater than 5% (more than 5 times in 100), you would fail to reject the null hypothesis and would not accept the alternative hypothesis. As such, in this example where p = .03, we would reject the null hypothesis and accept the alternative hypothesis. We reject it because, with p = .03 (i.e., less than a 5% chance), a difference this large would occur too rarely under the null hypothesis for us to be confident that chance alone, rather than the two teaching methods, produced it.
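
The definition of the p-value used here (the chance of a difference at least as large as the observed one, given that the null hypothesis is true) can be made concrete with an exact permutation test. This is a sketch of one technique that matches the definition, not necessarily the test the example above relies on; the function name `permutation_p_value`, the exam marks, and the 0.05 cut-off are hypothetical values chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    enumerate every way of splitting the pooled marks into two groups of
    the original sizes and count how often the absolute difference in
    means is at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical exam marks for the two teaching methods.
seminar = [72, 68, 75, 80, 77]
lectures_only = [65, 70, 62, 66, 71]

alpha = 0.05  # assumed significance level
p = permutation_p_value(seminar, lectures_only)
decision = "reject H0" if p <= alpha else "fail to reject H0"
print(f"p-value = {p:.4f} -> {decision}")
```

Enumerating every split is only practical for small groups; for larger samples a random subset of relabelings or a standard parametric test would normally be used instead.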

Whilst there is relatively little justification why a significance level of 0.05 is used rather than 0.01 or 0.10, for example, it is widely used in academic research. However, if you want to be particularly confident in your results, you can set a more stringent level of 0.01 (a 1% chance or less; 1 in 100 chance or less).


One- and two-tailed predictions

When considering whether we reject the null hypothesis and accept the alternative hypothesis, we need to consider the direction of the alternative hypothesis statement. For example, the alternative hypothesis that was stated earlier is:

The alternative hypothesis tells us two things. First, what predictions did we make about the effect of the independent variable(s) on the dependent variable(s)? Second, what was the predicted direction of this effect? Let's use our example to highlight these two points.

Sarah predicted that her teaching method (independent variable: teaching method), whereby she required her students to attend not only lectures but also seminars, would have a positive effect on (that is, increase) students' performance (dependent variable: exam marks). If an alternative hypothesis has a direction (and this is how you want to test it), the hypothesis is one-tailed. That is, it predicts the direction of the effect. If the alternative hypothesis had stated that the effect was expected to be negative, this would also be a one-tailed hypothesis.

Alternatively, a two-tailed prediction means that we do not make a choice over the direction that the effect of the experiment takes. Rather, it simply implies that the effect could be negative or positive. If Sarah had made a two-tailed prediction, the alternative hypothesis might have been:

In other words, we simply take out the word "positive", which implies the direction of our effect. In our example, making a two-tailed prediction may seem strange. After all, it would be logical to expect that "extra" tuition (going to seminar classes as well as lectures) would either have a positive effect on students' performance or no effect at all, but certainly not a negative effect. However, this is just our opinion (and hope) and certainly does not mean that we will get the effect we expect. Generally speaking, making a one-tailed prediction (and testing for it this way) is frowned upon as it usually reflects the hope of a researcher rather than any certainty that it will happen. Notable exceptions to this rule are when there is only one possible way in which a change could occur. This can happen, for example, when biological activity/presence is measured. That is, a protein might be "dormant" and the stimulus you are using can only possibly "wake it up" (i.e., it cannot possibly reduce the activity of a "dormant" protein). In addition, for some statistical tests, one-tailed tests are not possible.

Rejecting or failing to reject the null hypothesis

Let's return finally to the question of whether we reject or fail to reject the null hypothesis.

If our statistical analysis shows that the p-value is below the cut-off value (the significance level) we have set (e.g., either 0.05 or 0.01), we reject the null hypothesis and accept the alternative hypothesis. Alternatively, if the p-value is above the cut-off value, we fail to reject the null hypothesis and cannot accept the alternative hypothesis. You should note that you cannot accept the null hypothesis, but only find evidence against it.


The Difference Between Hypothesis and Theory

A hypothesis is an assumption, an idea that is proposed for the sake of argument so that it can be tested to see if it might be true.

In the scientific method, the hypothesis is constructed before any applicable research has been done, apart from a basic background review. You ask a question, read up on what has been studied before, and then form a hypothesis.

A hypothesis is usually tentative; it's an assumption or suggestion made strictly for the objective of being tested.

A theory, in contrast, is a principle that has been formed as an attempt to explain things that have already been substantiated by data. It is used in the names of a number of principles accepted in the scientific community, such as the Big Bang Theory. Because of the rigors of experimentation and control, it is understood to be more likely to be true than a hypothesis is.

In non-scientific use, however, hypothesis and theory are often used interchangeably to mean simply an idea, speculation, or hunch, with theory being the more common choice.

Since this casual use does away with the distinctions upheld by the scientific community, hypothesis and theory are prone to being wrongly interpreted even when they are encountered in scientific contexts—or at least, contexts that allude to scientific study without making the critical distinction that scientists employ when weighing hypotheses and theories.

The most common occurrence is when theory is interpreted—and sometimes even gleefully seized upon—to mean something having less truth value than other scientific principles. (The word law applies to principles so firmly established that they are almost never questioned, such as the law of gravity.)

This mistake is one of projection: since we use theory in general to mean something lightly speculated, then it's implied that scientists must be talking about the same level of uncertainty when they use theory to refer to their well-tested and reasoned principles.

The distinction has come to the forefront particularly on occasions when the content of science curricula in schools has been challenged, notably when a school board in Georgia put stickers on textbooks stating that evolution was "a theory, not a fact, regarding the origin of living things." As Kenneth R. Miller, a cell biologist at Brown University, has said, a theory "doesn’t mean a hunch or a guess. A theory is a system of explanations that ties together a whole bunch of facts. It not only explains those facts, but predicts what you ought to find from other observations and experiments."

While theories are never completely infallible, they form the basis of scientific reasoning because, as Miller said, "to the best of our ability, we’ve tested them, and they’ve held up."

  • proposition
  • supposition

hypothesis, theory, law mean a formula derived by inference from scientific data that explains a principle operating in nature.

hypothesis implies insufficient evidence to provide more than a tentative explanation.

theory implies a greater range of evidence and greater likelihood of truth.

law implies a statement of order and relation in nature that has been found to be invariable under the same conditions.


Word History

Greek, from hypotithenai to put under, suppose, from hypo- + tithenai to put — more at do

1641, in the meaning defined at sense 1a

Phrases Containing hypothesis

  • counter - hypothesis
  • nebular hypothesis
  • Whorfian hypothesis
  • null hypothesis
  • planetesimal hypothesis


Cite this Entry

“Hypothesis.” Merriam-Webster.com Dictionary , Merriam-Webster, https://www.merriam-webster.com/dictionary/hypothesis. Accessed 23 Mar. 2024.



COMMENTS

  1. Null & Alternative Hypotheses

    When the research question asks "Does the independent variable affect the dependent variable?": The null hypothesis (H0) answers "No, there's no effect in the population." The alternative hypothesis (Ha) answers "Yes, there is an effect in the population." The null and alternative are always claims about the population.

  2. 9.1 Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. H 0, the —null hypothesis: a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

  3. What is an Alternative Hypothesis in Statistics?

    Null hypothesis: µ ≥ 70 inches. Alternative hypothesis: µ < 70 inches. A two-tailed hypothesis involves making an "equal to" or "not equal to" statement. For example, suppose we assume the mean height of a male in the U.S. is equal to 70 inches. The null and alternative hypotheses in this case would be: Null hypothesis: µ = 70 inches.

  4. Examples of null and alternative hypotheses

    The alternative hypothesis--that is, the research hypothesis--is the idea, phenomenon, observation that you want to prove. If you suspect that girls take longer to get ready for school than boys, then: ... The mean from this sample, the mean from the sample, is 7.5 hours. Here's their alternative hypothesis. The average amount of sleep students ...

  5. Null and Alternative Hypotheses

    The null and alternative hypotheses are two competing claims that researchers weigh evidence for and against using a statistical test: Null hypothesis (H0): There's no effect in the population. Alternative hypothesis (HA): There's an effect in the population. The effect is usually the effect of the independent variable on the dependent ...

  6. Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. H 0: The null hypothesis: It is a statement about the population that either is believed to be true or is used to put forth an argument unless it can be shown to be incorrect beyond a reasonable doubt.

  7. Alternative hypothesis

    The statement that is being tested against the null hypothesis is the alternative hypothesis. Alternative hypothesis is often denoted as H a or H 1. In statistical hypothesis testing, to prove the alternative hypothesis is true, it should be shown that the data is contradictory to the null hypothesis. Namely, there is sufficient evidence ...

  8. Alternative hypothesis

    The interpretation must be coherent with the alternative hypothesis. The choice between a one-tailed or a two-tailed test needs to be done in such a way that the interpretation of a rejection is always coherent with the alternative hypothesis. Example As in the previous example, consider a test about the mean of a normal distribution, where we ...

  9. Alternative hypothesis

    The alternative hypothesis is one of two mutually exclusive hypotheses in a hypothesis test.The alternative hypothesis states that a population parameter does not equal a specified value. Typically, this value is the null hypothesis value associated with no effect, such as zero.If your sample contains sufficient evidence, you can reject the null hypothesis and favor the alternative hypothesis.

  10. Null and Alternative Hypotheses

    State the null hypothesis, H 0 , and the alternative hypothesis, H a , in terms of the appropriate parameter (μ or p). The mean number of years Americans work before retiring is 34. At most 60% of Americans vote in presidential elections. The mean starting salary for San Jose State University graduates is at least $100,000 per year.

  11. 8.1.1: Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between the variables—they are not related. This can often be considered the status quo and as a result if you cannot accept the null it requires some action.

  12. 8.4: The Alternative Hypothesis

    Thus, our alternative hypothesis is the mathematical way of stating our research question. If we expect our obtained sample mean to be above or below the null hypothesis value, which we call a directional hypothesis, then our alternative hypothesis takes the form: HA: μ > 7.47 or HA: μ < 7.47.

  13. Alternate Hypothesis in Statistics: What is it?

    The Alternate Hypothesis. The alternate hypothesis is just an alternative to the null. For example, if your null is "I'm going to win up to $1,000" then your alternate is "I'm going to win more than $1,000." Basically, you're looking at whether there's enough change (with the alternate hypothesis) to be able to reject the null ...

  14. What Is an Alternative Hypothesis? (Definition and Examples)

    An alternative hypothesis is an opposing theory to the null hypothesis. For example, if the null hypothesis predicts something to be true, the alternative hypothesis predicts it to be false. The alternative hypothesis often is the statement you test when attempting to disprove the null hypothesis. If you can gather enough data to support the ...

  15. Null Hypothesis and Alternative Hypothesis

    Going back to the above example of mean human body temperature, the alternative hypothesis is "The average adult human body temperature is not 98.6 degrees Fahrenheit." If we are studying a new treatment, then the alternative hypothesis is that our treatment does, in fact, change our subjects in a meaningful and measurable way.

  16. What are null and alternative hypotheses?

    What are null and alternative hypotheses? Null and alternative hypotheses are used in statistical hypothesis testing. The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

  17. Alternative Hypothesis-Definition, Types and Examples

    Definition. The alternative hypothesis is a statement used in statistical inference experiment. It is contradictory to the null hypothesis and denoted by H a or H 1. We can also say that it is simply an alternative to the null. In hypothesis testing, an alternative theory is a statement which a researcher is testing.

  18. 5.2

    Alternative Hypothesis. The statement that there is some difference in the population (s), denoted as H a or H 1. When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing (2) the direction of the test (non-directional, right-tailed or left-tailed), and (3) the value of the hypothesized parameter.

  19. Alternative Hypothesis in Statistics

    The alternative hypothesis is a hypothesis used in significance testing which contains a strict inequality. A test of significance will result in either rejecting the null hypothesis (indicating ...

  20. Alternative Hypothesis

    The alternative hypothesis in research indicates a connection between the two variables in the study, that is, the dependent and independent variables. On the other hand, the null hypothesis asserts that there is no connection between them. An experimental hypothesis predicts what corresponding change (s) will occur in the dependent variable ...

  21. Hypothesis Testing

    Alternative Hypothesis (H A): The mean exam mark for the "seminar" and "lecture-only" teaching methods is not the same in the population. Now that you have identified the null and alternative hypotheses, you need to find evidence and develop a strategy for declaring your "support" for either the null or alternative hypothesis. We can do this ...

  22. Null Hypothesis: Definition, Rejecting & Examples

    When your sample contains sufficient evidence, you can reject the null and conclude that the effect is statistically significant. Statisticians often denote the null hypothesis as H 0 and the alternative hypothesis as H A . Null Hypothesis H 0 : No effect exists in the population. Alternative Hypothesis H A : The effect exists in the population. In every study or experiment, researchers assess an effect or relationship.

  23. Hypothesis Definition & Meaning

    hypothesis: [noun] an assumption or concession made for the sake of argument. an interpretation of a practical situation or condition taken as the ground for action.