Applied Statistics Handbook

Hypothesis Testing Basics 

The chain of reasoning and the systematic steps outlined in this section are the backbone of every statistical test, whether you write out each step by hand in a classroom setting or use statistical software to conduct tests on variables stored in a database.

Chain of reasoning for inferential statistics

1.  Sample(s) must be randomly selected.

2.  The sample estimate is compared to the sampling distribution of estimates from samples of the same size.

3.  The probability that the sample estimate reflects the population parameter is determined.
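This chain of reasoning can be sketched with a small simulation. The population values, sample size, and number of resamples below are hypothetical choices for illustration: draw one random sample, build the sampling distribution of means from samples of the same size, and see how often chance alone produces an estimate as far from the population parameter as the one observed.

```python
import random
import statistics

random.seed(1)

# Hypothetical population: 20,000 scores with mean ~100, SD ~15
population = [random.gauss(100, 15) for _ in range(20_000)]
pop_mean = statistics.mean(population)

# Step 1: randomly select one sample
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

# Step 2: build the sampling distribution of means for samples of the same size
sampling_dist = [statistics.mean(random.sample(population, 50))
                 for _ in range(1_000)]

# Step 3: estimate the probability that chance alone yields a sample mean
# at least this far from the population mean
diff = abs(sample_mean - pop_mean)
p = sum(abs(m - pop_mean) >= diff for m in sampling_dist) / len(sampling_dist)
print(f"sample mean = {sample_mean:.2f}, estimated probability = {p:.3f}")
```

A small estimated probability would suggest the sample did not come from this population; a large one is consistent with ordinary sampling variation.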

The four possible outcomes in hypothesis testing

                         Actual Population Comparison
                         Null Hyp. True             Null Hyp. False
                         (there is no difference)   (there is a difference)
  Rejected Null Hyp.     Type I error               Correct decision
  Did not Reject Null    Correct decision           Type II error

(Alpha = probability of making a Type I error)

Regardless of whether statistical tests are conducted by hand or through statistical software, there is an implicit understanding that systematic steps are being followed to determine statistical significance.  These general steps are described on the following page and include 1) assumptions, 2) stated hypothesis, 3) rejection criteria, 4) computation of statistics, and 5) decision regarding the null hypothesis. 
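The five steps can be traced in a short script for a one-sample z-test. The data, hypothesized mean, and population standard deviation below are invented for the example:

```python
import math

# 1) Assumptions: random sample, interval data, known population SD (assumed here)
sample = [102, 98, 110, 105, 97, 104, 101, 108, 99, 106]  # hypothetical data
pop_mean_h0 = 100   # 2) Stated null hypothesis: population mean = 100
pop_sd = 5
alpha = 0.05        # 3) Rejection criteria: alpha .05, two-tailed
critical = 1.96     #    critical value for alpha .05, two-tailed

# 4) Computation of the test statistic
n = len(sample)
sample_mean = sum(sample) / n
z = (sample_mean - pop_mean_h0) / (pop_sd / math.sqrt(n))

# 5) Decision regarding the null hypothesis
decision = "reject" if abs(z) >= critical else "fail to reject"
print(f"z = {z:.2f}: {decision} the null hypothesis")
```

With these made-up numbers the test statistic falls just short of the critical value (z ≈ 1.90 < 1.96), so the null hypothesis is not rejected even though the sample mean differs from 100.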

The underlying logic is based on rejecting a statement of no difference or no association, called the null hypothesis.  The null hypothesis is only rejected when we have evidence beyond a reasonable doubt that a true difference or association exists in the population(s) from which we drew our random sample(s). 

Reasonable doubt is based on probability sampling distributions and can vary at the researcher's discretion.  Alpha .05 is a common benchmark for reasonable doubt.  At alpha .05 we know from the sampling distribution that a test statistic this extreme will occur by random chance only five times out of 100 (a 5% probability).  Because such a test statistic would occur by chance only 5% of the time, we conclude that it reflects true differences between the population parameters, not an extremely biased random sample.

When learning statistics we generally conduct statistical tests by hand.  In these situations, we establish before the test is conducted what test statistic is needed (called the critical value) to claim statistical significance.  So, if we know for a given sampling distribution that a test statistic of plus or minus 1.96 would occur by chance only 5% of the time, any test statistic of 1.96 or greater in absolute value would be statistically significant.  In an analysis where the test statistic was exactly 1.96, you would have a 5% chance of being wrong if you claimed statistical significance.  If the test statistic was 3.00, statistical significance could also be claimed, but the probability of being wrong would be much smaller (about .003 for a 2-tailed test, or three-tenths of one percent; 0.3%).  Both .05 and .003 are known as alpha, the probability of a Type I error.
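These tail probabilities can be checked against the standard normal distribution. A minimal sketch using the standard normal CDF, written in terms of `math.erf`:

```python
import math

def two_tailed_p(z):
    """Probability of a standard normal value at least |z| from zero."""
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # standard normal CDF
    return 2 * (1 - phi)

print(round(two_tailed_p(1.96), 3))   # critical value for alpha .05
print(round(two_tailed_p(3.00), 4))   # a more extreme test statistic
```

The first call returns about .05 and the second about .0027, matching the probabilities discussed above.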

When conducting statistical tests with computer software, the exact probability of a Type I error is calculated.  It is presented in several formats but is most commonly reported as "p <" or "Sig." or "Signif." or "Significance."  Using "p <" as an example, if a priori you established a threshold for statistical significance at alpha .05, any test statistic with significance at or below .05 would be considered statistically significant, and you would reject the null hypothesis of no difference.  The following table links p values with a constant alpha benchmark of .05:
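The decision rule this software output implies can be written as a short function (alpha .05 is assumed as the a priori threshold, and the p values below mirror the table that follows):

```python
def decide(p_value, alpha=0.05):
    """Compare a reported p value to the a priori alpha threshold."""
    if p_value <= alpha:
        return "statistically significant"
    return "not statistically significant"

for p in (0.05, 0.10, 0.01, 0.96):
    print(f"p = {p:.2f}: {decide(p)}")
```

Note that the comparison is made against the fixed alpha chosen before the test, not against the p value itself.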

  P <    Probability of Making a Type I Error           Decision
  .05    5% chance the difference is not significant    Statistically significant
  .10    10% chance the difference is not significant   Not statistically significant
  .01    1% chance the difference is not significant    Statistically significant
  .96    96% chance the difference is not significant   Not statistically significant


Copyright 2015, AcaStat Software. All Rights Reserved.