MAIN POINTS

Introduction

Statistical inference enables investigators to evaluate the accuracy of their estimates. A second use of inferential statistics is to assess the probability of specific sample results under assumed population conditions. This type of inferential statistics is called hypothesis testing. It helps determine whether sample results characterize the population as a whole or reflect chance occurrences.

The Strategy of Testing Hypotheses

Testing a hypothesis begins with formulating it in statistical terms. The full procedure involves the following steps:

  1. Formulate a null hypothesis and a research hypothesis.
  2. Choose a sampling distribution and a statistical test according to the null hypothesis.
  3. Specify a significance level and define the region of rejection.
  4. Compute the statistical test, and reject or retain the null hypothesis accordingly.
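
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the null-hypothesis mean of 100, and the known population standard deviation of 8 are all hypothetical, and a two-tailed z test is assumed to be the appropriate test.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical data; every number here is an illustrative assumption.
sample = [104, 110, 98, 107, 112, 101, 109, 105, 111, 103]

# Step 1: null hypothesis: population mean = 100;
#         research hypothesis: population mean differs from 100.
mu_0 = 100.0
sigma = 8.0  # population standard deviation, assumed known

# Step 2: under the null hypothesis the sample mean is treated as
#         normally distributed, so a z test is chosen.
standard_error = sigma / sqrt(len(sample))

# Step 3: significance level and two-tailed region of rejection.
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

# Step 4: compute the test statistic and decide.
z = (mean(sample) - mu_0) / standard_error
decision = "reject" if abs(z) > critical_z else "retain"
print(f"z = {z:.2f}, critical z = {critical_z:.2f}, decision: {decision} H0")
```

With these made-up numbers the sample mean is 106, z is about 2.37, and the null hypothesis is rejected at the .05 level.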

Null and Research Hypotheses

Two statistical hypotheses are involved in the process of testing hypotheses: the research hypothesis and the null hypothesis. The need for two hypotheses arises out of a logical necessity: the null hypothesis is based on negative inference in order to avoid the fallacy of affirming the consequent; that is, researchers must eliminate false hypotheses rather than accept true ones. Support for a research hypothesis is provided by data that lead to rejection of a null hypothesis.

Sampling Distribution

Having formulated a specific null hypothesis, the investigator proceeds to test it against the sample result. In order to determine the accuracy of the sample statistic, one has to compare it to a statistical model that gives the probability of observing such a result. Such a statistical model is called a sampling distribution. It is the theoretical distribution that would result if all possible samples of a given size were drawn, a statistic (such as a mean, a proportion, or a correlation) were calculated on each sample, and the results were arrayed in a frequency distribution. Such a distribution is used, in combination with probability theory, to determine how likely it is that a given sample statistic (the one calculated on the sample actually dealt with) is atypical.
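
A sampling distribution can be built by brute force when the population is tiny. The sketch below assumes a hypothetical population of five values and samples of size 2 drawn without replacement; it enumerates every possible sample, computes each sample mean, and arrays the means in a relative frequency distribution.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8, 10]  # hypothetical population
sample_size = 2                # assumed sample size

# Draw all possible samples of the given size (without replacement)
# and calculate the statistic, here the mean, on each one.
sample_means = [mean(s) for s in combinations(population, sample_size)]

# Array the statistics in a frequency distribution.
sampling_distribution = Counter(sample_means)
total = len(sample_means)

for value in sorted(sampling_distribution):
    probability = sampling_distribution[value] / total
    print(f"sample mean {value}: probability {probability:.1f}")
```

Of the ten possible samples, only one has a mean of 9, so a sample mean that high has a probability of .1 in this toy distribution. The mean of all the sample means equals the population mean of 6.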

Level of Significance and Region of Rejection

The range of results in a sampling distribution that are very unlikely to occur if the null hypothesis is true is referred to as the region of rejection. The sum of the probabilities of the results included in the region of rejection is denoted as the level of significance. If a calculated statistic is so extreme that it falls in the region of rejection on a sampling distribution, then the researcher rejects the null hypothesis and concludes that the sample result is real rather than due to chance.
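
The idea that the level of significance is the summed probability of the results in the region of rejection is easiest to see with a discrete sampling distribution. The sketch below assumes a hypothetical experiment of 10 coin tosses, a null hypothesis that the coin is fair, and an upper-tail region of rejection with a nominal level of .05.

```python
from math import comb

n = 10         # hypothetical number of tosses
p_null = 0.5   # null hypothesis: the coin is fair
alpha = 0.05   # nominal significance level

def binomial_probability(k):
    """Probability of exactly k heads in n tosses under the null hypothesis."""
    return comb(n, k) * p_null**k * (1 - p_null) ** (n - k)

# Starting from the most extreme outcome, add outcomes to the region of
# rejection as long as their summed probability does not exceed alpha.
region = []
level = 0.0
for k in range(n, -1, -1):
    probability = binomial_probability(k)
    if level + probability > alpha:
        break
    region.append(k)
    level += probability

print("region of rejection:", region)
print(f"summed probability: {level:.4f}")
```

Here the region of rejection contains only 9 and 10 heads, whose probabilities sum to 11/1024, or about .0107. Adding 8 heads would push the total above .05, so in a discrete distribution the achieved level can fall below the nominal one.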

A statistical test may be one-tailed or two-tailed. In testing a directional research hypothesis, a one-tailed test should be used. This means that the region of rejection for the null hypothesis falls at only one end of the sampling distribution. In testing a nondirectional research hypothesis, a two-tailed test should be used, with the region of rejection divided between both ends of the sampling distribution.
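
The practical difference between the two kinds of test shows up in the critical value. The sketch below assumes a standard normal sampling distribution, a .05 significance level, and a hypothetical observed z of 1.80.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()

# One-tailed: the whole region of rejection sits in one tail.
one_tailed_critical = standard_normal.inv_cdf(1 - alpha)

# Two-tailed: the region of rejection is split between both tails.
two_tailed_critical = standard_normal.inv_cdf(1 - alpha / 2)

z = 1.80  # hypothetical observed test statistic

reject_one_tailed = z > one_tailed_critical
reject_two_tailed = abs(z) > two_tailed_critical

print(f"one-tailed critical z: {one_tailed_critical:.3f}, reject: {reject_one_tailed}")
print(f"two-tailed critical z: {two_tailed_critical:.3f}, reject: {reject_two_tailed}")
```

The critical values are about 1.645 and 1.960, so the same observed z of 1.80 leads to rejection under the one-tailed test but not under the two-tailed test.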

A Type I error occurs when a true null hypothesis is rejected. A Type II error occurs when a null hypothesis that is actually false is retained. The likelihood of a Type I error can be directly controlled because it is determined by the level of significance (alpha) that the researcher selects.
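
A small simulation can make the link between alpha and Type I error concrete. The sketch below assumes a two-tailed z test, a true null hypothesis (population mean 0, standard deviation 1), samples of 25 cases, and 4,000 repetitions; all of these settings are hypothetical. Because the null hypothesis is true in every repetition, every rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

mu_0, sigma, n = 0.0, 1.0, 25  # the null hypothesis is true by construction
trials = 4000

rejections = 0
for _ in range(trials):
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    if abs(z) > critical_z:
        rejections += 1  # a true null hypothesis was rejected

type_i_rate = rejections / trials
print(f"proportion of Type I errors: {type_i_rate:.3f}")
```

The proportion of rejections should come out close to the chosen alpha of .05, varying slightly with the random draws.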

Parametric and Nonparametric Tests of Significance

Parametric tests of significance are based on assumptions about the parameters of the population from which the sample is drawn. Nonparametric statistical tests are used when such assumptions about population characteristics cannot be reasonably made. There are certain assumptions associated with most nonparametric tests; however, they are weaker and fewer than those associated with parametric tests.

Parametric tests include the difference-between-means test. When the dependent variable being investigated is measured on an interval scale, a comparison of means can be used to reflect the strength of the relationship between two variables. The t test is used to test for the significance of differences between sample means. This type of test is used to examine a research hypothesis that the average value of some particular variable differs between two groups in a population (such as males and females, or urban and rural dwellers). A similar t test can be conducted in order to test a research hypothesis that two variables are correlated in a population.
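
The difference-between-means test can be sketched as follows. The two groups and their scores are hypothetical, and the sketch assumes the pooled-variance form of the t test, which treats the two population variances as equal. The critical value 2.228 is the tabled two-tailed value of t for 10 degrees of freedom at the .05 level.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical interval-level scores for two groups.
urban = [5, 7, 8, 6, 9, 7]
rural = [3, 4, 6, 5, 4, 2]

n1, n2 = len(urban), len(rural)
degrees_of_freedom = n1 + n2 - 2

# Pooled estimate of the common variance (equal variances assumed).
pooled_variance = (
    (n1 - 1) * variance(urban) + (n2 - 1) * variance(rural)
) / degrees_of_freedom

# Standard error of the difference between the two sample means.
standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))

t = (mean(urban) - mean(rural)) / standard_error

critical_t = 2.228  # tabled value: df = 10, two-tailed, alpha = .05
reject_null = abs(t) > critical_t

print(f"t = {t:.3f} with {degrees_of_freedom} df, reject H0: {reject_null}")
```

With these made-up scores the group means are 7 and 4, t is about 3.67, and the null hypothesis of no difference between the means is rejected.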

The chi-square test is a nonparametric test used to test the hypothesis that a dependent variable is distributed differently within various conditions of an independent variable. Thus, it provides a significance test of the relationship between the two variables in the population.
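
A chi-square computation for a two-by-two table might look like the sketch below. The observed frequencies are hypothetical, with rows standing for two conditions of an independent variable and columns for two categories of a dependent variable. The critical value 3.841 is the tabled chi-square value for 1 degree of freedom at the .05 level.

```python
# Hypothetical observed frequencies:
# rows = conditions of the independent variable,
# columns = categories of the dependent variable.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Frequency expected if the two variables were unrelated.
        expected = row_totals[i] * column_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841  # tabled value: df = 1, alpha = .05
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f} with {degrees_of_freedom} df, reject H0: {reject_null}")
```

Every expected frequency in this table is 25, so chi-square equals 4.0, which exceeds 3.841; the null hypothesis of no relationship is rejected at the .05 level.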