What are the different types of hypothesis testing?
By William Brook
Even knowing all the statistics in the world would not help if you leap from those statistics to the wrong conclusion. You might make a multi-million dollar error and cause major damage. This is where hypothesis testing comes in. It combines real-world data with tried and tested analysis tools, and the framework lets you test your assumptions and beliefs. This helps you find out how likely something is, or is not, to within a certain standard of accuracy.
Hypothesis testing involves creating two hypotheses:
- The null hypothesis (H₀) - the assumption that the experimental results have arisen by chance alone and that nothing has influenced them
- The alternative hypothesis (Hₐ) - the assumption that a particular outcome, not chance, explains the results
These hypotheses must be mutually exclusive: if one is true, the other must be false. Once the null and alternative hypotheses have been formulated, they are tested on a sample drawn from the population of interest. Once the results are in, a conclusion can be drawn from them.
The hypothesis testing process consists of the following five steps:
- identifying the question
- determining the significance
- choosing the test
- interpreting the result
- making a decision
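The five steps above can be sketched with a small, hypothetical example using the SciPy library. The fill-volume scenario, the 500 ml target and the sample values below are all invented purely for illustration:

```python
from scipy import stats

# Hypothetical example: a machine is supposed to fill bottles to 500 ml.
# Step 1 - identify the question: is the mean fill volume really 500 ml?
# Step 2 - determine the significance level (alpha), commonly 0.05.
alpha = 0.05

# Step 3 - choose the test: a one-sample t-test suits a small sample
# with unknown population standard deviation.
sample = [498.2, 501.1, 499.5, 497.8, 500.3, 499.0, 498.7, 500.9]
t_stat, p_value = stats.ttest_1samp(sample, popmean=500.0)

# Step 4 - interpret the result: the p-value is the probability of
# seeing data at least this extreme if H0 (mean = 500 ml) were true.
# Step 5 - make a decision: reject H0 only if p < alpha.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {decision}")
```

Note that "fail to reject H0" is not the same as proving H0 true; it only means the sample gives no evidence against it at the chosen significance level.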
What are the different types of hypothesis tests?
There are a number of different types of hypothesis tests, each suited to different scenarios and data samples. The commonly used hypothesis tests are:
- Normality - A normality test checks whether a population sample follows a normal distribution, the most widely known symmetric distribution for continuous data and one used throughout inferential statistics. The symmetry is never perfect in a finite data set. The peak of the normal curve marks the population average and the centre of process variation; the data is grouped around the mean, and a unit is equally likely to fall above or below it.
- T-test - T-tests are used when the population is normally distributed, the standard deviation is not known and the sample size is small. Student's t-distribution can be used to build confidence intervals for the population mean. The t-distribution is flatter and wider than the z-distribution; it becomes narrower as the sample size increases and gradually approaches the normal distribution. Both distributions are symmetric and bell-shaped with a mean of 0, and the more degrees of freedom the t-distribution has, the closer the two become. The two-sample Student's t-test compares two populations using a sample from each.
- Chi-square test for independence - It tests whether there is a significant association between two categorical variables in a population sample, and it is typically used with random sampling. The chi-square distribution is neither bimodal nor flat; it has a skewed bell shape that becomes more symmetric as the degrees of freedom increase.
- Homogeneity of variance (HOV) - This test checks whether two or more population samples have similar dispersion (variance).
- Analysis of variance (ANOVA) - ANOVA tests and analyzes the differences between the means of different groups. It is used in a similar way to a t-test but covers more than two groups, and it accounts for all of the variation in the data. The equality of the sample means can be tested by comparing the sample variances: the variation within treatments and the variation between treatments are totalled separately. In ANOVA terms, the error (within-treatment) variance corresponds to repeatability, while the technical variance corresponds to reproducibility.
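The first four tests above can each be run in a few lines with SciPy's `stats` module. Every data set below is hypothetical, invented purely to illustrate the calls:

```python
from scipy import stats

# Normality: the Shapiro-Wilk test is one common normality test.
# H0: the sample was drawn from a normal distribution.
measurements = [50.2, 49.8, 50.5, 49.9, 50.1, 50.3, 49.7, 50.0,
                50.4, 49.6, 50.2, 49.9, 50.1, 50.3, 49.8, 50.0]
w_stat, w_p = stats.shapiro(measurements)

# T-test: Student's two-sample t-test (equal variances assumed).
# H0: the two population means are equal.
line_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2]
line_b = [12.6, 12.9, 12.5, 12.8, 12.7, 13.0, 12.4]
t_stat, t_p = stats.ttest_ind(line_a, line_b, equal_var=True)

# Chi-square test for independence on a made-up contingency table.
# Rows: day/night shift; columns: product lines A, B, C.
# H0: shift and product line are independent (no association).
observed = [[30, 14, 6],
            [20, 26, 14]]
chi2, chi2_p, dof, expected = stats.chi2_contingency(observed)

# Homogeneity of variance: Levene's test is one common HOV test.
# H0: all groups share the same variance.
s1 = [5.1, 4.9, 5.0, 5.2, 4.8, 5.1]
s2 = [5.4, 4.6, 5.8, 4.4, 5.6, 4.2]
s3 = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8]
lev_stat, lev_p = stats.levene(s1, s2, s3)

for name, p in [("normality", w_p), ("t-test", t_p),
                ("chi-square", chi2_p), ("Levene", lev_p)]:
    print(f"{name}: p = {p:.4f}")
```

In each case the returned p-value is compared with the chosen significance level to decide whether to reject the null hypothesis.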
In ANOVA, the fundamental equation is that the total sum of squared deviations from the grand mean equals the sum of squared deviations of the treatment means from the grand mean plus the sum of squared deviations within the treatments - in short, SS(total) = SS(between) + SS(within). The significance of this method is that the equality of the sample means can be tested by comparing these variance components. Let us look at the types of ANOVA:
- One-way - It uses one measurer and measures a single factor from multiple sources
- Two-way (without replicates) - It uses one technician (when the measurer is not one of the factors) and measures two factors, with one observation per combination
- Two-way (with replicates) - It likewise measures two factors, but with multiple repetitions of each combination
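The fundamental ANOVA identity, total sum of squares equals between-treatments plus within-treatments sum of squares, can be checked numerically in plain Python. The three treatment groups below are made up purely for illustration:

```python
# Hypothetical data: three treatment groups of equal size.
groups = [[4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0],
          [5.0, 6.0, 7.0]]

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)

# Total sum of squares: deviations of every value from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between-treatments sum of squares: deviations of each group mean
# from the grand mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                 for g in groups)

# Within-treatments sum of squares: deviations of each value from its
# own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# The fundamental identity: SS(total) = SS(between) + SS(within).
print(f"SS total = {ss_total:.1f}, "
      f"between + within = {ss_between + ss_within:.1f}")
```

For this data SS(total) is 20, splitting into 14 between the treatments and 6 within them, so the identity holds exactly.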
The ANOVA procedure begins by randomly selecting parts and processes; the parts, technicians and processes are then identified and the data is collected. Next, the hypotheses H0 and H1 are stated, a significance level alpha is chosen and the F statistic is calculated. Finally, the critical value F-alpha is looked up in the table and compared with the calculated F.
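In practice the F statistic and its p-value are usually computed directly rather than looked up in a table; comparing p with alpha is equivalent to comparing the calculated F with the tabulated F-alpha. A minimal sketch with SciPy's `f_oneway`, using three invented process samples:

```python
from scipy import stats

# Hypothetical data: the same part measured under three processes.
process_a = [4.0, 5.0, 6.0]
process_b = [7.0, 8.0, 9.0]
process_c = [5.0, 6.0, 7.0]

alpha = 0.05  # chosen significance level

# H0: all process means are equal; H1: at least one mean differs.
f_stat, p_value = stats.f_oneway(process_a, process_b, process_c)

# Rejecting H0 when p < alpha is the same decision as rejecting it
# when the calculated F exceeds the critical F-alpha from the table.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"F = {f_stat:.3f}, p = {p_value:.4f} -> {decision}")
```

Here F = (14/2)/(6/6) = 7.0, which exceeds the 5% critical value of F with (2, 6) degrees of freedom, so H0 is rejected.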
- Mood's median - It compares the medians of two or more populations.
- Welch's t-test - It tests for equality of means between two population samples without assuming equal variances, which is why it is also known as Welch's unequal variances t-test.
- Kruskal-Wallis H test - It compares two or more groups of an independent variable on the basis of a dependent variable. It is also known as the one-way ANOVA on ranks.
- Box-Cox power transformation - It is a simple power transformation that can help a data set follow a normal distribution more closely.
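The remaining tests above are also available in SciPy's `stats` module. A sketch with made-up delivery-time samples (all values below are invented for illustration; Box-Cox additionally requires strictly positive data):

```python
from scipy import stats

# Hypothetical samples: delivery times (days) from three couriers.
courier_a = [2.1, 2.5, 2.3, 2.8, 2.2, 2.6, 2.4]
courier_b = [3.0, 2.9, 3.2, 2.7, 3.1, 3.3, 2.8]
courier_c = [2.4, 2.6, 2.5, 2.9, 2.3, 2.7, 2.5]

# Mood's median test: H0 says all groups share the same median.
med_stat, med_p, grand_median, table = stats.median_test(
    courier_a, courier_b, courier_c)

# Welch's t-test: ttest_ind with equal_var=False drops the
# equal-variance assumption of Student's t-test.
welch_t, welch_p = stats.ttest_ind(courier_a, courier_b, equal_var=False)

# Kruskal-Wallis H test: the one-way ANOVA on ranks.
h_stat, kruskal_p = stats.kruskal(courier_a, courier_b, courier_c)

# Box-Cox power transformation: returns the transformed values and
# the fitted lambda (only defined for strictly positive data).
skewed = [1.2, 1.5, 2.0, 2.1, 2.4, 3.0, 3.8, 5.5, 7.9, 12.4]
transformed, fitted_lambda = stats.boxcox(skewed)

print(f"median-test p = {med_p:.4f}, Welch p = {welch_p:.4f}, "
      f"Kruskal p = {kruskal_p:.4f}, lambda = {fitted_lambda:.3f}")
```

The nonparametric tests (Mood's median, Kruskal-Wallis) are useful when the normality assumption behind the t-test and ANOVA is doubtful.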
These are the different types of hypothesis testing used in sampling. Hypothesis tests are also commonly described by their 'tails', that is, by where their critical regions lie: left-tailed, right-tailed or two-tailed. One aspect of hypothesis testing that can be confusing is choosing which of these tests to use. Knowing the tests and learning where they apply offers the proper insight, after which the appropriate test can be run on the collected data samples.