Friday, April 8, 2011

Understanding Type I and Type II Error

When I first learned statistics it took some time before I completely understood the concepts of Type I and Type II errors and what they were in relation to a hypothesis test.   The concept is not difficult to understand once we get past the language of statistics.

First, we have to realize that almost every time we make a decision based on data there is some chance we will make an error. Unless we are “all knowing”, descriptive statistics simply describe the data we are working with, and usually that data is a sample from a much larger population. We never know with absolute certainty what the true mean of a large population is.

For a typical hypothesis test, recall the null and alternate hypothesis statements for a two-tailed test.

H0: μ = 0   (null hypothesis)
H1: μ ≠ 0   (alternate hypothesis)

The objective of the hypothesis test will be to make a decision about the null and alternate hypothesis statements.   There are only two possible outcomes of this decision.

1)    We reject the null hypothesis
2)    We fail to reject the null hypothesis
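To make the two outcomes concrete, here is a minimal sketch of a two-tailed test of H0: μ = 0 in Python. It assumes a z-test with a known population standard deviation (sigma = 1) and a significance level of 0.05; the function name, the parameter names and the sample values are made up for illustration and are not from any particular textbook.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_tailed_z_test(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu != mu0, assuming sigma is known."""
    n = len(sample)
    # Standardized distance between the sample mean and the hypothesized mean.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-tailed p-value: probability of a |z| at least this large if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Made-up illustrative data, not real measurements.
sample = [0.4, -0.2, 0.9, 1.1, 0.3, 0.7, -0.1, 0.8]
z, p_value, decision = two_tailed_z_test(sample)
print(f"z = {z:.3f}, p = {p_value:.4f}: {decision}")
```

Whatever the data, the function can only ever return one of the two decisions listed above.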

The possibility of error comes in because we make this decision regardless of whether the null hypothesis is actually true or false. This gives us four possible situations, which I will describe and then present in a table to help with understanding the concept visually. If you can memorize this table and recreate it, you will always be able to determine what is meant by Type I and Type II errors.

1)    Type I Error: This is a situation where H0 is true but our statistical test rejects it anyway. Think of this as analogous to convicting an innocent person in our judicial system. If someone is innocent but was convicted anyway, the court has made a Type I error. In our case the null statement was true, but we rejected it. We have made an error. The probability of making a Type I error is the significance level of the test and is written with the Greek letter α (alpha). Type I error is also known as a false positive error. An effect was not present but we claimed it was.

2)    Correct Decision:  Hypothesis testing yields a correct decision in two specific cases.    The first is if H0 is true and we fail to reject the null hypothesis.  To follow our judicial system analogy, a person is innocent and the court finds him not guilty.   This is the correct decision.    The second is if H0 is a false statement and we do reject the null hypothesis.   In our court system a person is truly guilty and the court finds him guilty.  This also is a correct decision.

3)    Type II Error: In this situation H0 is a false statement and we should reject it. However, our hypothesis test leads us to fail to reject the null. We have made a second kind of error, known as Type II. Still keeping with the court system analogy, this is a dangerous situation because a person is truly guilty and yet the jury or court finds him not guilty. The probability of a Type II error is referred to as β (beta). Type II error is also known as a false negative error. An effect was present but we failed to detect it.
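The three situations above can be checked with a small simulation. This is only a sketch under stated assumptions: normally distributed data with a known standard deviation of 1, samples of 25, a two-tailed z-test of H0: μ = 0 at α = 0.05, and a true mean of 0.5 for the case where H0 is false. The function name and all of these settings are chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mu, n=25, sigma=1.0, alpha=0.05, trials=5000, seed=1):
    """Fraction of simulated samples where a two-tailed z-test rejects H0: mu = 0."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (sum(sample) / n) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true (mu really is 0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=0.0)

# H0 is false (mu is really 0.5): every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=0.5)

print(f"Estimated Type I error rate (alpha): {type_1_rate:.3f}")
print(f"Estimated Type II error rate (beta): {type_2_rate:.3f}")
print(f"Estimated power (1 - beta):          {1 - type_2_rate:.3f}")
```

The estimated Type I rate should land close to the chosen α of 0.05. The Type II rate depends on how far the true mean is from 0 and on the sample size, so a different true mean or a different n will give a different β.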


Here is a table that is often presented to show the relationship between Type I and Type II errors. As stated previously, if you can memorize and recreate this table you will be able to identify Type I and Type II errors.





                      H0 is True                  H0 is False

Fail to Reject H0     Correct Decision            Type II error
                      Confidence Level (1-α)      β

Reject H0             Type I error                Correct Decision
                      Significance Level = α      Power = (1-β)



Now let’s practice on a couple of situations to help identify the correct type of error.

Problem #1 – A tire company rejected a batch of rubber from their supplier stating that it was out of specification in hardness.   Later the supplier showed that the material was in spec and it was discovered an error was made in hardness analysis on the part of the tire company.   What type of error did the tire company make and why? 

In this case the tire company committed a Type I error. The null hypothesis, that the hardness of this batch did not differ from the hardness specification, was true, yet the company rejected it. The tire company should not have rejected the material.

Problem #2 – A jet aircraft engine manufacturer has inspected a lot of 100 turbine blades for its prototype jet engine. The blades are put in production because they have passed all QC checks including a critical balance tolerance. Later an accident occurs with an engine. What type of error was made?

In this case the jet engine manufacturer made a Type II error. The turbine blades were out of balance, but the QC checks failed to detect it.

The language of statistics sometimes confounds us with lingo such as Type I, Type II, false positive and false negative. Practicing on examples and committing the definitions to memory will help improve understanding of errors in statistical hypothesis testing.

Saturday, October 16, 2010

Wikipedia - More reliable than you think!

I have worked with a lot of students in Research and Evaluation classes. Often students tell me that other instructors have told them that Wikipedia is not a valid or reliable source of information and should never be cited as a reference. I think academia is going overboard on this. Wikipedia suffered a bad reputation after its startup because it allows anyone to change information, and some hoaxes did occur. Since then they have made changes. I would challenge anyone to look up a list of facts in Wikipedia and find out just how accurate the information is. Hands down, Wikipedia is fast, free and very accurate. Certainly for a PhD dissertation or Master's thesis you would want peer-reviewed journal articles for background information on complex topics. However, for day-to-day fact finding or as a starting point for research, Wikipedia is a great tool to have at our fingertips.

Saturday, April 24, 2010

Using CliffsNotes.com

One of the first things to learn in a statistics class is that statistics can describe populations or they can describe samples from a population. Therefore you can calculate descriptive statistics such as a mean, median or mode for a population or for a sample. A sample is simply a subset of individuals drawn from the total set of individuals of a particular type (the population). Your set theory from math class is a good reminder, and Venn diagrams can be used to show how a sample comes from a much larger population.

Sometimes, if you need more information than what is available in your text or your school's library, use CliffsNotes.

A post at CliffsNotes.com is a good reference for more information on understanding the mean, variance and standard deviation. It is important to understand that there are two ways to calculate a standard deviation and variance, depending on whether you are working with a population or a sample. Read your homework carefully and decide which situation you are working with.
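As a quick illustration of the two calculations, here is a short sketch using Python's built-in statistics module. The scores are made-up numbers. The population formulas divide the sum of squared deviations by N, while the sample formulas divide by n - 1.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

# Made-up illustrative data: six quiz scores.
scores = [4, 8, 6, 5, 3, 7]

average = mean(scores)

# Treat the six scores as the entire population: divide by N.
pop_var = pvariance(scores)
pop_sd = pstdev(scores)

# Treat the six scores as a sample from a larger population: divide by n - 1.
samp_var = variance(scores)
samp_sd = stdev(scores)

print(f"mean = {average}")
print(f"population variance = {pop_var:.4f}, population std dev = {pop_sd:.4f}")
print(f"sample variance     = {samp_var:.4f}, sample std dev     = {samp_sd:.4f}")
```

For the same data the sample values are always a little larger than the population values, because the divisor is smaller.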

Once you are on the CliffsNotes.com website, search for the term statistics and you should find at least 30 different articles to help with your understanding of basic and inferential statistics.

Reference:
CliffsNotes.com. Populations and Samples. 24 Apr 2010
http://www.cliffsnotes.com/study_guide/topicArticleId-25951,articleId-25920.html.

Friday, April 23, 2010

Humorous look at Correlation and Causation

Of course your professors have told you that correlation does not equal causation. Just because two variables are associated does not mean that one causes the other. Here is a humorous look at how some studies infer cause and effect when they should not.



Here is the source link from youtube.com:
http://www.youtube.com/watch?v=4XAItyUJIB0&feature=related

Do you need help with Statistics Class?

Welcome, this will be the first of a series of posts providing help and resource information to students of research and statistics classes.

If you are taking a business statistics class the following links might be a life saver for you.
  1. Mathwizz.com - Hypothesis Testing
  2. NIST Statistics Handbook
  3. Statsoft Electronic Statistics Textbook
  4. Vassar College - Online Textbook
  5. Table of Critical Values of T
  6. Online Statistical Tables
  7. Hypothesis Testing Roadmap
Not a great deal of help, but maybe something here will make the lights go on!
