# Statistical Diagram: Cause and Effect Diagrams, Box Plots, Chi-Square, Student’s-T, F-Distributions

• Post last modified:4 May 2023
• Post category:Lean Six Sigma

## Cause and Effect Diagrams

Cause and effect diagrams, also called Fishbone diagrams or Ishikawa diagrams, were developed by Dr. Kaoru Ishikawa. A project team uses them to organize and graphically display all the knowledge it has about a particular problem. Because any given quality problem can have several possible causes, these diagrams can also be used to display all of those possible causes graphically.

An ideal Fishbone diagram contains a lot of sub-causes or small bones. If the diagram has a very limited number of causes and sub-causes, it means that the project team’s understanding of the problem is very limited and superficial. It may also indicate that the team that created the Fishbone diagram needs to consult someone else who might be more closely associated with the problem.

### Types of Cause and Effect Diagrams

Different types of Fishbone diagrams are described as follows:

#### Simple Fishbone Diagram

The simple Fishbone diagram serves as a basic template for preparing a cause-and-effect diagram. This is the simplest representation of a cause-and-effect diagram.

#### 4S Fishbone Diagram

This diagram is used in the service industry. A 4S Fishbone diagram organizes all the potential causes into four common categories, namely suppliers, systems, surroundings, and skills; hence the name 4S Fishbone diagram.

#### 8P Fishbone Diagram

This diagram is also used in the service industry. This diagram organizes all the potential causes into eight common categories, namely procedures, policies, place, product, people, processes, price, and promotion; hence, the name 8P Fishbone diagram.

#### Man Machines Materials Fishbone Diagram

This diagram is commonly used in the manufacturing industry. This diagram organizes all the potential causes into the following categories: man, materials, machine, methods, measurements, environment, management/money, and maintenance.

#### Design of Experiments Fishbone Diagram

This diagram organizes all the potential causes into the following categories: controllable, uncontrollable, held-constant, and blockable nuisance. This type of Fishbone diagram allows structured brainstorming about the potential factors for a response variable. Identification of these factors helps in designing an experiment.

#### Others

There are other types of Fishbone diagrams also. They include the dispersion analysis Fishbone diagram, production process class Fishbone diagram, cause enumeration Fishbone diagram, etc.
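The category sets described above can be sketched as a small data structure. This is only an illustration: the category names come from the text, while the dictionary keys, function names, and the sample causes are hypothetical. The last helper reflects the earlier point that a diagram with very few causes suggests the team needs more input.

```python
# Category names are taken from the article; the keys are illustrative labels.
FISHBONE_TEMPLATES = {
    "4S": ["suppliers", "systems", "surroundings", "skills"],
    "8P": ["procedures", "policies", "place", "product",
           "people", "processes", "price", "promotion"],
    "manufacturing": ["man", "materials", "machine", "methods", "measurements",
                      "environment", "management/money", "maintenance"],
    "design_of_experiments": ["controllable", "uncontrollable",
                              "held-constant", "blockable nuisance"],
}


def new_fishbone(problem, template):
    """Return an empty diagram with one 'bone' (list of causes) per category."""
    categories = FISHBONE_TEMPLATES[template]
    return {"problem": problem, "bones": {name: [] for name in categories}}


def add_cause(diagram, category, cause):
    """Attach a cause to an existing category (unknown categories raise KeyError)."""
    diagram["bones"][category].append(cause)


def sparse_categories(diagram, minimum=1):
    """List categories holding fewer than `minimum` causes."""
    return [name for name, causes in diagram["bones"].items()
            if len(causes) < minimum]


# Hypothetical problem and causes, invented for illustration only.
diagram = new_fishbone("Late deliveries", "4S")
add_cause(diagram, "suppliers", "Single-source vendor")
add_cause(diagram, "skills", "New staff not yet trained")
print(sparse_categories(diagram))  # → ['systems', 'surroundings']
```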

### Concept of CEDAC

Now that you are aware of the concept of the Fishbone diagram, you must understand what a CEDAC is. Cause And Effect Diagram with the Addition of Cards (CEDAC) is a variant of the cause and effect diagram.

It was developed by Dr. Ryuji Fukuda of Japan. In CEDAC, cards are distributed to anyone who is associated with the process and they write their ideas on these cards outside the meeting room. Most of the time, these cards provide much more information than provided by the usual cause-and-effect diagram. In CEDAC, the diagram is prepared by placing the cards in place of the bones.

## Box Plots

A box plot is a statistical tool that displays summary statistics for a set of distributions. It is a plot created using the 25th, 50th, and 75th percentiles.

Note that a box plot is used to display information about the range, the median, and the quartiles for a set of observations that are represented alongside a number line.

The 25th percentile is the lower boundary of the box whereas the 75th percentile is the upper boundary of the box. The 50th percentile is the median of the overall data set. The 25th percentile is the median of the observations below the overall median, and the 75th percentile is the median of the observations above the overall median. Note that the vertical line inside the box is the median. The box length is represented by the interquartile range which is the difference between the 75th and 25th percentiles.
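The quartile definitions above can be sketched in a few lines of Python. This is a minimal example that assumes the median-of-halves rule described in the text and at least two observations; the function name and the data are hypothetical. Other percentile conventions, such as interpolation, can give slightly different quartiles.

```python
from statistics import median


def box_summary(values):
    """Return (q1, median, q3, iqr) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # observations below the overall median
    upper = data[half + n % 2:]   # observations above the overall median
    q1, med, q3 = median(lower), median(data), median(upper)
    return q1, med, q3, q3 - q1


# Hypothetical observations.
print(box_summary([2, 4, 4, 5, 6, 7, 8, 9, 12]))  # → (4.0, 6, 8.5, 4.5)
```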

Note that the values more than 1.5 box lengths below the 25th percentile and the values more than 1.5 box lengths above the 75th percentile are called outliers and are represented by small circles. Similarly, the values more than 3 box lengths below the 25th percentile and the values more than 3 box lengths above the 75th percentile are called extremes and are represented by asterisks.
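The 1.5 and 3 box-length rules can be expressed as a short classifier. This sketch assumes the quartiles are already known; the function name, labels, and numbers are hypothetical. Because a value more than 3 box lengths away is also more than 1.5 box lengths away, the stricter test is checked first.

```python
def classify(value, q1, q3):
    """Label a value as 'extreme', 'outlier', or 'regular' relative to the box."""
    box_length = q3 - q1                       # interquartile range
    distance = max(q1 - value, value - q3, 0)  # distance outside the box, 0 if inside
    if distance > 3 * box_length:
        return "extreme"
    if distance > 1.5 * box_length:
        return "outlier"
    return "regular"


# Hypothetical quartiles and observations.
q1, q3 = 4.0, 8.5
labels = {v: classify(v, q1, q3) for v in [5, 12, 16, 25]}
print(labels)  # → {5: 'regular', 12: 'regular', 16: 'outlier', 25: 'extreme'}
```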

Lines are drawn from either end of the box, that is, from the 25th and 75th percentiles, to the smallest and largest values that are not outliers. Each such line is called a whisker, and the plot is therefore also called a box and whisker plot. The box plot can be studied to reveal important information regarding the observation values. The median value can be used to determine the central tendency or the location. The length of the box or the interquartile range can be used to determine the spread or the variability of the observations.

If the median of the values does not lie in the middle of the box, the examiner can conclude that the observations used to draw the plot are skewed. If the median is closer to the 25th percentile, the longer part of the box lies above the median and the data are positively skewed; if the median is closer to the 75th percentile, the data are negatively skewed. A box plot can be drawn either horizontally or vertically.

One of the most striking features of box plots is that they are also used for comparing the distribution of various groups of observations. The range, median, central tendency, and variability of all groups of observations can be observed easily in the box plot.

Averages and ranges control charts and distributions are used in the Measure phase of the DMAIC process. Three distributions, namely Chi-Square, Student’s T, and F, are used in the Analyse phase to test hypotheses, construct confidence intervals, and compute control limits. All three are based on normal population distributions. We will study these distributions in the next sections.

## Chi-Square

The majority of the Six Sigma characteristics have normal or approximately normal distributions. However, some sampling distributions, such as that of the sample variance drawn from a normal population, take the form of a Chi-Square distribution. Chi-Square is represented as χ². A reference table is created for the Chi-Square distribution using abscissa values for selected ordinates of the cumulative distribution.

The Chi-Square distribution varies with the quantity ν (nu), the degrees of freedom, which equals the sample size minus one (n − 1). The probability density function for χ² is as follows:

$$f(\chi^2) = \frac{(\chi^2)^{\nu/2 - 1}\, e^{-\chi^2/2}}{2^{\nu/2}\,\Gamma(\nu/2)}, \qquad \chi^2 > 0$$

Let us draw the distribution for the χ² function with n = 5, that is, ν = 4.

Check the value of the Chi-Square distribution function for different values of χ².
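The density values for this case can be checked numerically. The sketch below assumes the standard Chi-Square density written with the Gamma function and uses n − 1 = 4 degrees of freedom; the function name and the chosen evaluation points are hypothetical.

```python
import math


def chi_square_pdf(x, dof):
    """Chi-Square density with `dof` degrees of freedom, for x > 0."""
    if x <= 0:
        # Simplification for this sketch: exact only when dof > 2.
        return 0.0
    numerator = x ** (dof / 2 - 1) * math.exp(-x / 2)
    denominator = 2 ** (dof / 2) * math.gamma(dof / 2)
    return numerator / denominator


dof = 5 - 1  # sample size n = 5
for x in (1, 2, 4, 8):
    print(x, round(chi_square_pdf(x, dof), 4))
```

With 4 degrees of freedom the density rises to a peak at χ² = 2 and then tails off to the right, which is the asymmetric shape typical of this distribution.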

## Student’s-T

Student’s T-distribution, on which the T-statistic is based, was developed by W.S. Gosset, who published under the pseudonym Student. Therefore, this distribution is also called the Student’s-T distribution. As a rule of thumb, the T-test is used when the sample size is less than 30. T-distributions are flatter and wider than the normal distribution. If the sample size is 30 or more, the project team should use the Z-distribution.

T-statistic is also used when the standard deviation of the population is unknown. With the increase in the sample size, the peakedness (tallness) of the distribution increases and it moves towards a normal distribution.
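The growth in peakedness can be illustrated by evaluating the height of the T-distribution at its centre for increasing sample sizes. This is a sketch under the assumption that the degrees of freedom equal n − 1; the function name and the sample sizes are hypothetical.

```python
import math


def t_peak_height(dof):
    """Density of Student's T-distribution at t = 0 for `dof` degrees of freedom."""
    # lgamma avoids overflow of the Gamma function for large dof.
    log_ratio = math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
    return math.exp(log_ratio) / math.sqrt(dof * math.pi)


NORMAL_PEAK = 1 / math.sqrt(2 * math.pi)  # standard normal density at z = 0

for n in (2, 5, 10, 30, 100):
    print(n, round(t_peak_height(n - 1), 4))
print("normal", round(NORMAL_PEAK, 4))
```

The printed heights increase with n and approach the normal peak from below.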

It must be noted that both T- and Z-distributions are symmetrical and bell-shaped. Both distributions have a mean of zero. T-statistic is used for finding confidence intervals for the population means. It is also used to test hypotheses about population means (based on sample data), regression coefficients, and various other statistics used in quality engineering.

The T-test is a hypothesis test that is used to determine if there is actually a difference between the standard and the mean of a particular data set or if it just appears to be so due to random chance. For example, T-Statistic can be used to test whether a supplier who promised an average fill rate of 100 pieces is adhering to the promised supply or not.

Gosset carried out some small-scale experiments and wanted to quantify their results. This motivated him to develop the T-statistic. The T-statistic is a ratio whose probability integral has been tabulated. The formula for a one-sample T-statistic is:

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

where $\bar{X}$ is the sample mean, $\mu$ is the standard (the hypothesized population mean), $s$ is the sample standard deviation, and $n$ is the sample size.

The T-statistic is based on the assumption that the population is normally distributed with mean µ and variance σ².
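As an illustration of the supplier example, the one-sample T-statistic can be computed with the standard form t = (x̄ − µ) / (s / √n). The delivery counts below are invented, and the function name is hypothetical; only the promised average of 100 pieces comes from the text.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, target):
    """Return the one-sample T-statistic of `sample` against the standard `target`."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - target) / standard_error


# Hypothetical pieces received in eight deliveries.
deliveries = [98, 101, 97, 99, 100, 96, 102, 99]
t = one_sample_t(deliveries, 100)
print(round(t, 3))  # → -1.414
```

The absolute value of t would then be compared with the tabulated T value for n − 1 = 7 degrees of freedom at the chosen significance level.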

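When two sample means are compared, a commonly used form of the two-sample T-statistic is:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

where:

$\bar{X}_1$ = Mean of the first sample

$\bar{X}_2$ = Mean of the second sample

$S_1$ = Standard deviation of the first sample

$S_2$ = Standard deviation of the second sample

$n_1$ = Sample size of the first data set

$n_2$ = Sample size of the second data set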
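The quantities defined above can be combined into a two-sample T-statistic. The sketch below assumes the unpooled form t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂); a pooled-variance form also exists and may be the one intended, so treat this as one possible reading. The function name and data are hypothetical.

```python
import math
from statistics import mean, variance


def two_sample_t(first, second):
    """Unpooled two-sample T-statistic for the difference of two sample means."""
    n1, n2 = len(first), len(second)
    # variance() is the squared sample standard deviation (n - 1 divisor).
    standard_error = math.sqrt(variance(first) / n1 + variance(second) / n2)
    return (mean(first) - mean(second)) / standard_error


# Hypothetical measurements from two data sets.
first = [10, 12, 14, 16]
second = [9, 11, 13, 15]
t = two_sample_t(first, second)
print(round(t, 4))  # → 0.5477
```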

## F-Distributions

In the preceding section, you saw that the T-test is used to compare a sample mean with a standard or acceptable value or to compare two sample means. F-statistic is used to compare two standard deviations or variances.

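Assume that two random samples (data sets) are drawn from a normally distributed population. The standard deviations of these samples are s1 and s2, respectively. The sample sizes of the data sets may or may not be different. Here, the F-statistic is calculated as the ratio of the two sample variances:

$$F = \frac{s_1^2}{s_2^2}$$

By convention, the larger variance is usually placed in the numerator so that F is at least 1.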
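Starting from raw data, the F-statistic is the ratio of the two sample variances. The sketch below assumes the common convention of placing the larger variance in the numerator and returns the matching degrees of freedom; the function name and data are hypothetical.

```python
from statistics import variance


def f_statistic(sample_a, sample_b):
    """Return (F, numerator dof, denominator dof) with the larger variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical data sets of different sizes.
f_value, dof_num, dof_den = f_statistic([10, 12, 14, 16], [11, 12, 13, 14, 15])
print(round(f_value, 3), dof_num, dof_den)  # → 2.667 3 4
```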

### Example

Two machines, A and B, produce outputs through stable processes. An executive tested 61 samples from machine A’s output and 30 samples from machine B’s output. He found that the variance of machine A’s output is 11 whereas that of machine B is 5. Test the hypothesis that both variances are equal.

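The calculated value of the F-statistic is the ratio of the two variances: F = 11 / 5 = 2.2.

Now, we will check the tabulated value of the F-statistic using the F-distribution table for the degrees of freedom of the numerator and denominator.

Degrees of freedom of the numerator = Sample size – 1 = 61 – 1 = 60

Degrees of freedom of the denominator = Sample size – 1 = 30 – 1 = 29

Tabulated value of F-statistic = F (60, 29) = 2.23

Compare this value with the calculated value. Because the calculated value (2.2) is less than the tabulated value (2.23), the hypothesis that both variances are equal cannot be rejected at the significance level of the table used.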
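The comparison in this example can be written out step by step. The variances, sample sizes, and the tabulated value 2.23 are taken from the example; the variable names are hypothetical, and the significance level behind the tabulated value is not stated in the text.

```python
# Figures from the example.
variance_a, n_a = 11, 61   # machine A
variance_b, n_b = 5, 30    # machine B
F_TABLE = 2.23             # tabulated F(60, 29) quoted in the example

f_calculated = variance_a / variance_b          # larger variance in the numerator
degrees_of_freedom = (n_a - 1, n_b - 1)
reject_equal_variances = f_calculated > F_TABLE

print(f_calculated, degrees_of_freedom, reject_equal_variances)
# → 2.2 (60, 29) False
```

Since the calculated ratio does not exceed the tabulated value, the data do not give enough evidence to say that the two variances differ.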
