Southwest Airlines Departure Delay Variance t-Test

Eric M. Guerrero

Embry-Riddle Aeronautical University

Southwest Airlines Departure Delay Variance t-Test

Introduction

This paper examines departure delays for Southwest Airlines. Specifically, it compares the departure delay times of Southwest Airlines flights on Monday with those on Friday. Using a sample of flight data, it tests whether the average delays on the two days are the same or different. The research question is: “Is the average departure delay for Southwest Airlines the same on Monday and Friday?”

Populations

A t-test needs populations from which to draw data. As stated earlier, the population this paper examines is Southwest Airlines flights, specifically all flights from January 2, 2012 through January 6, 2012. These flights span several airports: Chicago Midway (MDW), Houston Hobby (HOU), and Dallas (DAL).

Variables

Within this population, this paper focuses on the variables that pertain to the research question. The first variable is the departure delay time in minutes. The other variable is the day of the departure, that is, whether the flight departed on Monday or Friday. These variables are labeled differently in the data set: departure delay time in minutes is DepDelayMin, Monday is Day 2, and Friday is Day 6.

Data Collection

The data is provided by a website that has previously collected it. The URL is http://www.statcrunch.com/app/index.php?dataid=599130&groupid=789. Times in this data set are measured in minutes, and actual clock times are given on a 24-hour clock. Only a few days are recorded, and they are labeled by number: Monday (Day 2), Wednesday (Day 3), and Friday (Day 6). Each flight is also numbered under the Row column. A few other variables are listed on this sheet, but they do not pertain to the research question.

For sampling, this paper takes a random sample of ten flights from each of Monday and Friday (20 different flights). There are a total of 473 flights on Monday (Day 2) and 342 flights on Friday (Day 6); the Friday flights are numbered from 1893 to 2235. To keep it simple, this paper uses a random number generator (https://www.random.org/) to generate the sample. For Monday, the minimum is set to 1 and the maximum to 473. A number is then generated 10 times, and for each number the flight information that corresponds to it and pertains to the research question is copied. Any draw that repeats a flight number already drawn is re-rolled. For Friday, 1893 is entered as the minimum and 2235 as the maximum, and the process is repeated as for Monday (Day 2). The data pertaining to the research question are shown below.

Day 2 Flight #    DepDelayMin    Day 6 Flight #    DepDelayMin
273               0              1963              0
33                5              2206              2
24                0              1911              0
120               0              2168              0
439               5              2122              0
144               12             2095              7
328               33             1914              0
257               0              2177              0
433               25             1952              11
314               5              2102              8
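The sampling procedure described in the Data Collection section can be sketched in Python. This is only an illustration under stated assumptions: the paper used https://www.random.org/, not this code, and the function name and seed below are hypothetical. Drawing without replacement plays the role of re-rolling repeated flight numbers.

```python
import random


def draw_flight_sample(first_row, last_row, k=10, seed=None):
    """Draw k distinct row numbers from first_row..last_row (inclusive).

    Hypothetical helper for illustration; the paper drew its numbers
    by hand with random.org.
    """
    rng = random.Random(seed)
    # rng.sample draws without replacement, so no flight can repeat
    # and no re-rolls are needed.
    return sorted(rng.sample(range(first_row, last_row + 1), k))


# Row ranges as stated in the paper: Monday 1-473, Friday 1893-2235.
monday_rows = draw_flight_sample(1, 473, seed=2012)
friday_rows = draw_flight_sample(1893, 2235, seed=2012)

print("Monday (Day 2) rows:", monday_rows)
print("Friday (Day 6) rows:", friday_rows)
```

Because the draws are random, the rows printed here will not match the flights listed in the table above.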

Study Design

With this data, a t-test can now be performed. Because the data consist of two samples that are independent of each other, the appropriate test is a two-sample t-test, as opposed to a one-sample t-test, where there is only one sample of data, or a paired t-test, where the two samples are dependent on one another. The two-sample t-test determines whether there is a difference between the means of the two days. From the initial question, two hypotheses can be derived. The null hypothesis (H0) states that any observed difference is not due to the variables listed earlier, and the alternate hypothesis (HA) states that it is. Here, the null hypothesis is that Day 2 and Day 6 have equal mean departure delays (H0: μ1 − μ2 = 0). The alternate hypothesis is that Day 2 and Day 6 have different mean departure delays (HA: μ1 − μ2 ≠ 0). This test does not ask whether one mean is greater or less than the other, only whether the two differ. So, if this test were graphed, it would be a two-tailed test.

Results

Now, with the data sampled and the study design defined, the t-test can be performed. The sampled delay times are first sorted in ascending order for each day so they can be read more clearly. This sorted set can then be graphed.

Day 2    Day 6
0        0
0        0
0        0
0        0
5        0
5        0
5        2
12       7
25       8
33       11

The means are also marked in the graph below: the Day 2 mean is 8.5 minutes and the Day 6 mean is 2.8 minutes.
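The reorganized data and the two means can be checked with a few lines of Python. This is a minimal sketch that uses the sampled delay times from the data table; the variable names are chosen here for illustration.

```python
from statistics import mean

# Departure delays in minutes, in the order they were sampled.
day2 = [0, 5, 0, 0, 5, 12, 33, 0, 25, 5]   # Monday
day6 = [0, 2, 0, 0, 0, 7, 0, 0, 11, 8]     # Friday

# Sort each sample in ascending order so it is easier to read.
day2_sorted = sorted(day2)
day6_sorted = sorted(day6)

day2_mean = mean(day2)
day6_mean = mean(day6)

print("Day 2 sorted:", day2_sorted, "mean:", day2_mean)
print("Day 6 sorted:", day6_sorted, "mean:", day6_mean)
print("Sample difference:", round(day2_mean - day6_mean, 4))
```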

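These results can now be placed into the two-sample t-test equation, which divides the difference between the sample means, less the hypothesized difference, by the standard error. Here “X1” is the mean of Day 2 (8.5) and “X2” is the mean of Day 6 (2.8). “n” is the number of sampled flights, 20 in total (10 per day), and “SX1X2” is the standard error. The standard error is derived from the standard deviations of the two days and equals 3.9017. In the StatCrunch run, the zero in both hypotheses was replaced with the standard error, so the test as run has the null hypothesis H0: μ1 − μ2 = 3.9017 and the alternate hypothesis HA: μ1 − μ2 ≠ 3.9017. Because equal means correspond to a hypothesized difference of 0, the reported T-Stat and P-value describe a test against a difference of 3.9017 minutes rather than a test of equal means. Utilizing the StatCrunch program, the results of the t-test are as follows: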

Hypothesis test results:
μ1: Mean of Day 2
μ2: Mean of Day 6
μ1 − μ2: Difference between two means
H0: μ1 − μ2 = 3.9017
HA: μ1 − μ2 ≠ 3.9017

Difference    Sample Diff.    Std. Err.    DF    T-Stat        P-value
μ1 − μ2       5.7             3.901709     18    0.46090059    0.6504
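The standard error, degrees of freedom, and T-Stat in the table above can be reproduced from the sampled data. The sketch below assumes a pooled-variance two-sample t-test, which is consistent with the 18 degrees of freedom reported. With 10 flights in each sample, the unpooled formula gives the same standard error. The hypothesized difference of 3.9017 is the value entered in the StatCrunch run, and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Departure delays in minutes from the sampled flights.
day2 = [0, 5, 0, 0, 5, 12, 33, 0, 25, 5]   # Monday
day6 = [0, 2, 0, 0, 0, 7, 0, 0, 11, 8]     # Friday

n1, n2 = len(day2), len(day6)
df = n1 + n2 - 2  # degrees of freedom for the pooled test

# Pooled variance: the two sample variances weighted by their
# degrees of freedom (assumption: equal population variances).
pooled_var = ((n1 - 1) * variance(day2) + (n2 - 1) * variance(day6)) / df

# Standard error of the difference between the two sample means.
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))

sample_diff = mean(day2) - mean(day6)
hypothesized_diff = 3.9017  # value used in the StatCrunch run
t_stat = (sample_diff - hypothesized_diff) / std_err

print(f"DF = {df}")
print(f"Std. Err. = {std_err:.6f}")
print(f"T-Stat = {t_stat:.6f}")
```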

All of these calculations use a 95% confidence level, which corresponds to a significance level of 0.05. This means that if the null hypothesis were true and the calculations were performed again many times, each with a new sample of 10 flights from Day 2 and 10 from Day 6, about 5% of those samples would lead to rejecting the null hypothesis by chance alone.
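The P-value and the decision at the 0.05 significance level can be approximated with the Python standard library alone. This is a sketch, not the StatCrunch method: it integrates the Student's t density numerically with Simpson's rule, and the function names are hypothetical. The hypothesized difference is left as a parameter so that both the value used in the StatCrunch run (3.9017) and the equal-means value from the Study Design section (0) can be tried.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via composite Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # area between -|t| and +|t|
    return 1 - central_area


# Values taken from the StatCrunch results table.
SAMPLE_DIFF = 5.7
STD_ERR = 3.901709
DF = 18
ALPHA = 0.05


def run_test(hypothesized_diff):
    """Return (t statistic, p-value, whether H0 is rejected at ALPHA)."""
    t_stat = (SAMPLE_DIFF - hypothesized_diff) / STD_ERR
    p_value = two_sided_p(t_stat, DF)
    return t_stat, p_value, p_value < ALPHA


for label, d0 in [("as run in StatCrunch", 3.9017), ("equal means", 0.0)]:
    t_stat, p_value, reject = run_test(d0)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{label}: t = {t_stat:.4f}, p = {p_value:.4f}, {decision}")
```

In both cases the P-value is compared with 0.05: a P-value below 0.05 rejects the null hypothesis, and a P-value above it does not.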

Findings

Taking the results from the statistics, the research question can be answered. The degrees of freedom are 18, the confidence level is 95%, and the significance level is 0.05. The T-Stat was 0.4609 and the P-value was 0.6504. Because the P-value is far greater than 0.05, the null hypothesis, tested as H0: μ1 − μ2 = 3.9017, cannot be rejected. A test of equal mean departure delays uses a hypothesized difference of 0 instead. With the same sample difference and standard error, that test gives t = 5.7 / 3.9017 ≈ 1.46, which is below the two-tailed critical value of about 2.10 for 18 degrees of freedom, so that null hypothesis cannot be rejected either. The alternate hypothesis “that Day 2 and Day 6 have different mean departure delays” is therefore not supported. With these results, at the 0.05 significance level, this sample does not provide sufficient evidence that the departure delays of Southwest Airlines differ between Monday and Friday.

Bibliography

ERAU MATH 211/222 Data Sets. “StatCrunch.” StatCrunch. ERAU MATH 211/222 Data Sets, 08 May 2012. Web. 09 Mar. 2016.