Those who travel often have an airport they love to hate.
These airports drag out an already long and uncomfortable day. In reviewing the
flight data for 2013, we found that the airport with the greatest amount of
arrival delay was Chicago O’Hare. When you must travel through an airport with
frequent delays, what can you do to minimize your chances of being delayed? As a
team, we took a look at the data to find some answers.
To get the data on flights, we used two sources. The first
was http://www.rita.dot.gov/ and the
second was http://openflights.org/.
RITA, the Research and Innovative Technology Administration, is part of the
United States Government, and its site provides large data sets pertaining to
transportation. RITA provided the core data about flights, delays, airports,
and much more that we needed for analysis. Unfortunately, some of the data is
cryptic. For example, airports are listed only by a three-character airport
code, which gives neither the full airport name nor the exact location of the
airport.
This is where the second data set comes in. OpenFlights (openflights.org)
provides a data set with the location and identification data for the
airports. By combining the two data sets on the three-character airport
code, we get a more complete understanding of the data.
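The combination step can be sketched in a few lines of Python. This is only an illustration of joining on the three-character code, not the workflow we actually ran (that used Excel and MySQL). The field names (`dest`, `iata`, `arr_delay`), the helper `add_destination_details`, and the sample rows are all assumptions, and the coordinates are rounded approximations.

```python
# Hypothetical sample rows: the real RITA and OpenFlights files have many more
# columns, and the field names used here are assumptions.
flights = [
    {"origin": "ORD", "dest": "LGA", "arr_delay": 42},
    {"origin": "ORD", "dest": "LAX", "arr_delay": 5},
    {"origin": "ORD", "dest": "ZZZ", "arr_delay": 10},  # made-up code, not in the lookup
]
airports = [
    {"iata": "ORD", "name": "Chicago O'Hare International Airport", "lat": 41.98, "lon": -87.90},
    {"iata": "LGA", "name": "La Guardia", "lat": 40.78, "lon": -73.87},
    {"iata": "LAX", "name": "Los Angeles International Airport", "lat": 33.94, "lon": -118.41},
]


def add_destination_details(flights, airports):
    """Attach the destination airport's name and coordinates to each flight."""
    by_code = {airport["iata"]: airport for airport in airports}
    enriched = []
    for flight in flights:
        airport = by_code.get(flight["dest"])  # None when the code is unknown
        row = dict(flight)  # copy so the input rows are left unchanged
        row["dest_name"] = airport["name"] if airport else None
        row["dest_lat"] = airport["lat"] if airport else None
        row["dest_lon"] = airport["lon"] if airport else None
        enriched.append(row)
    return enriched


enriched = add_destination_details(flights, airports)
```

Flights whose code has no match keep their row with empty details, so unmatched codes can be spotted and reviewed instead of silently disappearing.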
Once we had the data, we needed a way to make sense of it.
First we cleaned the data by removing information we did not need, such as
on-time flights. Excel's pivot tables and its integration with MySQL helped us
quickly clean the data and draw meaning from it. Because the data is historical
and structured, migrating it to MySQL for fast query processing was
straightforward. MySQL also integrated well with our visualization tool of
choice, Tableau. The ease of working with measures and dimensions and of
building dashboards is what led us to Tableau. Using Tableau Public, we were
also able to publish our dashboards online and obtain JavaScript snippets to
embed them.
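As a rough illustration of the cleaning step, the sketch below drops on-time flights from a list of records. It is not the Excel and MySQL process we used. The field names (`arr_delay`, `cancelled`), the `min_delay` threshold, and the choice to keep canceled flights for later analysis are assumptions made for the example.

```python
def drop_on_time(flights, min_delay=1):
    """Keep canceled flights and flights arriving at least min_delay minutes late.

    min_delay is an assumed cutoff; pick whatever definition of "on time"
    fits the analysis.
    """
    kept = []
    for flight in flights:
        delay = flight.get("arr_delay")
        if flight.get("cancelled") or (delay is not None and delay >= min_delay):
            kept.append(flight)
    return kept


# Hypothetical sample rows for illustration only.
sample = [
    {"dest": "LGA", "arr_delay": 42, "cancelled": False},
    {"dest": "LAX", "arr_delay": -3, "cancelled": False},  # early arrival
    {"dest": "LGA", "arr_delay": None, "cancelled": True},  # no arrival recorded
    {"dest": "DEN", "arr_delay": 0, "cancelled": False},  # exactly on time
]
delayed_or_cancelled = drop_on_time(sample)
```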
With the data prepared we were ready to answer some questions. The
first visualization provides insights into flight delays for routes originating
from Chicago O'Hare International Airport. The darkness of the routes corresponds
to the length of the arrival delay. The pie charts illustrate how each delay
type contributes to the total arrival delay.
First we wanted to know which destination airports have the greatest arrival
delay for flights from Chicago. The next visualization shows the top 10
destination airports from Chicago by delay, where dark orange marks the most
delayed destination, ‘La Guardia’. This could be due to the large number of
flights from Chicago to ‘La Guardia’.
As a passenger, you do not have many options for
different layovers when traveling to your destination. If you must travel through an airport with
heavy delays, what can you do to reduce your chances of being delayed? This
question drove our next several questions, starting with: when is the best time
of day and day of the week to fly to avoid delays? We plotted delays by time of
day and day of the week, creating a map of when to fly. Early hours (5 a.m. to
8 a.m.) are the best time to fly, while flights departing from 3 p.m. to 6 p.m.
have the most arrival delays. Wednesday, Thursday, and Sunday also have the
highest delays in the afternoon. Days are numbered 1 through 7, with Sunday as
day 7.
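The grid behind such a map can be approximated by averaging arrival delay for each (day of week, departure hour) pair. The sketch below is illustrative rather than the Tableau calculation itself. The field names, the timestamp format, and the use of the mean are assumptions. Python's `isoweekday()` numbers Monday as 1 and Sunday as 7, which matches the numbering above.

```python
from collections import defaultdict
from datetime import datetime


def mean_delay_by_day_and_hour(flights):
    """Return {(day_of_week, hour): mean arrival delay in minutes}."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of delays, flight count]
    for flight in flights:
        departure = datetime.strptime(flight["scheduled_departure"], "%Y-%m-%d %H:%M")
        key = (departure.isoweekday(), departure.hour)  # Monday=1 ... Sunday=7
        totals[key][0] += flight["arr_delay"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical sample rows; 2013-06-05 was a Wednesday and 2013-06-01 a Saturday.
sample = [
    {"scheduled_departure": "2013-06-05 16:30", "arr_delay": 60},
    {"scheduled_departure": "2013-06-05 16:05", "arr_delay": 30},
    {"scheduled_departure": "2013-06-01 06:15", "arr_delay": 5},
]
grid = mean_delay_by_day_and_hour(sample)
```

Each key in `grid` corresponds to one cell of the map, so the two Wednesday 4 p.m. departures share a single cell.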
Code | Carrier
9E | Endeavor Air
AA | American
DL | Delta
EV | ExpressJet
MQ | American Eagle
OO | Skywest
UA | United
US | USAir
VX | Virgin America
YV | Mesa Airlines
Carrier issues were the next point of investigation.
Which carrier has the fewest or the most delays? Delta (DL) has the most
cancellations, mostly because it also runs Endeavor Air (9E).
We then did the same type of analysis on canceled
flights.
We found that La Guardia has the most cancellations
while Los Angeles has the fewest.
We also found that Saturday (day 6) is the best day to
travel to avoid cancellations.
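The cancellation comparisons amount to counting canceled flights grouped by one attribute at a time. The sketch below shows one way to do this. The field names (`carrier`, `dest`, `day`, `cancelled`), the helper `count_cancellations`, and the sample rows are assumptions and do not reflect the real data.

```python
from collections import Counter


def count_cancellations(flights, field):
    """Count canceled flights grouped by the given field."""
    return Counter(flight[field] for flight in flights if flight["cancelled"])


# Hypothetical sample rows for illustration only.
sample = [
    {"carrier": "DL", "dest": "LGA", "day": 3, "cancelled": True},
    {"carrier": "9E", "dest": "LGA", "day": 4, "cancelled": True},
    {"carrier": "UA", "dest": "LAX", "day": 6, "cancelled": False},
    {"carrier": "DL", "dest": "LGA", "day": 6, "cancelled": False},
    {"carrier": "AA", "dest": "LAX", "day": 7, "cancelled": True},
]
by_dest = count_cancellations(sample, "dest")
by_day = count_cancellations(sample, "day")
```

Raw counts favor busy routes and carriers, so dividing each count by the total number of flights in the same group would give a fairer comparison.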
In conclusion, we found that it is best to travel
early in the morning on a Saturday. June is the worst month to fly, Delta has
the most cancellations, and La Guardia has the most delays and cancellations
among destinations from Chicago.