Airam Rojas Zerpa

Download and Explore the Air Passengers Dataset with Python and R


How to Download and Analyze the Air Passengers Dataset




If you are interested in learning how to perform time series analysis in Python, a good way to start is by working with a real-world dataset. In this article, we will show you how to download and analyze the air passengers dataset, which is a widely used dataset in the field of time series analysis. The dataset contains monthly airline passenger numbers from 1949 to 1960 and has been used in various studies to develop forecasting models and analyze the trends and seasonality of the data.


Introduction




What is the air passengers dataset and why use it?




The air passengers dataset consists of the number of passengers (in thousands) who traveled by air between 1949 and 1960. The dataset has 144 observations, with one observation for each month in the period. The data show an upward trend with a pronounced yearly seasonal pattern, and the seasonal swings grow larger as passenger numbers rise.
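As a quick sanity check on the figure of 144 observations, the monthly periods between January 1949 and December 1960 can be counted directly. This is a minimal sketch using pandas; the variable names are illustrative.

```python
import pandas as pd

# One period per month from January 1949 through December 1960 (inclusive).
months = pd.period_range(start="1949-01", end="1960-12", freq="M")

n_years = 1960 - 1949 + 1   # 12 calendar years
print(len(months))          # 144
print(n_years * 12)         # 144
```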







The air passengers dataset is a good example of a time series, which is a sequence of observations recorded at regular intervals over time. Time series analysis is a branch of statistics that deals with analyzing and modeling time-dependent data. Time series analysis can be useful for understanding the behavior and dynamics of a system, identifying patterns and trends, forecasting future values, and testing hypotheses.


How to download the dataset from different sources




There are several online repositories that offer free and open datasets for data science projects. Some of them are:


  • Kaggle: A platform for data science competitions and learning resources. Kaggle hosts many datasets on various topics, including the air passengers dataset.



  • IATA: The International Air Transport Association provides passenger traffic and sales data for airlines, airports, and other organizations. IATA's data products offer granular and reliable passenger traffic numbers across all geographic regions.



  • R Datasets: A collection of datasets that are included with R packages or available online. R Datasets includes the air passengers dataset as part of the datasets package.



  • Towards Dev: A blog that covers topics related to data science, machine learning, and artificial intelligence. Towards Dev also provides tutorials and use cases for various datasets, including the air passengers dataset.



To download the dataset from any of these sources, you can follow these steps:


  • Navigate to the website where the dataset is stored.



  • Find the folder or category where the dataset is stored.



  • Select the dataset you need.



  • Click on the download icon or button.



  • Select the appropriate format (usually CSV) to start an immediate download for the full dataset.
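If the repository exposes a direct link to the CSV file, the manual steps above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name download_csv, the example.com URL, and the data/ folder are hypothetical placeholders, not links taken from any of the sources listed.

```python
import urllib.request
from pathlib import Path


def download_csv(url, dest):
    """Fetch the file at `url` and save it to `dest`, returning the saved path."""
    dest_path = Path(dest)
    # Create the target folder if it does not exist yet.
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response:
        dest_path.write_bytes(response.read())
    return dest_path


# Hypothetical usage: replace the placeholder URL with the CSV link copied
# from the repository page you chose.
# download_csv("https://example.com/airline-passengers.csv",
#              "data/airline-passengers.csv")
```

Some repositories require a login or an API token before a download is allowed, in which case the manual route through the website may be simpler.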



How to import and visualize the dataset in Python




Once you have downloaded the dataset, you can import it into Python using pandas, which is a popular library for data manipulation and analysis. Pandas provides various functions to read different types of files, such as CSV, Excel, JSON, etc. For example, to read a CSV file containing the air passengers dataset, you can use the following code:


import pandas as pd

# Read the downloaded CSV file into a dataframe
dataset = pd.read_csv("airline-passengers.csv")


This will create a pandas dataframe called dataset, which is a tabular data structure that can store and manipulate data. You can use the head() method to view the first few rows of the dataset:


dataset.head()


This will display something like this:


Month      Passengers
1949-01    112
1949-02    118
1949-03    132
1949-04    129
1949-05    121


You can see that the dataset has two columns: Month and Passengers. The Month column contains the date in the format YYYY-MM, and the Passengers column contains the number of passengers (in thousands) for that month. You can use the info() method to get more information about the dataset, such as the number of rows, columns, data types, and missing values:


dataset.info()


This will display something like this:




<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Month       144 non-null    object
 1   Passengers  144 non-null    int64
dtypes: int64(1), object(1)
memory usage: 2.4+ KB


You can see that the dataset has no missing values and that the Month column is of type object, which means it is stored as a string. However, since we want to perform time series analysis on the data, we need to convert the Month column into a datetime object, which is a special data type that can handle dates and times. To do this, we can use the pd.to_datetime() function and assign the result back to the Month column:


dataset['Month'] = pd.to_datetime(dataset['Month'])


Now, if we call info() again, we can see that the Month column is of type datetime64[ns], which means it holds datetime values:


dataset.info()


This will display something like this:


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 144 entries, 0 to 143
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   Month       144 non-null    datetime64[ns]
 1   Passengers  144 non-null    int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 2.4 KB
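Because the Month strings follow the YYYY-MM pattern described earlier, the format can also be stated explicitly, which makes the conversion stricter about malformed values. This is a minimal sketch that uses only the five rows from the first head() output as sample data; the variable name sample is illustrative.

```python
import pandas as pd

# Sample rows copied from the head() output; this is not the full dataset.
sample = pd.DataFrame({
    "Month": ["1949-01", "1949-02", "1949-03", "1949-04", "1949-05"],
    "Passengers": [112, 118, 132, 129, 121],
})

# An explicit format documents the expected layout and raises an error
# if a value does not match it.
sample["Month"] = pd.to_datetime(sample["Month"], format="%Y-%m")

# Confirm the column now holds datetimes and nothing failed to parse.
print(pd.api.types.is_datetime64_any_dtype(sample["Month"]))  # True
print(int(sample["Month"].isna().sum()))                      # 0
```

Each month is stored as the first day of that month, which is why later output shows dates such as 1949-01-01.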


Next, we want to set the Month column as the index of the dataframe, which means it will be used as the row labels instead of the default numeric index. This will make it easier to access and manipulate the data based on time. To do this, we can use the set_index() method and pass the name of the column we want to use as the index:


dataset = dataset.set_index('Month')
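The conversion and the indexing step can also be done while reading the file, and a datetime index then allows rows to be selected by date. The sketch below is one way to do this, not the only one; it reads from an in-memory string holding the five sample rows shown earlier rather than from the real file, and the names csv_text, sample, and first_quarter are illustrative.

```python
import io
import pandas as pd

# Stand-in for the CSV file: the five rows shown earlier, not the full dataset.
csv_text = """Month,Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
1949-05,121
"""

# parse_dates converts the column while reading; index_col makes it the index.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Month"], index_col="Month")

# Partial-string slicing selects a date range without writing a filter.
first_quarter = sample.loc["1949-01":"1949-03"]
print(first_quarter["Passengers"].tolist())  # [112, 118, 132]
```

With the real file, the same read_csv arguments would be passed together with the file path instead of the in-memory string.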


Now, if we call head() again, we can see that the Month column is no longer a separate column, but rather the index of the dataframe:


dataset.head()


This will display something like this:


            Passengers
Month
1949-01-01  112
1949-02-01  118
1949-03-01  132
1949-04-01  129
1949-05-01  121


Finally, we want to visualize the data using matplotlib, which is a library for creating plots and graphs in Python. Matplotlib provides various functions to create different types of charts, such as line plots, bar plots, scatter plots, etc. For example, to create a line plot of the Passengers column over time, we can use the following code:


import matplotlib.pyplot as plt

# Line plot of passenger numbers against the datetime index
plt.plot(dataset['Passengers'])
plt.xlabel('Month')
plt.ylabel('Passengers')
plt.title('Airline Passenger Numbers from 1949 to 1960')
plt.show()


This will display a plot like this:



You can see that the plot shows a clear upward trend and a seasonal pattern in the data. The number of passengers increases over time, with peaks in the summer months and dips in the winter months. The plot also shows irregular month-to-month fluctuations that neither the trend nor the seasonal cycle explains, which is the random noise in the series.
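The trend and the seasonal pattern can also be checked numerically rather than only by eye. The sketch below is one possible approach and not part of the original walkthrough: a centered 12-month rolling mean smooths out the yearly cycle to expose the trend, and averaging the detrended values by calendar month shows which months tend to be high or low. To stay self-contained it runs on a small synthetic series, which is an assumption made purely for illustration; with the real data you would pass dataset['Passengers'] to the hypothetical summarize function instead.

```python
import numpy as np
import pandas as pd


def summarize(series):
    """Return (12-month rolling mean, average detrended value per calendar month)."""
    # A full year in each window averages the seasonal cycle away.
    trend = series.rolling(window=12, center=True).mean()
    # Remove the trend so later years do not dominate the monthly averages.
    detrended = series - trend
    seasonal = detrended.groupby(detrended.index.month).mean()
    return trend, seasonal


# Synthetic monthly data: a linear rise plus a yearly cycle peaking in July.
# These numbers are made up for illustration and are NOT the real dataset.
index = pd.date_range("1949-01-01", periods=48, freq="MS")
t = np.arange(48)
cycle = 10 * np.cos(2 * np.pi * (index.month.to_numpy() - 7) / 12)
synthetic = pd.Series(100 + 2 * t + cycle, index=index)

trend, seasonal = summarize(synthetic)
print(trend.dropna().is_monotonic_increasing)  # True: the trend rises steadily
print(seasonal.idxmax())                       # 7: July is the peak month
```

The rolling mean leaves missing values at both ends of the series, where a full 12-month window is not available, so the first and last few months have no trend estimate.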


Time Series Analysis of the Air Passengers Dataset




Decomposing the time series into trend, seasonality, and residuals




One of the first steps in time series analysis is to decompose the time series into its co

