IATA Carrier Codes List

Feb 3, 2021

This post contains Python code to scrape IATA airline codes from the IATA website. It uses the requests, BeautifulSoup, and csv libraries to fetch, parse, and store the data.

Setup

First, let's import the necessary libraries:

import os
import requests
from time import time
from multiprocessing.pool import ThreadPool
from bs4 import BeautifulSoup
import csv

Scraping IATA Codes

The following code block scrapes IATA codes from multiple pages of the IATA website and saves them to a CSV file:

for ind in range(1, 218):
    start = time()
    print(ind)
    r = requests.get(f"https://www.iata.org/AirlineCodeSearchBlock/Search?currentBlock=314383&currentPage=12572&airline.page={ind}", stream=True)
    soup = BeautifulSoup(r.text, "html.parser")
    table = soup.find("table")
    output_rows = []
    # Skip the header row, then collect the cell text of each data row.
    for table_row in table.findAll('tr')[1:]:
        columns = table_row.findAll('td')
        output_rows.append([column.text for column in columns])
    # Append this page's rows once, after the whole table is parsed;
    # writing inside the row loop would re-write accumulated rows and
    # produce duplicates.
    with open('output1.csv', 'a+', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(output_rows)

    print(f"Time to download: {time() - start}")

This loop iterates through 217 pages of IATA airline codes, scrapes the table rows from each page, and appends them to a CSV file named 'output1.csv'.
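Since ThreadPool is already imported at the top of the script, the sequential loop above could be parallelized. The sketch below shows the pattern; the fetch function here is a stub that only builds the URL (in the real script it would call requests.get and parse the response), so treat it as an outline rather than a drop-in replacement:

```python
from multiprocessing.pool import ThreadPool

BASE = ("https://www.iata.org/AirlineCodeSearchBlock/Search"
        "?currentBlock=314383&currentPage=12572&airline.page={}")

def fetch(ind):
    # Stub: in the real script this would be
    # requests.get(BASE.format(ind)).text followed by the table parsing.
    return BASE.format(ind)

# Map page indices across a small pool of worker threads.
with ThreadPool(8) as pool:
    results = pool.map(fetch, range(1, 218))
```

Because pool.map preserves input order, the results come back in page order even though the downloads run concurrently. Keep the pool small to avoid hammering the site.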

Alternative Scraping Method

An alternative approach extracts the raw text from each 'tbody' tag instead of parsing individual cells:

for ind in range(1, 218):
    start = time()
    print(ind)
    r = requests.get(f"https://www.iata.org/AirlineCodeSearchBlock/Search?currentBlock=314383&currentPage=12572&airline.page={ind}", stream=True)
    soup = BeautifulSoup(r.text, "html.parser")
    for tag in soup.findAll('tbody'):
        a = tag.get_text()
        print(a)
        with open('IATACode3.csv', 'a+') as f:
            f.writelines(a)

This method dumps the raw text into a file named 'IATACode3.csv'. Note that the output is unstructured text rather than well-formed CSV, so it will usually need post-processing.
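Whichever method is used, the scraped rows tend to carry stray whitespace, and re-running the script appends duplicate rows to the output file. A small cleanup helper (hypothetical, not part of the original script) can normalize and de-duplicate the rows before final use:

```python
import csv

def clean_rows(rows):
    """Strip whitespace from each cell and drop blank or duplicate rows,
    preserving first-seen order."""
    seen = set()
    out = []
    for row in rows:
        cleaned = tuple(cell.strip() for cell in row)
        if any(cleaned) and cleaned not in seen:
            seen.add(cleaned)
            out.append(list(cleaned))
    return out

# Example: messy scraped rows with padding, a duplicate, and a blank row.
sample = [[" AA ", " American Airlines "],
          ["AA", "American Airlines"],
          ["", ""]]
cleaned = clean_rows(sample)
```

To apply it to the scraped file, read 'output1.csv' with csv.reader, pass the rows through clean_rows, and write them back out with csv.writer.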

Notes

  • The script uses the requests library to fetch web pages and BeautifulSoup for HTML parsing.
  • Data is saved incrementally to CSV files to prevent data loss in case of interruptions.
  • The script includes basic timing to measure the download time for each page.
  • Make sure to respect the website's terms of service and robots.txt file when using web scraping scripts.
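The robots.txt check mentioned above can be automated with the standard library's urllib.robotparser. The rules below are illustrative placeholders, not IATA's actual robots.txt; in practice you would call set_url and read() to fetch the live file:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# In practice: rp.set_url("https://www.iata.org/robots.txt"); rp.read()
# Here we parse example rules directly so the sketch runs offline.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Check whether a given URL may be fetched under these rules.
allowed = rp.can_fetch("*", "https://www.iata.org/AirlineCodeSearchBlock/Search")
```

Running this check once at startup, and skipping any disallowed paths, keeps the scraper within the site's stated crawling policy.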

Remember to install the required libraries (requests, beautifulsoup4) before running this script:

pip install requests beautifulsoup4