Introduction
This article breaks down a Python script designed to process geographical data. The script reads port names and country codes from a CSV file, uses the OpenStreetMap (OSM) Nominatim API to geocode these locations, and writes the results to a new CSV file. This type of script is particularly useful for enriching location data with standardized names and additional geographical information.
Script Overview
The script performs the following main tasks:
- Reads input from a CSV file containing port names and country codes
- For each entry, constructs a query to the OSM Nominatim API
- Processes the API response to extract relevant information
- Writes the processed data to a new CSV file
Let's dive into each part of the script in detail.
Importing Required Libraries
import csv
import requests
The script uses two main libraries:
- csv: For reading from and writing to CSV files
- requests: For making HTTP requests to the OSM API
File Paths and API Configuration
input_file = '/Users/Apple/Downloads/NameCountryCode.csv'
output_file = '/Users/Apple/Downloads/output.csv'
api_url1 = 'https://nominatim.openstreetmap.org/search.php?q='
api_url2 = '&polygon_geojson=1&format=jsonv2'
Here, we define the paths for the input and output CSV files, as well as the base URL for the OSM Nominatim API. The API URL is split into two parts to allow for easy insertion of the query parameters.
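String concatenation works, but it leaves percent-encoding to the caller. As a minimal sketch of an alternative, assuming the same endpoint and parameters as above, the standard library's urlencode can build the whole query string. The helper name build_search_url and the constant SEARCH_ENDPOINT are hypothetical and not part of the original script.

```python
from urllib.parse import quote, urlencode

# Same endpoint the script uses, without the hard-coded query string.
SEARCH_ENDPOINT = 'https://nominatim.openstreetmap.org/search.php'


def build_search_url(name, countrycode):
    """Return a search URL for a port name and country code (hypothetical helper)."""
    params = {
        'q': f'{name} {countrycode}',  # same free-text query the script sends
        'polygon_geojson': 1,
        'format': 'jsonv2',
    }
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses '+').
    return SEARCH_ENDPOINT + '?' + urlencode(params, quote_via=quote)


print(build_search_url('Las Palmas', 'ES'))
```

With this approach, reserved characters in a port name (an ampersand, for example) are encoded along with the spaces.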
Setting Up HTTP Headers
headers = {
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate, br, zstd',
    'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
    'Referer': 'https://nominatim.openstreetmap.org/ui/search.html',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Not)A;Brand";v="99", "Google Chrome";v="127", "Chromium";v="127"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"macOS"',
}
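These headers mimic a web browser when making requests to the API, presumably to avoid being blocked by the server. Note that the public Nominatim usage policy asks clients to send a User-Agent or Referer that identifies the application, so for anything beyond a quick experiment a descriptive User-Agent is the safer choice.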
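As an alternative to copying browser headers, a smaller set that identifies the calling application may be enough. The sketch below assumes that approach; the helper name build_headers, the application name and the contact address are all placeholders, not values from the original script.

```python
def build_headers(app_name, contact):
    """Return request headers that identify the calling application.

    app_name and contact are placeholders to be replaced with real values.
    """
    return {
        'Accept': 'application/json',
        'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
        'User-Agent': f'{app_name} ({contact})',
    }


# Hypothetical application name and contact address.
headers = build_headers('port-geocoder/0.1', 'you@example.com')
print(headers['User-Agent'])
```

The resulting dictionary can be passed to requests.get through its headers argument, exactly like the dictionary in the script.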
Processing the CSV File
with open(input_file, 'r') as infile, open(output_file, 'w', newline='') as outfile:
    reader = csv.DictReader(infile)
    fieldnames = ['PortName', 'OSRMName', 'OSRMDisplayName', 'CountryCode']
    writer = csv.DictWriter(outfile, fieldnames=fieldnames)
    writer.writeheader()
This section opens the input and output files, sets up CSV readers and writers, and writes the header row to the output file.
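The processing loop refers to the first column as '\ufeffName', which suggests the input file starts with a UTF-8 byte-order mark. A minimal sketch of one way to avoid that, assuming the file is UTF-8 encoded, is to open it with the 'utf-8-sig' codec so the mark is stripped before csv sees the header. The sample rows below are made up for illustration.

```python
import csv
import os
import tempfile

# Build a small sample file that starts with a UTF-8 byte-order mark,
# as spreadsheet exports often do. The rows are made-up sample data.
sample = '\ufeffName,Country Code\nBarcelona,ES\nMarseille,FR\n'
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 encoding='utf-8', newline='') as tmp:
    tmp.write(sample)
    sample_path = tmp.name

# 'utf-8-sig' strips the byte-order mark, so the first header is plain 'Name'.
with open(sample_path, 'r', encoding='utf-8-sig', newline='') as infile:
    rows = list(csv.DictReader(infile))

os.remove(sample_path)
print(rows[0]['Name'])
```

If the input were opened this way, the loop could read row['Name'] instead of row['\ufeffName'].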
Main Processing Loop
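    for row in reader:
        print(row)
        # '\ufeffName' is the 'Name' header with a leading UTF-8 byte-order mark.
        name = row['\ufeffName'].replace(" ", "%20")
        countrycode = row['Country Code']
        # Only process Spanish ports; skip every other country.
        if countrycode != 'ES':
            continue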
This loop iterates through each row in the input CSV. It reads the port name, replacing spaces with %20 for URL encoding, and the country code. The replacement handles spaces only, so other reserved characters such as & would pass through unencoded. The loop then checks the country code: rows whose code is not 'ES' (Spain) are skipped, so only Spanish ports are geocoded.
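The country check can also be pulled out of the loop body. As a rough sketch, assuming rows are dictionaries with a 'Country Code' key as in the input file, a small generator keeps the filter in one place and makes the target country a parameter. The name rows_for_country and the sample rows are hypothetical.

```python
def rows_for_country(rows, countrycode):
    """Yield only the rows whose 'Country Code' matches (hypothetical helper)."""
    for row in rows:
        if row['Country Code'] == countrycode:
            yield row


# Made-up sample rows shaped like the input CSV.
sample_rows = [
    {'Name': 'Barcelona', 'Country Code': 'ES'},
    {'Name': 'Marseille', 'Country Code': 'FR'},
    {'Name': 'Las Palmas', 'Country Code': 'ES'},
]
spanish = [row['Name'] for row in rows_for_country(sample_rows, 'ES')]
print(spanish)
```

In the script, the same helper could wrap the csv.DictReader, since it accepts any iterable of row dictionaries.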
Making the API Request
        # Query is "<port name> <country code>", with spaces encoded as %20.
        api_url = api_url1 + name + '%20' + countrycode + api_url2
        response = requests.get(api_url, headers=headers)
Here, the script constructs the full API URL and sends a GET request to the OSM Nominatim API.
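The loop sends requests back to back, while the public Nominatim usage policy sets a ceiling of one request per second. A minimal sketch of a throttle, assuming a single-threaded script, is shown below; the Throttle class is hypothetical and not part of the original script.

```python
import time


class Throttle:
    """Block until at least min_interval seconds separate consecutive calls."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last_call = None  # no call has happened yet

    def wait(self):
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


throttle = Throttle(min_interval=1.0)
# Inside the loop, call throttle.wait() immediately before requests.get(...).
```

The first call to wait() returns immediately; later calls sleep only for whatever part of the interval has not already elapsed.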
Processing the API Response
        if response.status_code == 200:
            data = response.json()
            # Keep only the results that Nominatim classifies as a city.
            cities = [item for item in data if item.get('addresstype') == 'city']
            if cities:
                # Use the first city match.
                display_name = cities[0].get('display_name', '')
                osrmname = cities[0].get('name', '')
                writer.writerow({
                    'PortName': name.replace("%20", " "),
                    'OSRMName': osrmname,
                    'OSRMDisplayName': display_name,
                    'CountryCode': countrycode
                })
            else:
                print(f"No city result for {name}")
        else:
            print(f"API call failed for {name}: {response.status_code}")
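If the API call is successful, the script processes the JSON response. It filters for results with an 'addresstype' of 'city' and, if any are found, extracts the display name and OSM name of the first match. This information is then written to the output CSV file. When no city is found or the request fails, the script only prints a message, so that port does not appear in the output.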
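The selection step is easier to test when it is separated from the network call. The sketch below assumes the response is a list of dictionaries carrying 'addresstype', 'name' and 'display_name' keys, as the script expects. The function name first_city is hypothetical, and the sample list is made-up data rather than real API output.

```python
def first_city(results):
    """Return (name, display_name) of the first result typed as a city, or None."""
    for item in results:
        if item.get('addresstype') == 'city':
            return item.get('name', ''), item.get('display_name', '')
    return None


# Made-up sample shaped like the list the script iterates over.
sample = [
    {'addresstype': 'county', 'name': 'Valencia', 'display_name': 'Valencia, Spain'},
    {'addresstype': 'city', 'name': 'València', 'display_name': 'València, Spain'},
]
print(first_city(sample))
print(first_city([]))
```

Returning None for the no-match case lets the caller decide whether to log the miss, write an empty row, or retry with a different filter.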
Conclusion
This script demonstrates a practical application of using web APIs to enrich geographical data. It showcases how to handle CSV files, make HTTP requests, process JSON responses, and write the results back to a CSV file. While effective, there are areas for potential improvement, such as error handling, rate limiting to respect API usage policies, and optimizing the filtering of results.
For developers working with geographical data, this script provides a solid foundation that can be adapted and expanded for various geocoding needs.