Data Source: Cars API

This API provides a comprehensive dataset on various car models from leading automobile manufacturers, offering detailed insights into the characteristics of vehicles currently being produced. I chose it as a data source because of its documentation, its ease of use, and the metrics each API call returns. It seemed like a good starting point for collecting record data.

We start by importing the required Python libraries:

import requests
import json

For one car model:

The API accepts a car model name and returns detailed information about the vehicles that match it. I first tested it with a single car model to evaluate the structure of the output.

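import os

model = 'camry'

# Read the API key from an environment variable instead of hard-coding it
# (API_NINJAS_KEY is an assumed variable name; set it to your own key)
api_key = os.environ['API_NINJAS_KEY']

api_url = 'https://api.api-ninjas.com/v1/cars'
# Passing the model via params lets requests URL-encode the value
response = requests.get(api_url, params={'model': model}, headers={'X-Api-Key': api_key})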
if response.status_code == requests.codes.ok:
    print(response.text)
else:
    print("Error:", response.status_code, response.text)
[{"city_mpg": 18, "class": "midsize car", "combination_mpg": 21, "cylinders": 4, "displacement": 2.2, "drive": "fwd", "fuel_type": "gas", "highway_mpg": 26, "make": "toyota", "model": "camry", "transmission": "a", "year": 1993}, {"city_mpg": 19, "class": "midsize car", "combination_mpg": 22, "cylinders": 4, "displacement": 2.2, "drive": "fwd", "fuel_type": "gas", "highway_mpg": 27, "make": "toyota", "model": "camry", "transmission": "m", "year": 1993}, {"city_mpg": 16, "class": "midsize car", "combination_mpg": 19, "cylinders": 6, "displacement": 3.0, "drive": "fwd", "fuel_type": "gas", "highway_mpg": 22, "make": "toyota", "model": "camry", "transmission": "a", "year": 1993}, {"city_mpg": 16, "class": "midsize car", "combination_mpg": 18, "cylinders": 6, "displacement": 3.0, "drive": "fwd", "fuel_type": "gas", "highway_mpg": 22, "make": "toyota", "model": "camry", "transmission": "m", "year": 1993}, {"city_mpg": 18, "class": "midsize-large station wagon", "combination_mpg": 21, "cylinders": 4, "displacement": 2.2, "drive": "fwd", "fuel_type": "gas", "highway_mpg": 26, "make": "toyota", "model": "camry wagon", "transmission": "a", "year": 1993}]

The JSON output is a list of records, one per model variant, with fields such as fuel economy, engine size, drivetrain, and model year that are useful data points for analysis.
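
To make the structure of a response concrete, here is a minimal sketch that parses two of the Camry records shown above and lists each field with the Python type of its value. The variable names (`sample_text`, `field_types`) are illustrative and not part of the original notebook.

```python
import json

# Two records copied from the Camry response shown above.
sample_text = '''[
  {"city_mpg": 18, "class": "midsize car", "combination_mpg": 21,
   "cylinders": 4, "displacement": 2.2, "drive": "fwd", "fuel_type": "gas",
   "highway_mpg": 26, "make": "toyota", "model": "camry",
   "transmission": "a", "year": 1993},
  {"city_mpg": 18, "class": "midsize-large station wagon", "combination_mpg": 21,
   "cylinders": 4, "displacement": 2.2, "drive": "fwd", "fuel_type": "gas",
   "highway_mpg": 26, "make": "toyota", "model": "camry wagon",
   "transmission": "a", "year": 1993}
]'''

# The response body is a JSON array, so it parses to a list of dicts.
records = json.loads(sample_text)

# Map each field name to the type of its value in the first record.
field_types = {key: type(value).__name__ for key, value in records[0].items()}

for name, type_name in field_types.items():
    print(f"{name}: {type_name}")
```

Note that a query for one model name can return closely related variants: the second record reports its model as `camry wagon`.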

Multiple car models:

In this attempt, I pass the API a longer list of car models from many different OEMs to build a larger dataset to work with.

# Lists of car models for each manufacturer
# The model names were suggested by ChatGPT
# Prompt used: "provide multiple models of these car brands"

toyota_models = ['Camry', 'Corolla', 'RAV4', 'Highlander', 'Prius', 'Tacoma', 'Tundra', 'Yaris', 'Avalon', '4Runner', 'Supra', 'Sienna', 'Land Cruiser', 'C-HR']
tesla_models = ['Model S', 'Model 3', 'Model X', 'Model Y', 'Roadster', 'Cybertruck', 'Semi']
stla_models = ['Ram 1500', 'Jeep Wrangler', 'Fiat 500', 'Dodge Charger', 'Chrysler Pacifica', 'Alfa Romeo Giulia', 'Maserati Ghibli', 'Lancia Ypsilon', 'Dodge Challenger']
gm_models = ['Chevrolet Silverado', 'GMC Sierra', 'Cadillac Escalade', 'Chevrolet Tahoe', 'Buick Enclave', 'GMC Yukon', 'Chevrolet Camaro', 'Chevrolet Malibu', 'Chevrolet Equinox', 'Cadillac CT5']
nissan_models = ['Altima', 'Rogue', 'Leaf', 'Sentra', 'Pathfinder', 'Murano', 'Maxima', 'Frontier', 'Juke', 'Xterra', 'Versa', 'Armada']
mercedes_models = ['C-Class', 'E-Class', 'S-Class', 'GLC', 'GLE', 'A-Class', 'CLS', 'GLS', 'GLA', 'GLB', 'AMG GT', 'SLC']
bmw_models = ['3 Series', 'X5', 'i3', '5 Series', 'X3', 'X1', '2 Series', 'Z4', '7 Series', 'i8', '1 Series', '6 Series']
porsche_models = ['911', 'Cayenne', 'Taycan', 'Panamera', 'Macan', '718 Boxster', '718 Cayman', 'Carrera GT', 'Panamera']
ford_models = ['F-150', 'Mustang', 'Explorer', 'Focus', 'Fiesta', 'Ranger', 'Escape', 'Bronco', 'Edge', 'Expedition']
audi_models = ['A4', 'A6', 'Q5', 'Q7', 'A3', 'A8', 'Q3', 'TT', 'R8', 'A5', 'Q8']
honda_models = ['Civic', 'Accord', 'CR-V', 'Pilot', 'Fit', 'HR-V', 'Odyssey', 'Ridgeline']
volkswagen_models = ['Golf', 'Passat', 'Tiguan', 'Jetta', 'Arteon', 'Atlas', 'Beetle', 'Polo']
hyundai_models = ['Sonata', 'Elantra', 'Tucson', 'Santa Fe', 'Kona', 'Veloster', 'Ioniq', 'Palisade']
kia_models = ['Sorento', 'Optima', 'Sportage', 'Soul', 'Forte', 'Telluride', 'Stinger', 'Niro']
mazda_models = ['MX-5', 'CX-5', 'Mazda3', 'Mazda6', 'CX-30']
subaru_models = ['Outback', 'Forester', 'Impreza', 'BRZ', 'Ascent']
import requests
import json

# Extend the lists with electric and hybrid vehicles
toyota_models += ['Prius Prime', 'RAV4 Prime', 'Mirai']
tesla_models += ['Model 3 Long Range', 'Model 3 Performance', 'Model S Plaid', 'Model X Long Range']
gm_models += ['Chevrolet Bolt EV', 'Chevrolet Bolt EUV', 'GMC Hummer EV', 'Cadillac Lyriq']
nissan_models += ['Ariya', 'Leaf e+', 'e-NV200']
bmw_models += ['i4', 'iX', 'i8 Roadster', '330e', '530e']
mercedes_models += ['EQS', 'EQA', 'EQB', 'EQC', 'EQV']
audi_models += ['e-tron', 'e-tron GT', 'Q4 e-tron', 'A7 Sportback e']
porsche_models += ['Taycan Turbo', 'Taycan Turbo S', 'Taycan 4S', 'Panamera E-Hybrid']
ford_models += ['Mustang Mach-E', 'F-150 Lightning', 'Escape Plug-In Hybrid']
volkswagen_models += ['ID.3', 'ID.4', 'e-Golf', 'Touareg Hybrid']
hyundai_models += ['Kona Electric', 'Ioniq Electric', 'Tucson Plug-In Hybrid']
kia_models += ['Niro EV', 'Soul EV', 'Sorento Plug-In Hybrid']
volvo_models = ['XC40 Recharge', 'C40 Recharge', 'S90 Plug-In Hybrid']
jaguar_models = ['I-PACE']
lucid_models = ['Lucid Air']
rivian_models = ['R1T', 'R1S']
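
The Porsche list above contains 'Panamera' twice. Because the results are later stored in a dictionary keyed by model name, a repeated name only costs an extra request. A minimal sketch of removing repeats while keeping the original order, using a hypothetical helper named `dedupe`:

```python
# A shortened copy of the Porsche list above, including its repeated entry.
porsche_models = ['911', 'Cayenne', 'Taycan', 'Panamera', 'Macan', 'Panamera']


def dedupe(models):
    """Return the models with repeats removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(models))


unique_models = dedupe(porsche_models)
print(unique_models)
```

The same helper could be applied to the combined list of all models before the requests are made.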

The final JSON output contains the data returned for every car model passed to the API GET request.

# Combine all lists into one
# Parentheses let the expression span several lines
all_models = (
    toyota_models + tesla_models + stla_models + gm_models + nissan_models
    + mercedes_models + bmw_models + porsche_models + ford_models + audi_models
    + honda_models + volkswagen_models + hyundai_models + kia_models + mazda_models
    + subaru_models + volvo_models + jaguar_models + lucid_models + rivian_models
)

# Dictionary to store data for all models
all_data = {}

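# Read the API key from an environment variable instead of hard-coding it
# (API_NINJAS_KEY is an assumed variable name; set it to your own key)
import os
api_key = os.environ['API_NINJAS_KEY']
api_url = 'https://api.api-ninjas.com/v1/cars'

# Loop through each model and make the API call
for model in all_models:
    # Passing the model via params URL-encodes names that contain spaces
    response = requests.get(api_url, params={'model': model}, headers={'X-Api-Key': api_key})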

    if response.status_code == requests.codes.ok:
        # Add the response data to the dictionary
        all_data[model] = response.json()
    else:
        print(f"Error for {model}: {response.status_code} {response.text}")

# Convert the dictionary to a JSON string
json_data = json.dumps(all_data, indent=4)

# Write the JSON data to a file
with open('cars.json', 'w') as file:
    file.write(json_data)

print("JSON data saved to cars.json")
JSON data saved to cars.json
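
The loop above prints failures as they happen and sets no timeout on the requests. As a rough sketch, the same loop can be wrapped in a function that records which models failed so they can be retried later. Everything named here (`fetch_models`, `FakeResponse`, `fake_get`, the `timeout` default) is hypothetical; the HTTP function is passed in as `get` so the example runs offline with a stand-in.

```python
def fetch_models(models, get, api_key, timeout=10):
    """Query the cars endpoint once per model.

    Returns (data, failures): data maps model name to the parsed JSON,
    failures maps model name to the HTTP status code that came back.
    """
    url = 'https://api.api-ninjas.com/v1/cars'
    data, failures = {}, {}
    for model in models:
        response = get(url, params={'model': model},
                       headers={'X-Api-Key': api_key}, timeout=timeout)
        if response.status_code == 200:
            data[model] = response.json()
        else:
            failures[model] = response.status_code
    return data, failures


class FakeResponse:
    """Stand-in for a response object, used only for this offline demo."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


def fake_get(url, params, headers, timeout):
    """Pretend only 'Camry' is a valid query."""
    if params['model'] == 'Camry':
        return FakeResponse(200, [{'make': 'toyota', 'model': 'camry'}])
    return FakeResponse(400, {'error': 'bad request'})


data, failures = fetch_models(['Camry', 'Model S'], fake_get, api_key='demo-key')
print(data)
print(failures)
```

In real use, `requests.get` would be passed as `get`; it accepts the same `params`, `headers`, and `timeout` keyword arguments.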

Read Data from JSON File and Save as a DataFrame

import pandas as pd
import json

# Read the JSON file
file_path = 'cars.json'
with open(file_path, 'r') as file:
    data = json.load(file)

# Convert the JSON data to a pandas DataFrame
df_list = []
for model, records in data.items():
    for record in records:
        # Replace the API's 'model' value with the queried model name
        # (variants reported by the API, e.g. 'camry wagon', are overwritten)
        record['model'] = model
        df_list.append(record)

# Create DataFrame from the list of dictionaries
cars_df = pd.DataFrame(df_list)

cars_df
     city_mpg  class                           combination_mpg  cylinders  displacement  drive  fuel_type    highway_mpg  make    model   transmission  year
0    18        midsize car                     21               4.0        2.2           fwd    gas          26           toyota  Camry   a             1993
1    19        midsize car                     22               4.0        2.2           fwd    gas          27           toyota  Camry   m             1993
2    16        midsize car                     19               6.0        3.0           fwd    gas          22           toyota  Camry   a             1993
3    16        midsize car                     18               6.0        3.0           fwd    gas          22           toyota  Camry   m             1993
4    18        midsize-large station wagon     21               4.0        2.2           fwd    gas          26           toyota  Camry   a             1993
...  ...       ...                             ...              ...        ...           ...    ...          ...          ...     ...     ...           ...
714  80        small sport utility vehicle     76               NaN        NaN           4wd    electricity  72           jaguar  I-PACE  a             2021
715  89        small sport utility vehicle     85               NaN        NaN           4wd    electricity  82           jaguar  I-PACE  a             2023
716  79        small sport utility vehicle     76               NaN        NaN           4wd    electricity  72           jaguar  I-PACE  a             2023
717  74        standard pickup truck           70               NaN        NaN           4wd    electricity  66           rivian  R1T     a             2022
718  73        standard sport utility vehicle  69               NaN        NaN           4wd    electricity  65           rivian  R1S     a             2022

719 rows × 12 columns
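
In the table above, row 4 shows the model as Camry even though the API reported that record as camry wagon, because the flattening loop overwrites each record's `model` field with the queried name. If both values are wanted, one option is to store the queried name under a separate key. This is only a sketch: the key name `query_model` and the helper `flatten` are assumptions, not part of the original notebook.

```python
def flatten(data):
    """Turn {queried_name: [records]} into a flat list of row dicts.

    Each row keeps the API's own 'model' value and gains a 'query_model'
    key holding the name that was sent in the request.
    """
    rows = []
    for query_model, records in data.items():
        for record in records:
            # Copy the record so the input dictionary is left unchanged.
            rows.append({**record, 'query_model': query_model})
    return rows


# A tiny sample shaped like the contents of cars.json.
sample = {
    'Camry': [
        {'make': 'toyota', 'model': 'camry', 'year': 1993},
        {'make': 'toyota', 'model': 'camry wagon', 'year': 1993},
    ],
    'R1T': [
        {'make': 'rivian', 'model': 'r1t', 'year': 2022},
    ],
}

rows = flatten(sample)
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed to `pd.DataFrame` exactly as `df_list` is, giving one extra column.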

Save Output

cars_df.to_csv('cars-data.csv')

Resources

  • https://api-ninjas.com/api/cars
  • https://en.wikipedia.org/wiki/List_of_largest_manufacturing_companies_by_revenue