Data

We provide the following (logical) data sets:

  1. flight list information
  2. (portions of) trajectories for the flights in the list
  3. fuel burnt/flown for one or more intervals in each flight
  4. airport coordinates

As an Entity-Relationship (ER) diagram:

erDiagram
    FUEL_READING {
    }
    FLIGHT {
    }
    AIRPORT {
    }
    TRAJECTORY_POSITION {
    }

    
    FUEL_READING }|--||  FLIGHT : has
    FLIGHT       }o--|{  AIRPORT : has
    FLIGHT       ||--|{  TRAJECTORY_POSITION : has 

Figure 1: Entity-Relationship diagram for PRC Data Challenge.

The temporal extent of the datasets covers the period Apr - Oct 2025, split as follows:

The dataset for the final ranking will be completely hidden from participating teams; we will only provide fuel_final_submission.parquet, to be filled in with the predictions of the relevant model.

Fuel Burnt/Flown

We collected fuel information (via Open-Source Intelligence (OSINT)) for flights between April and September 2025.

Dataset:

  1. fuel_train.parquet: for training
  2. fuel_rank_submission.parquet: for submission (fuel_kg values set to 0 [zero])
  3. fuel_final_submission.parquet: for final ranking (fuel_kg values set to 0 [zero])

The columns of the fuel flown/burnt dataset are (d_ stands for delta):

  • idx: row id
  • flight_id: a unique identifier of the flight in the flight list
  • start: a time instant (Coordinated Universal Time (UTC))
  • end: a time instant (UTC)
  • fuel_kg: fuel burnt between start and end
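Because each row covers an interval of arbitrary length, a burn rate per second can be easier to compare across rows than the raw fuel_kg value. The following is a minimal sketch, assuming start and end have already been parsed into timezone-aware datetime objects; the helper name mean_fuel_flow_kg_per_s and the sample values are illustrative and not part of the dataset.

```python
from datetime import datetime, timezone


def mean_fuel_flow_kg_per_s(start: datetime, end: datetime, fuel_kg: float) -> float:
    """Average burn rate over [start, end] in kg/s (hypothetical helper)."""
    duration_s = (end - start).total_seconds()
    if duration_s <= 0:
        raise ValueError("end must be later than start")
    return fuel_kg / duration_s


# Made-up row using the documented column names; values are not from the dataset.
row = {
    "idx": 0,
    "flight_id": "prc770822360",
    "start": datetime(2025, 4, 1, 10, 0, tzinfo=timezone.utc),
    "end": datetime(2025, 4, 1, 10, 30, tzinfo=timezone.utc),
    "fuel_kg": 1800.0,
}

rate = mean_fuel_flow_kg_per_s(row["start"], row["end"], row["fuel_kg"])
print(f"{rate:.2f} kg/s")
```

The rows of the submission files carry fuel_kg set to zero, so a rate computed from them is meaningless until predictions have been filled in.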

Flight list

Dataset:

  1. flightlist_train.parquet: for training
  2. flightlist_rank.parquet: for ranking
  3. flightlist_final.parquet: the flight list of the additional flights of the final phase.

The following is the list of columns in the flight list dataset:

Trajectories

Dataset:

  1. flights_train/<flight_id>.parquet: for training
  2. flights_rank/<flight_id>.parquet: for ranking
  3. flights_final/<flight_id>.parquet: the flight trajectories of the additional flights of the final phase.

We provide a trajectory file for each flight with fuel flow segments. These are (mainly) Automatic Dependent Surveillance–Broadcast (ADS-B) position reports for the flights in the flight list, as available in the OpenSky Network historical database. The trajectories may be incomplete, i.e. portions can be missing. We have augmented them with positional information from the Aircraft Communications Addressing and Reporting System (ACARS); the source column identifies the origin of each record, either adsb or acars.

Each trajectory is described by (units in square brackets)

  • flight_id: an identifier for the flight (details in the flight list dataset), e.g. prc770822360
  • timestamp: timestamp for the position report [UTC]
  • longitude: longitude in Decimal degrees (DD) in [-180, 180] range
  • latitude: latitude in DD in [-90, 90] range
  • altitude: altitude [ft]
  • groundspeed: ground speed [knots, kt]
  • track: track angle in DD
  • vertical_rate: vertical rate of climb/descent [ft/min]
  • mach: the Mach number (from source = acars)
  • typecode: the ICAO aircraft type, e.g. A21N for the Airbus A321neo
  • TAS: True AirSpeed (from source = acars) [kt]
  • CAS: Calibrated AirSpeed (from source = acars) [kt]
  • source: the origin of the information; it can be adsb or acars
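A common first step is to cut a trajectory down to the position reports that fall inside one fuel interval and to convert the aviation units listed above to SI. The sketch below works on plain dictionaries keyed by the documented column names. It assumes that both interval bounds are inclusive and that fields absent from a report are represented as None; the function names and sample values are illustrative only.

```python
from datetime import datetime, timezone

FT_TO_M = 0.3048             # exact, by definition of the international foot
KT_TO_MPS = 1852.0 / 3600.0  # one knot is one nautical mile (1852 m) per hour


def positions_in_interval(positions, start, end):
    """Return reports with start <= timestamp <= end, sorted by time."""
    selected = [p for p in positions if start <= p["timestamp"] <= end]
    return sorted(selected, key=lambda p: p["timestamp"])


def to_si(position):
    """Copy a report and add SI fields; missing values (None) stay None."""
    def scale(value, factor):
        return None if value is None else value * factor

    out = dict(position)
    out["altitude_m"] = scale(position.get("altitude"), FT_TO_M)
    out["groundspeed_mps"] = scale(position.get("groundspeed"), KT_TO_MPS)
    out["vertical_rate_mps"] = scale(position.get("vertical_rate"), FT_TO_M / 60.0)
    return out


def utc(hour, minute):
    """Build a UTC timestamp on an arbitrary, made-up day."""
    return datetime(2025, 4, 1, hour, minute, tzinfo=timezone.utc)


# Made-up reports; values are not taken from the dataset.
trajectory = [
    {"timestamp": utc(10, 40), "altitude": 36000.0, "groundspeed": 455.0,
     "vertical_rate": 0.0, "source": "adsb"},
    {"timestamp": utc(10, 0), "altitude": 30000.0, "groundspeed": 430.0,
     "vertical_rate": 1200.0, "source": "adsb"},
    {"timestamp": utc(10, 10), "altitude": 35000.0, "groundspeed": 450.0,
     "vertical_rate": None, "source": "acars"},
]

segment = positions_in_interval(trajectory, utc(10, 5), utc(10, 30))
si = to_si(segment[0])
print(len(segment), si["altitude_m"], si["groundspeed_mps"])
```

Since trajectories can lack portions, an interval may contain few or no reports, so downstream code should handle an empty result.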

Airports

Dataset: apt.parquet

The airports dataset provides complementary positions as follows:

  • icao: the ICAO code of the airport
  • longitude: the airport longitude in DD in [-180, 180] range
  • latitude: the airport latitude in DD in [-90, 90] range
  • elevation: the airport elevation [ft]
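Airport coordinates can be used, for example, to estimate the great-circle distance between two aerodromes. The sketch below applies the haversine formula on a spherical Earth, which is an approximation; the function name, the mean radius constant, and the two placeholder airport records are assumptions made for illustration and do not come from apt.parquet.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Placeholder records using the documented column names (not real airports).
origin = {"icao": "XXXA", "latitude": 0.0, "longitude": 0.0, "elevation": 0.0}
destination = {"icao": "XXXB", "latitude": 0.0, "longitude": 90.0, "elevation": 0.0}

distance_km = great_circle_km(
    origin["latitude"], origin["longitude"],
    destination["latitude"], destination["longitude"],
)
print(f"{distance_km:.1f} km")
```

The two placeholder points lie a quarter of the equator apart, so the result should equal one quarter of the circumference of the assumed sphere.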

Additional and/or External data

The use of additional and/or external datasets is permitted, provided they are open data and documented.

How to access the datasets

Last year’s challenge website has instructions on how to use the MinIO client to access the OpenSky buckets at

https://ansperformance.eu/study/data-challenge/dc2024/data.html#using-minio-client

List of Acronyms

ADEP: Aerodrome of DEParture

ADES: Aerodrome of DEStination

ADS-B: Automatic Dependent Surveillance–Broadcast

DD: Decimal degrees

ICAO: International Civil Aviation Organization

UTC: Coordinated Universal Time