Sonae supply chain data (DPC1-2017)


June 26, 2017

Sonae (Portugal)


Supply chain data: one large denormalized table with one line per product flow between locations. Although the format is specific to Sonae, datasets of this type are typical of the retail sector. All datasets are created in our operational systems, collected in our on-premises data warehouse, and made available to third parties through Amazon AWS S3/Redshift.

Industry sector


Data Provider Country



The dataset used in the experiment will have a bespoke update frequency, to be agreed with the challenge winner.

Dataset Size

20 TB of compressed data (1/10 compression ratio)
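As a rough planning aid, the arithmetic behind these figures can be sketched as follows. It assumes that the "1/10 ratio" means the compressed data is one tenth of its uncompressed size, that TB is a decimal terabyte, and that the link speed is a hypothetical sustained 1 Gbit/s; none of these assumptions is confirmed by the data provider.

```python
# Back-of-the-envelope sizing sketch; all constants below are assumptions.
COMPRESSED_TB = 20        # stated size of the compressed data
EXPANSION_FACTOR = 10     # assumes "1/10 ratio" = compressed / uncompressed
LINK_GBIT_PER_S = 1       # hypothetical sustained download bandwidth

# Storage needed once the files are decompressed.
uncompressed_tb = COMPRESSED_TB * EXPANSION_FACTOR

# Time to transfer the compressed files over the assumed link.
compressed_bits = COMPRESSED_TB * 10**12 * 8
download_hours = compressed_bits / (LINK_GBIT_PER_S * 10**9) / 3600

print(f"Uncompressed size: about {uncompressed_tb} TB")
print(f"Download time at {LINK_GBIT_PER_S} Gbit/s: about {download_hours:.0f} hours")
```

Under these assumptions, the decompressed data would occupy roughly ten times the stated volume, so storage and transfer should be planned around the compressed files.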

Number of attributes


Format and storage

CSV files stored in Amazon AWS


  • Supply chain data – One denormalized table with one line per product flow between locations
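To illustrate how a denormalized flow table of this shape might be consumed, here is a minimal sketch that streams CSV rows and totals the quantity moved per route. The column names (`product_id`, `origin`, `destination`, `quantity`) and the sample rows are hypothetical; the actual schema is not documented here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real column names and values are not known.
SAMPLE_CSV = """product_id,origin,destination,quantity
P001,WAREHOUSE_A,STORE_1,40
P001,WAREHOUSE_A,STORE_2,25
P002,WAREHOUSE_B,STORE_1,10
P001,WAREHOUSE_A,STORE_1,5
"""

def total_flow_by_route(lines):
    """Sum quantity per (origin, destination) pair, reading one row at a time."""
    totals = defaultdict(int)
    for row in csv.DictReader(lines):
        totals[(row["origin"], row["destination"])] += int(row["quantity"])
    return dict(totals)

# Any iterable of text lines works here, for example an open file handle.
totals = total_flow_by_route(io.StringIO(SAMPLE_CSV))
for route, quantity in sorted(totals.items()):
    print(route, quantity)
```

Because each line already carries every attribute of a flow, no joins are needed, and reading row by row keeps memory use independent of file size, which matters at this data volume.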

Personal data

No personal data present

Synthetic Data

No synthetic data present

Geographic coverage


Timespan & Production

Timespan: Jan 2017 – present
Production: live

Level of aggregation

Raw data

Data access

Bulk download