SPAZIODATI data (dpc3-2017)


June 26, 2017

SpazioDati (Italy)


Information about persons and companies is dispersed across a number of sources. Crawlers collect this information and make it available in different formats. The source types range from basic firmographics to financial data, marketing data, key persons and services offered.

Data size
Hundreds of gigabytes.

Number of attributes

There are different types of entities, and the number of attributes varies by entity type.
As an approximate estimate, each entity has 15 attributes on average.


    • Data from our corporate Web crawl: websites and contact information collected from the websites, e.g., phones, emails, description, links to social web.
    • Data about companies/legal entities: basic firmographics, directors and managers, locations of companies’ sites, matches to the websites, and entities extracted from various textual descriptions.
    • Financial data (e.g. important indicators, ratings)
    • Locations (company headquarters and sites)
    • Keywords extracted from corporate websites

Data Description
Data description is available here.

Data Sample Disclaimer
Several attributes considered sensitive by the data provider are not present in the data sample available for download.
They will be presented to and discussed with the selected applicant during the negotiation phase.

Data Format

Data are serialized as JSON.
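
As a rough illustration of working with this format, the sketch below parses a few records and counts their top-level attributes. It assumes one JSON object per line; the actual file layout is not specified in this description, and every field name and value in the sample is hypothetical.

```python
import json

# Hypothetical sample: one JSON object per line. The real layout and
# field names are assumptions, not taken from the data description.
SAMPLE = """\
{"id": "c1", "name": "Example Srl", "website": "http://example.com", "phones": ["+39 000 000"]}
{"id": "c2", "name": "Sample Ltd", "keywords": ["software", "data"]}
"""

def load_entities(text):
    """Parse newline-delimited JSON into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def attribute_counts(entities):
    """Return the number of top-level attributes of each entity."""
    return [len(entity) for entity in entities]

entities = load_entities(SAMPLE)
counts = attribute_counts(entities)
average = sum(counts) / len(counts)  # average attributes per entity
```

Because entities of different types carry different attributes, code reading the data should not assume that any given key is present in every record.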

Personal Data

The dataset contains pseudonymized data derived from personal data.

Synthetic Data

No synthetic data are present.

Geographic coverage

Italy and UK

Level of aggregation

Access is to raw-level data; no aggregation is performed before the analysis.
Ranges of values might be provided instead of actual values for the most sensitive parameters.
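
One way an exact value could be replaced by a range is sketched below. The provider's actual scheme is not described, so the bucket boundaries and the `to_range` helper are purely hypothetical.

```python
from bisect import bisect_right

# Hypothetical bucket boundaries, e.g. for a revenue-like figure.
BOUNDARIES = [0, 100_000, 1_000_000, 10_000_000]

def to_range(value, boundaries=BOUNDARIES):
    """Return the half-open bucket (low, high) containing value.

    high is None for the open-ended top bucket.
    """
    if value < boundaries[0]:
        raise ValueError("value below the lowest boundary")
    i = bisect_right(boundaries, value)
    low = boundaries[i - 1]
    high = boundaries[i] if i < len(boundaries) else None
    return (low, high)
```

Under these assumed boundaries, `to_range(250_000)` returns `(100000, 1000000)`, so an analysis would see only the bucket and not the exact figure.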

Data access

  • Subject to negotiation