Dataset: Units
Abstract
This dataset contains standardised information on units.
Executive summary
Data overview
The reference dataset for units contains a collection of units used by other datasets. Data sources in their original state were merged into a single table. If available, identifiers and descriptions of the original datasets are maintained to guarantee their integrity and to enable linkage across datasets.
Data value
The dataset provides a unique identifier for all units to be integrated on the platform. It is required by the EU Pollinator Hub (EUPH) for administrative purposes and fulfils an important role in data standardisation across datasets.
Data description
The dataset contains 1 table. Units and their descriptions were obtained from 4 sources: the United Nations Statistics Division (UNSD), the Bureau International des Poids et Mesures (BIPM), the Statistics Division of the Food and Agriculture Organization of the United Nations (FAOSTAT) and the International Organization for Standardization (ISO).
Data application
This data will be used for the standardisation of data integrated on the EUPH by data providers.
Unresolved issues
Introduction
Standardisation is an important goal of the EU Pollinator Hub (EUPH), and existing standards are prioritised to achieve it. Variables are described by numerical facts, which we call data. Data is commonly expressed in a unit, which is one of many attributes that may be used to classify data. On the EUPH, a unit is treated as metadata. Another important goal of the EUPH is to allow users to link data from different sources. It is therefore necessary to provide a standardised description of the data, including units.
The general policy of the EUPH is not to modify raw data once it has been prepared for integration. Since the EUPH aims to allow analysis and visualisation of data on the spot, converting the data to a common unit may represent a major challenge, both in terms of usability during data integration and of the time required for data processing. It has therefore been decided that data will be standardised during the integration process. When data is integrated on the platform, data providers select the unit in which the data is expressed from a preconfigured set of units (base units and derived units, as well as decimal multiples and sub-multiples of these units). If the preconfigured set of units does not meet the requirements of the data provider, the provider has to transform the data to an accepted unit before integration. There is a potentially unlimited number of derived units. Depending on the future management of the EUPH, users might therefore also request the integration of an additional unit into the EUPH.
The present dataset contains a preconfigured set of units used by international standardising organisations (SI, ISO, United Nations) and the information required for the conversion of units into base units. This allows linkage and visualisation in a variety of units on the EUPH.
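To illustrate how conversion information of this kind could be used, the following minimal Python sketch converts values between units that share a base unit. It is not the EUPH implementation: the field names (symbol, base_symbol, factor), the function names and the small in-memory table are assumptions made for this example only, and the sketch covers purely multiplicative conversions (no offsets, such as those needed for temperature scales).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical unit record; field names are illustrative only."""
    symbol: str        # symbol of the unit, e.g. "g"
    base_symbol: str   # symbol of the base unit it converts to, e.g. "kg"
    factor: float      # value_in_base_unit = value * factor


# Small illustrative table; the real dataset holds many more units.
UNITS = {
    "kg": Unit("kg", "kg", 1.0),
    "g": Unit("g", "kg", 1e-3),
    "mg": Unit("mg", "kg", 1e-6),
    "m": Unit("m", "m", 1.0),
    "km": Unit("km", "m", 1000.0),
}


def to_base(value: float, symbol: str) -> tuple[float, str]:
    """Convert a value to its base unit and return (value, base symbol)."""
    unit = UNITS[symbol]
    return value * unit.factor, unit.base_symbol


def convert(value: float, source: str, target: str) -> float:
    """Convert between two units by going through their common base unit."""
    src, dst = UNITS[source], UNITS[target]
    if src.base_symbol != dst.base_symbol:
        raise ValueError(f"cannot convert {source} to {target}: different base units")
    return value * src.factor / dst.factor


if __name__ == "__main__":
    print(to_base(5, "km"))
    print(convert(2500, "g", "kg"))
```

Storing one factor per unit relative to its base unit means that any pair of compatible units can be converted without keeping a separate factor for every pair, and the raw data itself does not need to be modified.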
Material and methods
Data acquisition
Raw data integrated into this dataset comes from 4 different sources on the EUPH.
Data preparation
Data was copied into a Microsoft® Excel® worksheet (Microsoft Corporation, Version 2210 Build 16.0.15726.20188). Prepared data was then converted to CSV and imported for profiling into a SQL database (MariaDB Foundation, server version 10.4.24) running in a XAMPP environment (BitRock, version 3.3.0). Data was exported from the MariaDB database to CSV format using UTF-8 encoding. Data profiling was performed according to SOP-005 Data profiling using phpMyAdmin (version 5.2.0).
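Unit symbols often contain non-ASCII characters (for example µ or °), so a CSV export should be written and read with an explicit UTF-8 encoding. The following Python sketch shows such a round trip using only the standard library. It does not reproduce the procedure described above, which used Excel and phpMyAdmin; the column names (unit_id, symbol, description) and the file name are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column names and example rows, for illustration only.
FIELDNAMES = ["unit_id", "symbol", "description"]
rows = [
    {"unit_id": "1", "symbol": "µg", "description": "microgram"},
    {"unit_id": "2", "symbol": "°C", "description": "degree Celsius"},
]


def write_units_csv(path, records):
    """Write records to CSV with an explicit UTF-8 encoding."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(records)


def read_units_csv(path):
    """Read the CSV back, decoding it as UTF-8."""
    with open(path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "units.csv"
    write_units_csv(csv_path, rows)
    round_tripped = read_units_csv(csv_path)
```

Comparing round_tripped with rows is a simple way to check that symbols survive the export unchanged.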
Data validation
None
Data analysis
None