Dataset: B-GOOD Weather Data

Dataset from the B-GOOD project, containing the relevant weather data at the study sites.
Published: 2025-03-17
Compliance with FAIR* principles
Findable
Accessible
Interoperable
Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles
Data Quality
Requires major revision
Data

Dataset tables

Supplemental Files

Any supplemental files, not containing data.

Description File Details
Dataset Report
This file contains in detail the structure of the dataset.
This is a generated file.
Licence
This file contains dataset licencing information.
This is a generated file.
Readme
The file contains basic information about the dataset.
This is a generated file.
meta-data-weather-tier-1-b-good.xlsx
Metadata describing the raw data.
Media type: Document
File size: 186.62 KiB
About

Abstract

The dataset contains data from weather stations in the vicinity of the test apiaries located in Belgium, Switzerland, Germany, France, the Netherlands, Portugal, Romania and the United Kingdom in 2020 and 2021. It was published by van Dooremalen C (WR) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Executive summary

Data overview

The data was published by van Dooremalen C (WR) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. The dataset contains...

Data value

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the HSI; (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description

n/a

Data application

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not comply with the FAIR Guiding Principles for scientific data management and stewardship applied on the EU Pollinator Hub. More descriptive information about the context, quality and condition, or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate and relevant attributes must also be provided, e.g. a description of the scope of the data, any particularities or limitations of the data that other users should be aware of, the date of generation or collection of the data, the lab conditions, who prepared the data, the parameter settings, the name and version of the software used, whether the data is raw or processed, and an explanation of all variable names that are not self-explanatory. The dataset requires major revisions by the data provider.

Unresolved issues

n/a

Introduction

n/a

Material and methods

Data acquisition

All raw data files were downloaded from the B-GOOD Bee Health Data Portal on 2024-10-11.

List of raw data obtained from the data provider.

  1. Archive weather-data-2020.zip accessed on 2024-10-11 06:33:06, provided by B-GOOD Bee Health Data Portal
  2. Archive weather-data-2021.zip accessed on 2024-10-11 06:33:06, provided by B-GOOD Bee Health Data Portal
  3. File meta-data-weather-tier-1-b-good.xlsx accessed on 2024-10-11 06:33:06, provided by B-GOOD Bee Health Data Portal

Metadata was obtained from the dataset's web page.

Data preparation

All files in the zip-archives were extracted using File Explorer (Microsoft Corporation, version 22H2).

Each raw data file was imported into MS Excel (Microsoft Corporation, version 2409), where a first assessment of the existing data was made. Based on this assessment, a data mapping file was constructed in which each column in the raw data files was assigned to a column with a common name (header), definition, unit and data type that applied to the presumed content of that column. The metadata file meta-data-weather-tier-1-b-good.xlsx was used as a guideline. Subsequently, each column header in the raw data files was replaced by the relevant common column header.
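The header substitution described above was carried out in MS Excel. As an illustration only, the same idea can be sketched in Python; the raw header names and the contents of COLUMN_MAP below are hypothetical and are not taken from the actual data mapping file.

```python
import csv
import io

# Hypothetical mapping from raw headers to common headers.
# The raw names on the left are assumptions made for this example.
COLUMN_MAP = {
    "Temp (C)": "temperature",
    "Humidity": "RH",
    "Wind speed m/s": "wind_speed_ms",
}


def rename_headers(raw_csv_text, column_map):
    """Return CSV text whose header row uses the common column names.

    A raw header without a mapping entry raises KeyError, so that no
    column is carried over under a name that was never assessed.
    """
    rows = list(csv.reader(io.StringIO(raw_csv_text)))
    header, data = rows[0], rows[1:]
    unmapped = [h for h in header if h not in column_map]
    if unmapped:
        raise KeyError(f"no mapping for raw columns: {unmapped}")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column_map[h] for h in header])
    writer.writerows(data)  # data rows are copied unchanged
    return out.getvalue()


raw = "Temp (C),Humidity\n12.5,80\n"
renamed = rename_headers(raw, COLUMN_MAP)
print(renamed)
```

Failing on unmapped headers is a design choice for this sketch: it forces every raw column to be assigned explicitly before the file is processed further.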

All processed raw data files were then exported from MS Excel in CSV format (UTF-8 encoding) and imported into an SQL database (MariaDB Foundation, server version 10.4.32) running in an XAMPP environment (BitRock, version 5.2.1). Depending on the year of data acquisition, the records were then divided into one table that contained only data from 2020 and one table that contained data from 2021 (including 2 records from 2022). Each table was then exported to the preparatory files weather 2020PREPMR241014.csv and weather 2021PREPMR241014.csv, respectively, which were subsequently imported into the EU Pollinator Hub.
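The split by acquisition year was done in the SQL database. A minimal Python sketch of the same rule is given below for illustration; the key name "datetime" and the timestamp format are assumptions, and the sample records are invented.

```python
from collections import defaultdict
from datetime import datetime


def table_year(timestamp):
    """Return the table (2020 or 2021) that a record belongs to.

    Records from 2022 are assigned to the 2021 table, as described
    above. Any other year is rejected.
    """
    year = datetime.fromisoformat(timestamp).year
    if year == 2020:
        return 2020
    if year in (2021, 2022):
        return 2021
    raise ValueError(f"unexpected year in timestamp: {timestamp}")


def split_by_year(records, time_key="datetime"):
    """Group records into one list per target table."""
    tables = defaultdict(list)
    for record in records:
        tables[table_year(record[time_key])].append(record)
    return dict(tables)


# Invented sample records; "datetime" is a hypothetical column name.
records = [
    {"datetime": "2020-06-01 12:00:00", "temperature": "18.2"},
    {"datetime": "2021-06-01 12:00:00", "temperature": "19.0"},
    {"datetime": "2022-01-01 00:00:00", "temperature": "3.1"},
]
tables = split_by_year(records)
print({year: len(rows) for year, rows in tables.items()})
```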

Data was then exported to the respective preparatory files and uploaded to the EU Pollinator Hub according to SOP-017 (Dataset integration).

Data validation

No data validation was performed.

Data analysis

No data analysis was performed.

References

  1. Dooremalen C. 2024 WP1 Tier 1 weather data B-GOOD. B-GOOD Bee Health Data Portal. [2024-10-28] beehealthdata.org
Issues
Unresolved quality issues for data 2021 all countries
  1. For the device located in Romania, 9,095 records contain values > 100 °C in the columns temperature, temperature_min, temperature_max, feelslike and dew_point. These records must be revised by the data provider.
  2. The description of the data (metadata) is largely incomplete and allows no clear standardisation of the data.
    • For column temperature it is unclear how the temperature is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column feelslike it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For column dew_point it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For column RH it is unclear how the relative humidity is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For columns atmpressure_pa and atmpressureh_pa it is unclear how the atmospheric pressure is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For data reported in columns atmpressure_sealevel_Pa and atmpressure_sealevel_hPa it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For data reported in column atmpressure_grndlevel_hPa it is unclear what is reported in the raw data file.
    • For data reported in column rain_counter it is unclear what is reported in the raw data file.
    • For data reported in column rain_max it is unclear what is reported in the raw data file.
    • For data reported in column valid_ticks it is unclear what is reported in the raw data file.
    • For columns wind_speed_ms and wind_speed_kmh it is unclear how the wind speed is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For columns wind_gust_ms and wind_gust_kmh it is unclear how the wind gust is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column wind_deg it is unclear how the wind direction is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column irradiance it is unclear how the solar irradiance is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column energy density it is unclear how the rate of solar radiation is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For data reported in column clouds it is unclear what is reported in the raw data file.
    • For data reported in column visibility it is unclear what is reported in the raw data file.
    • For data reported in column carbon_dioxide it is unclear what is reported in the raw data file. Since this column does not contain data in either of the two tables, it could also be deleted.
    • For data reported in column payload it is unclear what is reported in the raw data file.
    • For data reported in column time_sync_error_s it is unclear what is reported in the raw data file.
    • For data reported in column seq_number_modem it is unclear what is reported in the raw data file.
    • For data reported in column seq_number_firmware it is unclear what is reported in the raw data file. Since this column does not contain data in either of the two tables, it could also be deleted.
    • For data reported in column temperature_wetbulb_stull2011_C it is unclear what is reported in the raw data file.
  3. Data in raw data files acquired in 2021 contain:
    • 8 quadruplicate records for the same date and time in raw data files from the device in Romania. These records must be revised by the data provider.
    • 76 triplicate records for the same date and time (3 x 23 records in raw data files from the device in Switzerland, 3 x 53 records in raw data files from the device in Romania). These records must be revised by the data provider.
    • 1763 duplicate records for the same date and time (2 x 293 records in raw data files from the device in Switzerland, 2 x 1470 records in a raw data file from the device in Romania). These records must be revised by the data provider.
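Repeated records of this kind can be located by counting how often each combination of device and timestamp occurs. The sketch below is one possible way to do so; the key names "device" and "datetime" are assumptions, the sample records are invented, and the function counts distinct timestamps per multiplicity, which is how the figures above are read here.

```python
from collections import Counter


def repeated_timestamps(records, device_key="device", time_key="datetime"):
    """Count (device, timestamp) combinations by how often they occur.

    Returns a dict such as {2: n, 3: m}, where n is the number of
    distinct combinations that occur twice, m the number that occur
    three times, and so on. Combinations occurring once are ignored.
    """
    per_key = Counter((r[device_key], r[time_key]) for r in records)
    by_multiplicity = Counter(n for n in per_key.values() if n > 1)
    return dict(by_multiplicity)


# Invented sample: one triplicate (RO) and one duplicate (CH).
records = [
    {"device": "RO", "datetime": "2021-05-01 10:00"},
    {"device": "RO", "datetime": "2021-05-01 10:00"},
    {"device": "RO", "datetime": "2021-05-01 10:00"},
    {"device": "CH", "datetime": "2021-05-01 10:00"},
    {"device": "CH", "datetime": "2021-05-01 10:00"},
    {"device": "CH", "datetime": "2021-05-01 11:00"},
]
counts = repeated_timestamps(records)
print(counts)
```

Grouping by device as well as timestamp keeps simultaneous readings from different stations from being counted as repeats.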
Unresolved quality issues for data 2020 all countries
  1. For the device located in Romania, 21,694 records contain values > 100 °C in the columns temperature, temperature_min, temperature_max, feelslike and dew_point. These records must be revised by the data provider.
  2. In 3 records from the device in Belgium, values are out of range. These records must be revised by the data provider.
  3. The description of the data (metadata) is largely incomplete and allows no clear standardisation of the data.
    • For column temperature it is unclear how the temperature is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column feelslike it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For column dewpoint it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For column RH it is unclear how the relative humidity is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For columns atmpressure_pa and atmpressureh_pa it is unclear how the atmospheric pressure is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For data reported in columns atmpressure_sealevel_Pa and atmpressure_sealevel_hPa it is unclear which algorithm was used to calculate the values reported in the raw data files.
    • For data reported in column atmpressure_grndlevel_hPa it is unclear what is reported in the raw data file.
    • For data reported in column rain_counter it is unclear what is reported in the raw data file.
    • For data reported in column rain_max it is unclear what is reported in the raw data file.
    • For data reported in column valid_ticks it is unclear what is reported in the raw data file.
    • For columns wind_speed_ms and wind_speed_kmh it is unclear how the wind speed is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For columns wind_gust_ms and wind_gust_kmh it is unclear how the wind gust is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column wind_deg it is unclear how the wind direction is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column irradiance it is unclear how the solar irradiance is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For column energy density it is unclear how the rate of solar radiation is reported in the individual raw data files (either measured at a single point in time during the reporting interval, for example at the very end, or calculated as the arithmetic mean of all measurements or of a subset of measurements in that interval).
    • For data reported in column clouds it is unclear what is reported in the raw data file.
    • For data reported in column visibility it is unclear what is reported in the raw data file.
    • For data reported in column carbon_dioxide it is unclear what is reported in the raw data file. Since this column does not contain data in either of the two tables, it could also be deleted.
    • For data reported in column payload it is unclear what is reported in the raw data file.
    • For data reported in column time_sync_error_s it is unclear what is reported in the raw data file.
    • For data reported in column seq_number_modem it is unclear what is reported in the raw data file.
    • For data reported in column seq_number_firmware it is unclear what is reported in the raw data file. Since this column does not contain data in either of the two tables, it could also be deleted.
    • For data reported in column temperature_wetbulb_stull2011_C it is unclear what is reported in the raw data file.
  4. Data in raw data files acquired in 2020 contain:
    • 9 triplicate records for the same date and time (3 x 8 records in raw data files from the device in Switzerland, 3 x 1 record in a raw data file from the device in Romania). These records must be revised by the data provider.
    • 413 duplicate records for the same date and time (2 x 272 records in raw data files from the device in Switzerland, 2 x 141 records in a raw data file from the device in Romania). These records must be revised by the data provider.
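The records with temperature values above 100 °C can be flagged with a simple range check over the five temperature-like columns named above. The following is a minimal sketch under stated assumptions: values are read as text from a CSV export, empty or non-numeric cells are skipped, and the sample records are invented. It does not reproduce the procedure used to obtain the counts reported here.

```python
# Column names as listed in the quality issues above.
TEMPERATURE_COLUMNS = (
    "temperature",
    "temperature_min",
    "temperature_max",
    "feelslike",
    "dew_point",
)


def exceeds_limit(record, limit_c=100.0, columns=TEMPERATURE_COLUMNS):
    """Return True if any temperature-like column is above limit_c.

    Missing, empty or non-numeric cells are skipped rather than flagged.
    """
    for column in columns:
        try:
            value = float(record.get(column, ""))
        except (TypeError, ValueError):
            continue
        if value > limit_c:
            return True
    return False


# Invented sample records for illustration.
records = [
    {"temperature": "21.4", "dew_point": "12.0"},
    {"temperature": "2140", "dew_point": "1200"},
    {"temperature": "", "feelslike": "n/a"},
]
flagged = [r for r in records if exceeds_limit(r)]
print(len(flagged))
```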
Properties

Unique identifier

[BGDWT180.0.0]

EUPH IRI

https://app.pollinatorhub.eu/dataset-discovery/BGDWT180.0.0

Status

Quality Validated

Peer review

No peer review was requested.

DOI

No DOI available.

Published

2025-03-17

Access rights

Open

Keywords

Apis mellifera, honey bee, weather

Regions the data was collected in

Belgique/België, Deutschland, France, Nederland, Portugal, România, Schweiz/Suisse/Svizzera, United Kingdom
Citation
B-GOOD Bee Health Data Portal 2025 Dataset from the B-GOOD project, containing the relevant weather data at the study sites. EU Pollinator Hub. [2025-04-25] app.pollinatorhub.eu
Contact
No public contact details available.
Metrics

Total views

58

Total downloads

7