Dataset: B-GOOD WP1 Tier 2 Data 2022

Dataset from the B-GOOD project, containing data from Tier 2 studies performed in Work Package 1 in 2022
Published: 2025-03-17
Compliance with FAIR* principles
Findable
Accessible
Interoperable
Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles
Data Quality
Requires major revision
Data

Dataset tables

Columns

Table Description Rows Data Points Downloads

Supplemental Files

Any supplemental files, not containing data.

Columns

File Name Description File Details
Licence
This file contains dataset licencing information.
This is a generated file.
Readme
The file contains basic information about the dataset.
This is a generated file.
inspections-PREP_MR_241022.xlsx
n/a
Media type: Document
File size: 692.00 KiB
About

Abstract

The dataset contains data from Tier 2 studies performed in Work Package 1 in 2021 to 2023 in Switzerland, Germany, Finland, Italy and the Netherlands. It contains metadata on study sites (locations of apiaries and honey bee colonies), on test beehives (type and equipment of the beehives, queens), on colonies (condition), on samples taken from the beehives (sample identifiers), on users of software used to acquire data (user consent, user identifiers) and on hive remote sensor data (data availability), as well as data from hive remote sensors (hive weight, hive temperature, ambient temperature). It was published by van Dooremalen C (WR) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Executive summary

Data overview

The data was published by van Dooremalen C (WR) on the [B-GOOD Bee Health Data Portal](https://beehealthdata.org/datasets/024fcb7c-e359-48fb-a432-961065c63afa) as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Data value

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the HSI; (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description

n/a

Data application

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not comply with the FAIR Guiding Principles for scientific data management and stewardship as applied on the EU Pollinator Hub. More descriptive information about the context, quality and condition, or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate and relevant attributes must also be provided, e.g. a description of the scope of the data, any particularities or limitations that other users should be aware of, the date of generation or collection of the data, the lab conditions, who prepared the data, the parameter settings, the name and version of the software used, whether the data is raw or processed, and an explanation of all variable names that are not self-explanatory. The dataset requires major revisions by the data provider.

Unresolved issues

n/a

Introduction

n/a

Material and methods

Data acquisition

All raw data files were downloaded from the website of the B-GOOD Bee Health Data Portal on 2024-10-16 10:57:07.

List of raw data obtained from the data provider.

  1. Archive sensor-data-2021-01-01-2023-01-01-tier-2.zip accessed on 2024-10-16 10:57:07, provided by B-GOOD Bee Health Data Portal
  2. File meta-data-and-inspections-2021-2022-tier-2.xlsx accessed on 2024-10-16 10:57:07, provided by B-GOOD Bee Health Data Portal
  3. File meta-data-and-inspections-2023-tier-2.xlsx accessed on 2024-10-16 10:57:07, provided by B-GOOD Bee Health Data Portal

Data preparation

All files in the zip-archives were extracted using File Explorer (Microsoft Corporation, version 22H2).

All sensor data files stored in the archive sensor-data-2021-01-01-2023-01-01-tier-2.zip were merged using the Python script MergeCsv.py. This script assesses all column names that are being used in a list of selected files and writes the content of selected files in the correct column of a single output file.
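
The script MergeCsv.py itself is not part of this report. The following is only a minimal sketch of the merging approach described above, assuming comma-separated, UTF-8 encoded input files with a header row; the function name merge_csv_files is hypothetical.

```python
import csv


def merge_csv_files(input_paths, output_path):
    """Merge CSV files with differing headers into a single CSV file.

    Hypothetical helper illustrating the approach described in the text;
    it is not the original MergeCsv.py.
    """
    # First pass: collect every column name used in any input file,
    # keeping the order in which the names are first seen.
    columns = []
    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)

    # Second pass: write every row under the matching column names.
    # Columns that a file does not contain are left empty (restval).
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=columns, restval="")
        writer.writeheader()
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    writer.writerow(row)
    return columns
```

Reading the files twice keeps memory use low, since only the column names are held in memory while the rows are streamed to the output file.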

File meta-data-and-inspections-2021-2022-tier-2.xlsx and file meta-data-and-inspections-2023-tier-2.xlsx were processed with MS Excel (Microsoft Corporation, version 2409). Date formats were changed to ISO 8601 format where necessary. The processed raw data files were exported as preparatory files from MS Excel in CSV format (UTF-8 encoding) and imported into Notepad++ (version 8.7), where missing values were substituted by {NULL} using regular expressions.
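
These steps were carried out manually in MS Excel and Notepad++. For readers who prefer a scripted equivalent, the sketch below shows the same two transformations (ISO 8601 dates and {NULL} for missing values) applied field by field in Python instead of with regular expressions. The source date format "%d/%m/%Y" and the function names are assumptions for illustration only; the actual formats in the raw files are not documented here.

```python
from datetime import datetime

NULL_MARKER = "{NULL}"  # placeholder for missing values named in the text


def to_iso_date(value, source_format="%d/%m/%Y"):
    """Convert a date string to ISO 8601 (YYYY-MM-DD).

    The default source_format is an assumption, not taken from the dataset.
    """
    return datetime.strptime(value, source_format).date().isoformat()


def normalise_row(row, date_columns, source_format="%d/%m/%Y"):
    """Return a copy of a dict row with ISO dates and explicit null markers."""
    cleaned = {}
    for name, value in row.items():
        text = (value or "").strip()
        if text == "":
            # Empty or whitespace-only cells are treated as missing values.
            cleaned[name] = NULL_MARKER
        elif name in date_columns:
            cleaned[name] = to_iso_date(text, source_format)
        else:
            # All other values are passed through unchanged.
            cleaned[name] = value
    return cleaned
```

A row read with csv.DictReader can be passed directly to normalise_row, with date_columns set to the names of the columns that hold dates (for example {"Created at"}).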

The worksheet Inspections in files meta-data-and-inspections-2021-2022-tier-2.xlsx and meta-data-and-inspections-2023-tier-2.xlsx contains 491 columns. Many of these columns seem to be nested. Files with this kind of data structure require a major transformation before they can be ingested into the EUPH data model. Given that the metadata required for such a transformation has not been provided, this worksheet was not processed. Instead, the raw data file was uploaded as an attached file.

Data validation

No data validation was performed.

Data analysis

No data analysis was performed.

References

  1. van Dooremalen C. 2024. WP1 Tier 2 data 2022 B-GOOD. B-GOOD Bee Health Data Portal. [2024-10-17] beehealthdata.org
Issues
Unresolved quality issues for hive sensors
  1. For some columns it is possible to guess the meaning based on the context (columns Battery voltage (V) (bv), Weight (kg) (weightkg), T 0 (°C) (t0), T 1 (°C) (t1)). However, for most columns it is unclear. Since it was decided to include this data in the data set, it should be explained in order to allow reuse of the data. Alternatively, all sensor outputs that are not considered necessary could be removed.
  2. It is unclear in which unit the received signal strength is given: dBm or RSSI.
  3. There are weight measurements which are unrealistically low or high. The data should be checked for outliers. If the issue can be resolved, it should be resolved. If it cannot be resolved, it should be documented in the dataset report and mentioned in the description of the table.
  4. There are temperature measurements which are unrealistically low or high. The data should be checked for outliers. If the issue can be resolved, it should be resolved. If it cannot be resolved, it should be documented in the dataset report and mentioned in the description of the table.
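
As a starting point for the outlier check requested in items 3 and 4, a simple range check could look like the sketch below. The plausibility limits are hypothetical placeholders, not values from the data provider, and would need to be agreed with domain experts; the column names weightkg and t0 are taken from item 1.

```python
# Hypothetical plausibility limits (assumptions, to be replaced by agreed values).
PLAUSIBLE_RANGES = {
    "weightkg": (0.0, 200.0),  # hive weight in kg
    "t0": (-40.0, 60.0),       # temperature in degrees Celsius
}


def flag_out_of_range(values, lower, upper):
    """Return the indices of values outside [lower, upper].

    Missing values (None) are skipped rather than flagged.
    """
    return [
        index
        for index, value in enumerate(values)
        if value is not None and not (lower <= value <= upper)
    ]


def check_record(record, ranges=PLAUSIBLE_RANGES):
    """Return the names of columns in a record whose value is implausible."""
    flagged = []
    for column, (lower, upper) in ranges.items():
        value = record.get(column)
        if value is not None and not (lower <= value <= upper):
            flagged.append(column)
    return flagged
```

Flagged measurements are only candidates for review. Whether they are corrected, removed or documented remains a decision for the data provider.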
Unresolved quality issues for apiaries 2021-2022
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for apiaries 2023
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for hives 2021-2022
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns Color (HEX), Queen fertilized, Brood layers, Honey layers, Frames total, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for hives 2023
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns Color (HEX), Queen fertilized, Brood layers, Honey layers, Frames total, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor devices 2021-2022
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns boot_count, measurement_transmission_ratio, ble_pin, next_downlink_message, last_downlink_result, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor devices 2023
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns boot_count, measurement_transmission_ratio, ble_pin, next_downlink_message, last_downlink_result, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sample codes 2021-2022
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns Test type, Test date, Test result, Test lab name, Updated at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor definitions 2021-2022
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns offset, multiplier, input_measurement, output_measurement, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor definitions 2023
  1. Although the meaning of many columns can be guessed from context, the meaning of some columns (in particular the columns offset, multiplier, input_measurement, output_measurement, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor flashlogs 2021-2022
  1. Although the meaning of some columns can be guessed in context, no header has been provided and the meaning of most columns is therefore unclear. Headers should be provided and explained in order to allow reuse of the data.
Unresolved quality issues for sensor flashlogs 2023
  1. Although the meaning of some columns can be guessed in context, no header has been provided and the meaning of most columns is therefore unclear. Headers should be provided and explained in order to allow reuse of the data.
Unresolved quality issues for user consents 2021-2022
  1. Although the meaning of many columns can be guessed in context, the meaning of column Consent (0=no, 1=yes) is not entirely clear. Since it was decided to include this data in the data set, it should be explained in order to allow reuse of the data.
Unresolved quality issues for user consents 2023
  1. Although the meaning of many columns can be guessed in context, the meaning of column Consent (0=no, 1=yes) is not entirely clear. Since it was decided to include this data in the data set, it should be explained in order to allow reuse of the data.
Properties

Unique identifier

[BGDWP187.0.0]

EUPH IRI

https://app.pollinatorhub.eu/dataset-discovery/BGDWP187.0.0

Status

Quality Validated

Peer review

No peer review was requested.

DOI

No DOI available.

Published

2025-03-17

Access rights

Open

Keywords

colonies, hive sensor, honey bee

Regions the data was collected in

Deutschland, Italia, Nederland, Schweiz/Suisse/Svizzera, Suomi/Finland
Citation
B-GOOD Bee Health Data Portal. Dataset from the B-GOOD project, containing data from Tier 2 studies performed in Work Package 1 in 2022. EU Pollinator Hub. [2026-02-24] app.pollinatorhub.eu
Contact
No public contact details available.
Metrics

Total views

412

Total downloads

62