Dataset: B-GOOD WP1 Tier 1 Data 2020

Dataset from the B-GOOD project, containing data from Tier 1 studies performed in Work Package 1 in 2020
Published: 2025-03-17
Compliance with FAIR* principles
Findable
Accessible
Interoperable
Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles
Data Quality
Requires major revision
Data

Dataset tables


Supplemental Files

Any supplemental files, not containing data.

  1. 2-inspection-data-compiled-16032022.xlsx: n/a (Media type: Document; File size: 309.57 KiB)
  2. Dataset Report: This file describes the structure of the dataset in detail. This is a generated file.
  3. Licence: This file contains dataset licensing information. This is a generated file.
  4. Readme: This file contains basic information about the dataset. This is a generated file.
About

Abstract

Dataset B-GOOD WP1 Tier 1 Data 2020 collects data from Tier 1 studies performed in Work Package 1 in 2020 in Belgium, Switzerland, Germany, France, the United Kingdom, the Netherlands, Portugal and Romania. It contains metadata on study sites (locations of apiaries and honey bee colonies), on test beehives (type and equipment of the beehives, queens), on colonies (condition), on samples taken from the beehives (sample identifiers) and on hive remote sensor data (data availability), as well as data from hive remote sensors (hive weight, hive temperature, ambient temperature). It was published by van Dooremalen C (WR) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. Access is granted based on approved B-GOOD Intentions to Publish.

Executive summary

Data overview

The data was published by van Dooremalen C (WR) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Data value

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the HSI; (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description

n/a

Data application

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not comply with the FAIR Guiding Principles for scientific data management and stewardship as applied on the EU Pollinator Hub. More descriptive information about the context, quality and condition, or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate and relevant attributes must also be provided, e.g. a description of the scope of the data, any particularities or limitations that other users should be aware of, the date of generation or collection of the data, the lab conditions, who prepared the data, the parameter settings, the name and version of the software used, whether the data is raw or processed, and an explanation of all variable names that are not self-explanatory. The dataset requires major revisions by the data provider.

Unresolved issues

n/a

Introduction

n/a

Material and methods

Data acquisition

All raw data files were downloaded from the website of the B-GOOD Bee Health Data Portal on 2024-10-16 10:43:43.

List of raw data obtained from the data provider.

  1. Archive 3-sensor-data-2020-tier1-bgood-21092022.zip accessed on 2024-10-16 10:43:43, provided by B-GOOD Bee Health Data Portal
  2. File 1-meta-data-compiled-08092022.xlsx accessed on 2024-10-16 10:43:43, provided by B-GOOD Bee Health Data Portal
  3. File 2-inspection-data-compiled-16032022.xlsx accessed on 2024-10-16 10:43:43, provided by B-GOOD Bee Health Data Portal

Data preparation

All files in the zip-archives were extracted using File Explorer (Microsoft Corporation, version 22H2).

All sensor data files stored in the archive 3-sensor-data-2020-tier1-bgood-21092022.zip were merged using the Python script MergeCsv.py. This script collects all column names used across a list of selected files and writes the content of each file into the matching columns of a single output file.
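The script MergeCsv.py itself is not part of this record. The following is only a minimal sketch of the merging approach described above, assuming comma-separated, UTF-8 encoded input files with a header row; the function name merge_csv and its parameters are hypothetical and not taken from the original script.

```python
import csv


def merge_csv(input_paths, output_path):
    """Merge CSV files with differing columns into one output file.

    Hypothetical helper, not the original MergeCsv.py.
    """
    # First pass: collect every column name in order of first appearance.
    columns = []
    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle), [])
        for name in header:
            if name not in columns:
                columns.append(name)

    # Second pass: write each row under the matching column names.
    # Columns that a file does not contain are left empty for its rows.
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=columns, restval="")
        writer.writeheader()
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as handle:
                for row in csv.DictReader(handle):
                    writer.writerow(row)
    return columns
```

Reading the headers in a separate first pass means the full column set is known before any row is written, so files can be merged in any order.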

File 1-meta-data-compiled-08092022.xlsx was processed with MS Excel (Microsoft Corporation, version 2409). Date formats were changed to ISO 8601 format where necessary. The processed raw data files were exported as preparatory files from MS Excel in CSV format (UTF-8 encoding) and imported into Notepad++ (version 8.7), where missing values were substituted by {NULL} using regular expressions.
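The regular expressions used in Notepad++ and the original date formats are not documented here. As an illustration only, the two preparation steps could be sketched in Python as follows, assuming comma-delimited text with "\n" line endings and no quoted fields containing commas, and assuming a source date format of DD/MM/YYYY. The pattern, the function names and the date format are all assumptions.

```python
import re
from datetime import datetime

# Matches the empty position of a missing field: at the start of a line
# before a comma, between two commas, or after a trailing comma.
# Assumption: no quoted fields that contain commas.
EMPTY_FIELD = re.compile(r"^(?=,)|(?<=,)(?=,)|(?<=,)$", re.MULTILINE)


def fill_missing(csv_text, placeholder="{NULL}"):
    """Insert the placeholder into every empty field of the CSV text."""
    return EMPTY_FIELD.sub(placeholder, csv_text)


def to_iso_date(value, source_format="%d/%m/%Y"):
    """Convert a date string to ISO 8601 (YYYY-MM-DD).

    The default source format is an assumption for illustration.
    """
    return datetime.strptime(value, source_format).date().isoformat()
```

For files with quoted fields, parsing with a CSV reader and replacing empty values per field would be more robust than a line-based pattern.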

File 2-inspection-data-compiled-16032022.xlsx contains 8 worksheets, each with a different number of columns (BE: 305; FR: 192; DE: 404; PT, RO: 243; CH: 267; GB: 374; NL: 451). Many of these columns seem to be nested. Files with this kind of data structure require a major transformation before they can be ingested into the EUPH data model. Because the metadata required for such a transformation has not been provided, this file was not processed. Instead, the raw data file was uploaded as an attached file.

Data validation

No data validation was performed.

Data analysis

No data analysis was performed.

References

  1. Dooremalen C. 2024 WP1 Tier 1 data 2020 B-GOOD. B-GOOD Bee Health Data Portal. [2024-10-17] beehealthdata.org
Issues
Unresolved quality issues for hive sensors
  1. For some columns it is possible to guess the meaning based on the context (e.g. columns Battery voltage (V) (bv), Weight (kg) (weightkg), T 0 (°C) (t0), T 1 (°C) (t1)). However, for most columns the meaning is unclear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data. Alternatively, all sensor outputs that are not considered necessary could be removed.
  2. It is unclear in which unit the received signal strength is given: dBm or RSSI.
  3. There are weight and temperature measurements that are unrealistically low or high. The data should be checked for outliers. If the issue can be resolved, it should be resolved. If it cannot be resolved, it should be documented in the dataset report and mentioned in the description of the table.
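A simple range check is one way to screen the sensor table for such values. The sketch below flags readings outside a plausible range, using the column names weightkg and t0 mentioned above. The limits are illustrative assumptions, not thresholds defined by the B-GOOD project or the EU Pollinator Hub, and the function name is hypothetical.

```python
# Assumed plausibility limits per column (illustrative only).
PLAUSIBLE_RANGES = {
    "weightkg": (0.0, 150.0),  # hive weight in kg
    "t0": (-40.0, 60.0),       # temperature in degrees Celsius
}


def flag_implausible(rows, ranges=PLAUSIBLE_RANGES):
    """Return (row index, column, raw value) for each suspicious reading."""
    flagged = []
    for index, row in enumerate(rows):
        for column, (low, high) in ranges.items():
            raw = row.get(column)
            # Missing values are skipped, not flagged.
            if raw in (None, "", "{NULL}"):
                continue
            try:
                value = float(raw)
            except ValueError:
                # Non-numeric content is flagged as well.
                flagged.append((index, column, raw))
                continue
            if not low <= value <= high:
                flagged.append((index, column, raw))
    return flagged
```

Flagged rows are only reported, not removed, so that each case can be resolved or documented as the issue above recommends.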
Unresolved quality issues for metadata main table
  1. For many columns it is possible to guess the meaning based on the context. However, for some columns (Data size (timestamps), Data length (days), Data size imported (timestamps), Data length imported (days), LoRa transmission ratio (x interval), Data completeness (%), filter (startyear)) the meaning is unclear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
  2. In column ID decimal numbers are used as record identifiers. If there are no well-founded reasons to do so it is recommended to use integers, a common practice in relational databases. Otherwise, there is a risk of an error when the data is reused.
  3. In column B-Good mini apiary the attributes "Switserland" and "GB" are used. If there are no well-founded reasons to do so it is recommended to use the English name of these countries (Switzerland, Great Britain), like for the majority of the attributes.
  4. In column Hive_id multiple values have been used, separated by a {+} sign. In relational databases this practice is possible but requires additional operations and may lead to errors. If there are no well-founded reasons to do so, it is recommended to save multiple versions of such records with the respective identifiers.
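The recommendation for the Hive_id column can be illustrated with a short sketch that expands one record holding several "+"-separated identifiers into one record per identifier. The function name and the example record are hypothetical; only the column name Hive_id and the separator come from the issue described above.

```python
def split_hive_ids(record, column="Hive_id", separator="+"):
    """Return one copy of the record per identifier in the given column."""
    raw = record.get(column) or ""
    parts = [part.strip() for part in raw.split(separator) if part.strip()]
    if not parts:
        # Nothing to split: return the record unchanged (as a copy).
        return [dict(record)]
    return [{**record, column: part} for part in parts]


# Example with a made-up record:
expanded = split_hive_ids({"ID": "12", "Hive_id": "101+102"})
# expanded holds two records, one with Hive_id "101" and one with "102".
```

After such a split, the values in column ID are no longer unique, so a new record identifier would have to be assigned.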
Unresolved quality issues for apiaries
  1. Although the meaning of many columns can be guessed from context, the meaning of several others (in particular of the columns Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
  2. In columns Address, Postal code and City question marks are used. If there are no well-founded reasons to do so it is recommended to either use a more meaningful attribute (e.g. 'unknown') or to declare it as missing value by assigning the value NULL or a blank.
Unresolved quality issues for hives
  1. Although the meaning of many columns can be guessed from context, the meaning of several others (in particular of the columns Color (HEX), Queen fertilized, Brood layers, Honey layers, Frames total, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sample codes
  1. Although the meaning of many columns can be guessed from context, the meaning of several others (in particular of the columns Test type, Test date, Test result, Test lab name, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor devices
  1. Although the meaning of many columns can be guessed from context, the meaning of several others (in particular of the columns boot_count, measurement_transmission_ratio, ble_pin, next_downlink_message, last_downlink_result, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Unresolved quality issues for sensor definitions
  1. Although the meaning of many columns can be guessed from context, the meaning of several others (in particular of the columns offset, multiplier, input_measurement, output_measurement, Created at, Deleted at) is not entirely clear. Since it was decided to include this data in the dataset, it should be explained in order to allow reuse of the data.
Properties

Unique identifier

[BGDWP181.0.0]

EUPH IRI

https://app.pollinatorhub.eu/dataset-discovery/BGDWP181.0.0

Status

Quality Validated

Peer review

No peer review was requested.

DOI

No DOI available.

Published

2025-03-17

Access rights

Open

Keywords

B-GOOD Mini Apiaries, colonies, hive sensor, honey bee, survival

Data collection years

2020

Regions the data was collected in

Belgique/België, Deutschland, France, Nederland, Portugal, România, Schweiz/Suisse/Svizzera, United Kingdom
Citation
B-GOOD Bee Health Data Portal 2025 Dataset from the B-GOOD project, containing data from Tier 1 studies performed in Work Package 1 in 2020. EU Pollinator Hub. [2025-04-25] app.pollinatorhub.eu
Contact
No public contact details available.
Dataset rating
No ratings available yet.
Metrics

Total views

69

Total downloads

27