Dataset: B-GOOD Virus Sequences

Dataset from the B-GOOD project, containing GenBank accession numbers for virus sequences.
Published: 2025-03-17
Compliance with FAIR* principles
Findable
Accessible
Interoperable
Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles
Data Quality
Good
Data

Supplemental Files

Any supplemental files, not containing data.

Columns

File Name | Description | File Details
Dataset Report | This file describes the structure of the dataset in detail. | This is a generated file.
Licence | This file contains dataset licensing information. | This is a generated file.
Readme | This file contains basic information about the dataset. | This is a generated file.
About

Abstract

Dataset containing the GenBank accession numbers for virus sequences used to quantify various viruses (ABPV, CBPV, DWV) in honey bee samples. It was published by Bonjour-Dalmon A (INRAE) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Executive summary

Data overview

The data was published by Bonjour-Dalmon A (INRAE) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. It contains the GenBank accession numbers for virus sequences used to quantify various viruses (ABPV, CBPV, DWV) in honey bee samples.
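As an illustration of how an accession number can be used, the minimal sketch below builds a request URL for the NCBI E-utilities efetch endpoint, which returns a nucleotide record in FASTA format. The accession XX000000.1 is a made-up placeholder, not a value from this dataset, and the helper name efetch_fasta_url is hypothetical. Which column of the dataset holds the accession number is not documented, so that choice is left to the reader.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build the efetch URL that returns one nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # GenBank accession number
        "rettype": "fasta",   # sequence in FASTA format
        "retmode": "text",    # plain-text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Placeholder accession for illustration only; replace with a value from the dataset.
url = efetch_fasta_url("XX000000.1")
```

The resulting URL can be opened in a browser or passed to any HTTP client. Only the URL is built here, so the sketch runs without network access.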

Data value

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the Health Status Index (HSI); (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description

n/a

Data application

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not fully comply with the FAIR Guiding Principles for scientific data management and stewardship applied on the EU Pollinator Hub. More descriptive information about the context, quality and condition, or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate explanations of all variable names must be provided.

Unresolved issues

n/a

Introduction

n/a

Material and methods

Data acquisition

All raw data files were downloaded from the B-GOOD Bee Health Data Portal on 2024-09-26 18:16:30.

List of raw data obtained from the data provider.

  1. File Accession_Numbers-GENBANK.xlsx, accessed on 2024-09-26 18:16:30, provided by B-GOOD Bee Health Data Portal

Metadata was obtained from the dataset's web page.

Data preparation

The file was extracted from the zip archive using File Explorer (Microsoft Corporation, version 22H2).

The file Accession_Numbers-GENBANK.xlsx was opened with MS Excel (Microsoft Corporation, version 2409). The worksheets were exported to data files in CSV format (UTF-8 encoding) and imported into Notepad++ (version 8.7), where missing values were substituted by {NULL} using regular expressions. Dates were parsed to the required YYYY-MM-DD format using the Python script ParseDates.py.
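The two cleaning steps described above (substituting {NULL} for missing values and normalising dates to YYYY-MM-DD) could be sketched in a single Python pass as follows. This is not the ParseDates.py script, whose contents are not part of this record. The function names, the list of accepted input date formats, and the choice of date column are all assumptions made for illustration.

```python
import csv
import io
import re
from datetime import datetime

NULL_TOKEN = "{NULL}"
# Assumed input layouts; the date formats in the real worksheets are not documented.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def fill_missing(cell):
    """Return NULL_TOKEN for empty or whitespace-only cells, else the cell unchanged."""
    return NULL_TOKEN if re.fullmatch(r"\s*", cell) else cell


def to_iso_date(cell):
    """Convert a recognised date to YYYY-MM-DD; leave unrecognised values unchanged."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cell.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return cell


def prepare_csv(text, date_columns):
    """Apply both cleaning steps to CSV text and return the cleaned CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    # Positions of the columns that should be treated as dates.
    date_idx = {i for i, name in enumerate(header) if name in date_columns}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in body:
        cleaned = []
        for i, cell in enumerate(row):
            cell = fill_missing(cell)
            if i in date_idx and cell != NULL_TOKEN:
                cell = to_iso_date(cell)
            cleaned.append(cell)
        writer.writerow(cleaned)
    return out.getvalue()
```

For example, calling prepare_csv on CSV text with the header Sequence_ID,Colection_date,Country and passing {"Colection_date"} as date_columns would rewrite a cell such as 26/09/2021 as 2021-09-26 and replace every empty cell with {NULL}. Values that match none of the assumed formats are passed through unchanged so that they can be reviewed by hand.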

Data was then exported to the respective preparatory files and uploaded to the EU Pollinator Hub according to SOP-017 (Dataset integration).

Data validation

No data validation was performed.

Data analysis

No data analysis was performed.

References

  1. Bonjour-Dalmon A. 2023. Genbank Accession numbers for virus sequences. B-GOOD Bee Health Data Portal. [2024-11-2] beehealthdata.org
Issues
Unresolved quality issues for ABPV
  1. The meaning of column Genbank submission ID can be guessed but is not explicitly stated. In particular, the pathway through which the sequence was submitted should be specified. The data provider is requested to make this information available.
  2. The meaning of column Publication date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  3. It is unclear what column Sequence_ID describes. The data provider is requested to make this information available.
  4. The meaning of column Sequence region can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  5. The meaning of column Size can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  6. The meaning of column Isolate can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  7. The meaning of column Host can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  8. The meaning of column Country can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  9. The meaning of column Colection_date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
Unresolved quality issues for CBPV
  1. The meaning of column Genbank submission ID can be guessed but is not explicitly stated. In particular, the pathway through which the sequence was submitted should be specified. The data provider is requested to make this information available.
  2. The meaning of column Publication date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  3. It is unclear what column Sequence_ID describes. The data provider is requested to make this information available.
  4. The meaning of column Sequence region can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  5. The meaning of column Size can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  6. The meaning of column Isolate can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  7. The meaning of column Host can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  8. The meaning of column Country can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  9. The meaning of column Colection_date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
Unresolved quality issues for DWV
  1. The meaning of column Genbank submission ID can be guessed but is not explicitly stated. In particular, the pathway through which the sequence was submitted should be specified. The data provider is requested to make this information available.
  2. The meaning of column Publication date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  3. It is unclear what column Sequence_ID describes. The data provider is requested to make this information available.
  4. The meaning of column Sequence region can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  5. The meaning of column Size can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  6. The meaning of column Isolate can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  7. The meaning of column Host can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  8. The meaning of column Country can be guessed but is not explicitly stated. The data provider is requested to make this information available.
  9. The meaning of column Colection_date can be guessed but is not explicitly stated. The data provider is requested to make this information available.
Properties

Unique identifier

[BGDVR196.0.0]

EUPH IRI

https://app.pollinatorhub.eu/dataset-discovery/BGDVR196.0.0

Status

Quality Validated

Peer review

No peer review was requested.

DOI

No DOI available.

Published

2025-03-17

Access rights

Open

Keywords

ABPV, BQCV, CBPV, DWV, SBV

Regions the data was collected in

Belgique/België, Deutschland, France, Nederland, Portugal, România, Schweiz/Suisse/Svizzera, United Kingdom
Citation
B-GOOD Bee Health Data Portal. Dataset from the B-GOOD project, containing GenBank accession numbers for virus sequences. EU Pollinator Hub. [2026-02-24] app.pollinatorhub.eu
Contact
No public contact details available.
Metrics

Total views

359

Total downloads

26