Dataset: B-GOOD Health Monitoring
Dataset tables
Description | Data Points | Downloads | |
---|---|---|---|
Supplemental Files
Any supplemental files, not containing data.
Description | File Details |
---|---|
Dataset Report | This file contains in detail the structure of the dataset. This is a generated file. |
Licence | This file contains dataset licencing information. This is a generated file. |
Readme | The file contains basic information about the dataset. This is a generated file. |
The dataset contains results from the analysis of the following pathogens in honey bee samples collected in spring, summer and autumn of 2020, 2021 and 2022 at different locations: Deformed Wing Virus A (DWV-A), Deformed Wing Virus B (DWV-B), Acute Bee Paralysis Virus (ABPV), Chronic Bee Paralysis Virus (CBPV), Black Queen Cell Virus (BQCV), Sacbrood Virus (SBV), Paenibacillus larvae, Melissococcus plutonius, Nosema apis, Nosema ceranae, Malpighamoeba mellificae and Varroa destructor. It was published by Schäfer MO (FLI) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.
Discussion
- Overall, the B-GOOD datasets ingested into this EUPH dataset lack sufficient metadata, and a range of other issues limits compliance with the FAIR principles.
- In general, columns are not sufficiently well described. For example, it is unclear which information is contained in columns MalpighamoebaCT and Malpighamoeba_Notes in relation to column Malpighamoeba. It is also unclear whether the definitions provided for the attributes L, M and H, given as number of genome copies per bee and used to define the dilution of DNA plasmids, refer only to viral pathogens or to all pathogens. The provider should supply all information necessary to allow reuse of the data within the dataset.
- For some columns no units are provided (e.g. AFBcfu); for other columns the unit in which the data are expressed is not explicitly stated and can only be inferred by exclusion. The provider should explicitly state the units in columns containing data in order to avoid misunderstandings.
- Some of the attributes used in the dataset are not explained (e.g. ND). The provider should define the meaning of all attributes used in the dataset.
- Data comes in Microsoft Excel files, which occasionally contain nested comments or uncommented annotations (e.g. different background colour of cells) in single cells, which makes storage in relational databases difficult and automated processing and analysis impossible.
- The table structure does not facilitate data standardisation, as standardisation would require all values measured with the same method to be stored in one single column and transformed to the same unit.
- The significance of the string {ND} is unclear:
  - In columns DWV_Cat, EFB_Cat and AFB_Cat (22 records), where column dataset = {B-GOOD Pilot B results 2020 for WR V3} and SampleID = {BKCSYJMR; JKSXXKSC; LSPSHGZL; MDHLMFYT; NMYRPHNJ; YUXRGDZR; ANYXAYUZ; GNBMNLPM; HFXGDUAU; YPCTFUUU; LJSUAPFC; ZCUPCPFF; KNBULSMH; DUSFMRXB; JTYTGYDP; HXNGCDSH; KSNLKPRT; DRAZTCGC; LYZHDDGB; DXSPRLUC; CYPCTUGX; MJUYMLSH};
  - In columns VarroaBees and Varroa_Notes (7 records), where column dataset = {B-GOOD Pilot A results 2020 for BEEP V2} and SampleID = {CKTSXSMR; CYXUCRUN; FBYMAGCT; HCHPHKFL; RRBYHJBU; STZPSJHR; UAKMCLMN};
  - In column NosemaSpores_Notes (252 records).
- The significance of the special character {-} in column AFBcfu, where dataset = {B-GOOD Pilot A results 2020 for BEEP V2 } and SampleID = {UTYBTKUM}, is unclear. Column AFBcfu_Notes was replaced by {unknown} until the issue is resolved.
- In column SampleID the values {GB_1; GB2; GB_3} are not unique. Each of them exists twice.
- The significance of the string {ND} is unclear:
  - In column EFB_Cat (3 records), where column dataset = {Tier2 Field study A results 2022 for BEEP} and SampleID = {CDYTBDHK; DTUDJNAG; RYAYUTUG};
  - In column Varroa_Notes, where column dataset = {Varroa_Tier2 Field study results 2021 for WR} and SampleID = {ZUXHUFZP, BXCTGZBN CFO};
  - In columns AFBcfu_Notes and NosemaSpores_Notes.
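
Checks of the kind listed above (undefined placeholder strings such as {ND} or {-}, and non-unique SampleID values) can be automated once the tables are exported from Excel. The following is a minimal sketch in Python using only the standard library. It assumes each table has been loaded as a list of dictionaries keyed by column name; the function name `audit_rows`, the placeholder set and the example rows are hypothetical and are not part of the dataset or the portal.

```python
from collections import Counter, defaultdict

# Assumption: only these two strings are treated as undefined placeholders.
PLACEHOLDERS = {"ND", "-"}


def audit_rows(rows, id_column="SampleID"):
    """Return (placeholder_hits, duplicate_ids) for a list of row dicts.

    placeholder_hits maps (column, placeholder) to the list of sample IDs
    in which that placeholder occurs; duplicate_ids maps each sample ID
    that occurs more than once to its number of occurrences.
    """
    placeholder_hits = defaultdict(list)
    id_counts = Counter()

    for row in rows:
        sample_id = row.get(id_column)
        id_counts[sample_id] += 1
        for column, value in row.items():
            # Compare the stripped cell text so that " ND " is also caught.
            text = str(value).strip()
            if text in PLACEHOLDERS:
                placeholder_hits[(column, text)].append(sample_id)

    duplicate_ids = {sid: n for sid, n in id_counts.items() if n > 1}
    return dict(placeholder_hits), duplicate_ids


# Hypothetical rows for illustration only; not taken from the dataset.
example_rows = [
    {"SampleID": "S1", "EFB_Cat": "ND", "AFBcfu": "0"},
    {"SampleID": "S1", "EFB_Cat": "L", "AFBcfu": "-"},
    {"SampleID": "S2", "EFB_Cat": "ND", "AFBcfu": "12"},
]

hits, duplicates = audit_rows(example_rows)
for (column, placeholder), sample_ids in hits.items():
    print(f"{column}: {{{placeholder}}} in {len(sample_ids)} record(s): {sample_ids}")
print("Non-unique SampleID values:", duplicates)
```

Such a report only locates the affected records. What {ND} or {-} means in each column still has to be stated by the data provider.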