BEELIFE EUROPEAN
BEEKEEPING COORDINATION

Avenue Louise 209/7, 1050 Brussels, Belgium
info@pollinatorhub.eu • www.pollinatorhub.eu • +32 (0) 486 973 920

Dataset Report

UID:	BGDHL178.0.0
Name:	B-GOOD Health Monitoring
Title:	Dataset from the B-GOOD project, containing data from studies on diseases detected in honey bee colonies.
Status:	Approved
Version:	v. 1.0
Date:	2024-11-21
Author:	Rubinigg Michael
Citation proposal:	Rubinigg M. 2024 Report of dataset B-GOOD Health Monitoring, v. 1.0 [BGDHL178.0.0]. EU Pollinator Hub. [2025-03-29] app.pollinatorhub.eu

Compliance with FAIR* principles
Findable	Accessible	Interoperable	Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles

Data Quality

Requires major revision

This document is intended for use by collaborators of the EU Pollinator Hub and may be passed on with the express permission of the leader of the consortium and for the purpose determined by the leader of the consortium.

Table of contents

Document History
1. Release
2. Revision
Abbreviations
Executive Summary
Introduction
Material and Methods
Data Description
1. Dataset
2. Tables
  1. Pilot study
  2. Field study
References
Annex 1: Table column reports

Document History

Release

Version v. 1.0 released on 2025-03-29. Written by Rubinigg Michael. Reviewed by Rubinigg Michael.

Revision

Table 1. List of revisions made to the document. Identifier of revision (No); date of revision (Date); description of revision (Description); reason for revision (Reason).

No	Date	Description	Reason
1	2025-03-29 11:03:29	Initial release.	N.A.

Abbreviations

ABPV

Acute Bee Paralysis Virus

AFB

American Foulbrood

BQCV

Black Queen Cell Virus

CBPV

Chronic Bee Paralysis Virus

CFU

Colony-Forming Units

Ct

Cycle Threshold

DWV A

Deformed Wing Virus A

DWV B

Deformed Wing Virus B

EFB

European Foulbrood

EU

European Union

EUPH

EU Pollinator Hub

FAIR

Findability, Accessibility, Interoperability, and Reuse of digital assets

FLI

Friedrich Loeffler Institut – Bundesforschungsinstitut für Tiergesundheit

SBV

Sackbrood Virus

Executive Summary

Data overview:

The data was published by Schäfer MO (FLI) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. It contains results from the analysis of pathogens (Deformed Wing Virus A, DWV A; Deformed Wing Virus B, DWV B; Acute Bee Paralysis Virus, ABPV; Chronic Bee Paralysis Virus, CBPV; Black Queen Cell Virus, BQCV; Sackbrood Virus, SBV; Paenibacillus larvae; Melissococcus plutonius; Nosema apis; Nosema ceranae; Malpighamoeba mellificae; Varroa destructor) in bee samples collected in spring, summer and autumn of 2020, 2021 and 2022 in different locations.

Data value:

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the HSI; (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description:

The dataset contains two tables. One contains the results from the pilot studies (769 records, 233.42 kB), the other the results from the field studies (1029 records, 333.43 kB).

Data application:

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not comply with the FAIR Guiding Principles for scientific data management and stewardship as applied on the EU Pollinator Hub. More descriptive information about the context, quality, condition or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate and relevant attributes must also be provided, e.g. a description of the scope of the data, any particularities or limitations that other users should be aware of, the date of generation or collection of the data, the lab conditions, who prepared the data, the parameter settings, the name and version of the software used, whether the data is raw or processed, and an explanation of all variable names that are not self-explanatory. The dataset requires major revisions by the data provider.

Introduction

The data was published by Schäfer MO (FLI) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. The dataset contains results from the analysis of pathogens (Deformed Wing Virus A, DWV A; Deformed Wing Virus B, DWV B; Acute Bee Paralysis Virus, ABPV; Chronic Bee Paralysis Virus, CBPV; Black Queen Cell Virus, BQCV; Sackbrood Virus, SBV; Paenibacillus larvae; Melissococcus plutonius; Nosema apis; Nosema ceranae; Malpighamoeba mellificae; Varroa destructor) in bee samples collected in spring, summer and autumn of 2020, 2021 and 2022 in different locations.

Material and Methods

Data Acquisition

All raw data files were downloaded from the B-GOOD Bee Health Data Portal on 2024-07-04.

List of raw data and metadata files obtained from the data provider.

File b-good-pilot-a-results-2020-for-beep-v2.xlsx accessed on 2024-10-01 14:19:29, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-inrae-v2.xlsx accessed on 2024-10-01 14:21:11, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-mlu-v2.xlsx accessed on 2024-10-01 14:21:31, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-tntu-v2.xlsx accessed on 2024-10-01 14:22:01, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-ubern-v2.xlsx accessed on 2024-10-01 14:26:57, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-ucluj-v2.xlsx accessed on 2024-10-01 14:27:11, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-ucoi-v2.xlsx accessed on 2024-10-01 14:27:55, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-ugent-v2.xlsx accessed on 2024-10-01 14:28:07, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2020-for-wr-v2.xlsx accessed on 2024-10-01 14:28:19, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-b-results-2020-for-wr-v3.xlsx accessed on 2024-10-01 14:28:33, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-beep-v3.xlsx accessed on 2024-10-01 17:47:21, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-inrae-v2.xlsx accessed on 2024-10-01 17:47:36, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-mlu-v2.xlsx accessed on 2024-10-01 17:47:50, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-tntu-v2.xlsx accessed on 2024-10-01 17:48:04, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-ubern-v2.xlsx accessed on 2024-10-01 17:48:22, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-ucluj-v2.xlsx accessed on 2024-10-01 17:48:34, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-ucoi-v3.xlsx accessed on 2024-10-01 17:48:51, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2021-for-ugent-v2.xlsx accessed on 2024-10-01 17:49:05, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier1-results-2021-for-wr-v2.xlsx accessed on 2024-10-01 17:49:18, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-b-results-2021-for-wr-v2.xlsx accessed on 2024-10-01 17:49:36, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-beep.xlsx accessed on 2024-10-01 18:16:24, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-inrae.xlsx accessed on 2024-10-01 18:16:44, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-mlu.xlsx accessed on 2024-10-01 18:16:57, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-tntu.xlsx accessed on 2024-10-01 18:17:10, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-ubern.xlsx accessed on 2024-10-01 18:17:22, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-ucluj.xlsx accessed on 2024-10-01 18:17:34, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-ucoi.xlsx accessed on 2024-10-01 18:17:46, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-1-results-2022-for-ugent.xlsx accessed on 2024-10-01 18:18:09, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-a-results-2022-for-wr.xlsx accessed on 2024-10-01 18:18:24, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-pilot-b-results-2022-for-wr.xlsx accessed on 2024-10-01 18:18:41, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-beep-v3.xlsx accessed on 2024-10-01 18:27:41, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-bsour-v2.xlsx accessed on 2024-10-01 18:27:55, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-mlu-v2.xlsx accessed on 2024-10-01 18:28:08, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-sml-v2.xlsx accessed on 2024-10-01 18:28:22, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-ubern-v3.xlsx accessed on 2024-10-01 18:28:34, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2021-for-wr-v2.xlsx accessed on 2024-10-01 18:28:47, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-beep-v2.xlsx accessed on 2024-10-01 18:22:14, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-bsour.xlsx accessed on 2024-10-01 18:22:35, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-mlu.xlsx accessed on 2024-10-01 18:22:47, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-sml-v2.xlsx accessed on 2024-10-01 18:23:00, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-ubern.xlsx accessed on 2024-10-01 18:23:15, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-2-results-2022-for-wr.xlsx accessed on 2024-10-01 18:23:29, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal
File b-good-tier-3-results-2022-for-beep.xlsx accessed on 2024-10-01 18:24:20, provided by Schäfer, MO (FLI) at B-GOOD Bee Health Data Portal

Metadata was obtained from the web pages of the single datasets (see above).

Table 2. List of raw data and metadata files included in the dataset. Identifier of table row (No); name of the file (File); type of the file (Type); file contains data (D); file contains metadata (M); date of upload of the file to the EU Pollinator Hub (Arrival); number of data points contained within the file, if applicable (Data points); size of the uploaded file (File size).

No	File	Type	D	M	Arrival	Data points	File size
1	b-good-pilot-a-results-2020-for-beep-v2.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:29	n/a	41.16 KiB
2	b-good-pilot-b-results-2020-for-wr-v3.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:49	n/a	20.76 KiB
3	b-good-tier-1-results-2021-for-beep-v3.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:25	n/a	42.36 KiB
4	b-good-pilot-b-results-2021-for-wr-v2.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:57	n/a	23.09 KiB
5	b-good-tier-1-results-2022-for-beep.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:23	n/a	48.54 KiB
6	b-good-pilot-b-results-2022-for-wr.xlsx	Miscellaneous	No	Yes	2024-10-07 16:10:10	n/a	24.24 KiB
7	b-good disease monitoring_pilot_PREP_MR_241007.csv	CSV - Comma separated values	Yes	No	2024-10-10 16:10:29	36,912	233.42 KiB
8	b-good disease monitoring_field_PREP_MR_241007.csv	CSV - Comma separated values	Yes	No	2024-10-08 09:10:10	49,392	333.43 KiB
9	b-good-tier-2-results-2021-for-beep-v3.xlsx	Miscellaneous	No	Yes	2024-10-08 09:10:17	n/a	59.33 KiB
10	b-good-tier-2-results-2021-for-bsour-v2.xlsx	Miscellaneous	No	Yes	2024-10-08 09:10:33	n/a	22.92 KiB
11	b-good-tier-2-results-2021-for-mlu-v2.xlsx	Miscellaneous	No	Yes	2024-10-08 09:10:42	n/a	20.76 KiB
12	b-good-tier-2-results-2021-for-sml-v2.xlsx	Miscellaneous	No	Yes	2024-10-08 10:10:12	n/a	19.65 KiB
13	b-good-tier-2-results-2021-for-ubern-v3.xlsx	Miscellaneous	No	Yes	2024-10-08 10:10:20	n/a	27.10 KiB
14	b-good-tier-2-results-2021-for-wr-v2.xlsx	Miscellaneous	No	Yes	2024-10-08 10:10:14	n/a	19.56 KiB
15	b-good-tier-2-results-2022-for-beep-v2.xlsx	Miscellaneous	No	Yes	2024-10-08 10:10:11	n/a	80.10 KiB
16	b-good-tier-3-results-2022-for-beep.xlsx	Miscellaneous	No	Yes	2024-10-08 10:10:49	n/a	102.84 KiB

Data Preparation

All files in the zip-archives were extracted using File Explorer (Microsoft Corporation, version 22H2). The data files were processed using MS Excel (Microsoft Corporation, version 2409).

For datasets B-GOOD Pilot A results 2020 for MLU V2, B-GOOD Pilot A results 2020 for TNTU V2, B-GOOD Pilot A results 2020 for UBERN V2, B-GOOD Pilot A results 2020 for UCLUJ V2, and B-GOOD Pilot B results 2020 for WR V3, two apparently identical raw data files (same name, size) were provided on the B-GOOD Bee Health Data Portal. These files were compared using the Get-FileHash command of the Microsoft.PowerShell.Utility module, which computes the hash value for each raw data file by using the SHA256 hash algorithm. Hash values were then compared with MS Excel using the EXACT function. All hash values were the same for each of the duplicate raw data files, therefore, one duplicate file per dataset was deleted.
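The duplicate check described above can also be reproduced outside PowerShell and MS Excel. The following is a minimal sketch in Python, not the procedure actually used for this report; the function names sha256_of_file and are_identical are illustrative assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def are_identical(first: Path, second: Path) -> bool:
    """Return True if both files yield the same SHA-256 digest."""
    return sha256_of_file(first) == sha256_of_file(second)
```

Calling are_identical on the two copies of a raw data file returns True when their contents match byte for byte, in which case one copy can be discarded.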

The following issues were found and resolved:

In column N. spores
- file b-good-pilot-a-results-2020-for-beep-v2.xlsx contains {<25000} while file b-good-pilot-a-results-2020-for-ucluj-v2.xlsx contains {0} for Sample ID {AHJTRZLZ}. {< 25000} has been used, as b-good-pilot-a-results-2020-for-beep-v2.xlsx is assumed to be the single source of truth.
- file b-good-pilot-a-results-2020-for-beep-v2.xlsx contains {<25000} while file b-good-pilot-a-results-2020-for-inrae-v2.xlsx contains {0} for Sample ID {MUDAXHCG}. {< 25000} has been used, as b-good-pilot-a-results-2020-for-beep-v2.xlsx is assumed to be the single source of truth.
- file b-good-pilot-a-results-2020-for-beep-v2.xlsx contains {<25000} while file b-good-pilot-a-results-2020-for-inrae-v2.xlsx contains {0} for Sample ID {RRMUZUBR}. {< 25000} has been used, as b-good-pilot-a-results-2020-for-beep-v2.xlsx is assumed to be the single source of truth.
- file b-good-pilot-a-results-2020-for-beep-v2.xlsx contains {<25000} while file b-good-pilot-a-results-2020-for-inrae-v2.xlsx contains {0} for Sample ID {SKLRARCR}. {< 25000} has been used, as b-good-pilot-a-results-2020-for-beep-v2.xlsx is assumed to be the single source of truth.
- file b-good-pilot-a-results-2020-for-beep-v2.xlsx contains {<25000} while file b-good-pilot-a-results-2020-for-inrae-v2.xlsx contains {0} for Sample ID {XLFGKPJZ}. {< 25000} has been used, as b-good-pilot-a-results-2020-for-beep-v2.xlsx is assumed to be the single source of truth.
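The rule applied to these conflicts (keep the value from the BEEP file, flag the disagreement) can be sketched as follows. This is an illustration only: the function name resolve_conflicts and the dictionary layout (Sample ID mapped to the value in column N. spores) are assumptions, and only two of the Sample IDs listed above are shown.

```python
def resolve_conflicts(
    reference: dict[str, str], partner: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Keep the reference value for each Sample ID and list conflicting IDs."""
    conflicts = sorted(
        sample_id
        for sample_id, value in partner.items()
        if sample_id in reference and reference[sample_id] != value
    )
    # The reference file is treated as the single source of truth,
    # so its values are returned unchanged.
    return dict(reference), conflicts


# Values as reported above for column N. spores.
beep = {"MUDAXHCG": "<25000", "RRMUZUBR": "<25000"}
inrae = {"MUDAXHCG": "0", "RRMUZUBR": "0"}
resolved, conflicts = resolve_conflicts(beep, inrae)
```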

The following issues were found and resolved:

Records with sample ID {XMMDKSCT; RGYMGYTM; GFGTGBKY} contained in file b-good-tier-1-results-2022-for-ucluj.xlsx (3 records), but not in file b-good-tier-1-results-2022-for-beep.xlsx, were not integrated into the dataset, as file b-good-tier-1-results-2022-for-beep.xlsx is assumed to be the single source of truth.
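Excluding records that are missing from the reference file amounts to comparing the sets of sample IDs. A minimal sketch, assuming the sample IDs of each file are available as lists; the function name split_by_reference and the IDs AAAAAAAA and BBBBBBBB are hypothetical and serve only to make the example complete.

```python
def split_by_reference(
    partner_ids: list[str], reference_ids: list[str]
) -> tuple[list[str], list[str]]:
    """Split partner sample IDs into those found in the reference file and those not."""
    reference = set(reference_ids)
    kept = [sample_id for sample_id in partner_ids if sample_id in reference]
    excluded = [sample_id for sample_id in partner_ids if sample_id not in reference]
    return kept, excluded


# AAAAAAAA and BBBBBBBB are hypothetical IDs; the other three are from the report.
partner_ids = ["AAAAAAAA", "XMMDKSCT", "RGYMGYTM", "GFGTGBKY"]
reference_ids = ["AAAAAAAA", "BBBBBBBB"]
kept, excluded = split_by_reference(partner_ids, reference_ids)
```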

Tier 2 Field study 2021. Even though this is not explicitly stated in the data description, the file of dataset Tier2 Field study results 2021 for BEEP (b-good-tier-2-results-2021-for-beep-v3.xlsx) contains the data (311 records) from the following datasets, which were therefore not integrated in order to avoid duplicate records: Tier2 Field study results 2021 for BSOUR; Tier2 Field study results 2021 for MLU; Tier2 Field study results 2021 for SML; Tier2 Field study results 2021 for UBERN; Tier2 Field study results 2021 for WR.

Tier 2 Field study A 2022. Even though this is not explicitly stated in the data description, the file of dataset Tier2 Field study A results 2022 for BEEP (b-good-tier-2-results-2022-for-beep-v2.xlsx) contains the data (300 records) from the following datasets, which were therefore not integrated in order to avoid duplicate records: Tier2 Field study A results 2022 for BSOUR; Tier2 Field study A results 2022 for MLU; Tier 2 Field study A results 2022 for SML; Tier 2 Field study A results 2022 for UBERN; Tier 2 Field study A results 2022 for WR.

Data from all pilot studies and data from all field studies was transferred into one single table each (pilot studies, field studies) in order to facilitate automated evaluation and visualisation of the data. Six columns were added to the beginning and one column (RecordNotes) to the end of each table. These columns contain the following metadata, as provided in the metadata section of each dataset on the B-GOOD Bee Health Data Portal:

name of the dataset (column dataset)
tier level of the study (column StudyTierLevel)
name of the study (column StudyName)
year of the study (column year)
name of the organisation that provided that data file (column organisation)
attribute contained in some data files, starting with the letter V followed by a number (either 2 or 3) presumably indicating the version number (column V)
pool of comments added to fields in individual records (RecordNotes)
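The column layout described above can be illustrated with a short Python sketch. The data was actually prepared in MS Excel; the function name add_metadata is an assumption, and the values used for StudyName, V and season are hypothetical placeholders.

```python
# The six metadata columns placed at the beginning of each table.
LEADING = ["dataset", "StudyTierLevel", "StudyName", "year", "organisation", "V"]


def add_metadata(record: dict, metadata: dict, notes: str = "") -> dict:
    """Return a new record with the metadata columns first and RecordNotes last."""
    combined = {name: metadata[name] for name in LEADING}
    combined.update(record)           # original columns keep their order
    combined["RecordNotes"] = notes   # appended as the last column
    return combined


metadata = {
    "dataset": "B-GOOD Pilot A results 2020 for BEEP V2",
    "StudyTierLevel": "Tier 1",
    "StudyName": "Pilot A",   # hypothetical value
    "year": 2020,
    "organisation": "BEEP",
    "V": "V2",                # hypothetical value
}
record = {"SampleID": "AHJTRZLZ", "season": "spring"}  # season is hypothetical
row = add_metadata(record, metadata)
```

Python dictionaries preserve insertion order, so the keys of row follow the column order of the prepared table.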

Data was then exported to the respective preparatory files and uploaded to the EU Pollinator Hub according to SOP-017 (Dataset integration).

Data Validation

No data validation was performed.

Data Analysis

No data analysis was performed.

Data Description

Dataset

Table 3. Summary of tables belonging to the dataset. Table row identifier (No); name of the table (Table); description of the table (Description).

No	Table	Description
1	Pilot study	Data from the tier 1 pilot A and B studies performed in 2020, 2021 and 2022.
2	Field study	Data from the tier 2 and tier 3 field studies A and B performed in 2021 and 2022.

Table 4. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
UID	BGDHL178.0.0
Name	B-GOOD Health Monitoring
Title	Dataset from the B-GOOD project, containing data from studies on diseases detected in honey bee colonies.
IRI	https://app.pollinatorhub.eu/dataset-discovery/BGDHL178.0.0
Licence	CC BY-NC-ND 4.0
DOI	n/a
Creation date	2024-10-07
Publishing date	2025-03-17
Contact information	n/a
Keywords	ABPV, AFB, Acute Bee Paralysis Virus, American foulbrood, Apis mellifera, BQCV, Black Queen Cell Virus, CBPV, Chronic Bee Paralysis Virus, DWV A, DWV B, Deformed Wing Virus A, Deformed Wing Virus B, EFB, European foulbrood, Field study, SBV, Sackbrood Virus, honey bee
Data collection years	n/a
Regions, the data was collected in	n/a
Description	The dataset contains results from the analysis of the pathogens Deformed Wing Virus A (DWV-A), Deformed Wing Virus B (DWV-B); Acute Bee Paralysis Virus (ABPV), Chronic Bee Paralysis Virus (CBPV), Black Queen Cell Virus (BQCV), Sackbrood Virus (SBV), Paenibacillus larvae; Melissococcus plutonius; Nosema apis; Nosema ceranae; Malpighamoeba mellificae; Varroa destructor in honey bee samples collected in spring, summer and autumn of 2020, 2021 and 2022 in different locations. It was published by Schäfer MO (FLI) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Table 5. Standardised metadata of the data provider B-GOOD Bee Health Data Portal. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	B-GOOD Bee Health Data Portal
URL
Acronym	B-GOOD
IRI	https://app.pollinatorhub.eu/data-providers/b-good-bee-health-data-portal
Address	https://b-good-project.eu
Country	Belgium
Contact information	b-good-project.eu
Description	Project funded by the EU Horizon 2020 Research and Innovation Programme under grant agreement No 817622. Project website: https://b-good-project.eu

Tables

Pilot study

Table 6. Standardised metadata of the table. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
UID	BGDHL178.PLTST363.0
Name	Pilot study
IRI	https://app.pollinatorhub.eu/dataset-discovery/parts/BGDHL178.PLTST363.0
Type	File
Licence	CC BY-NC-ND 4.0
Description	Table Pilot study contains 769 records (233.42 kB), compiled from 6 files in datasets obtained from the B-GOOD Bee Health Data Portal on 2024-07-04: File b-good-pilot-a-results-2020-for-beep-v2.xlsx from dataset B-GOOD Pilot A results 2020 for BEEP V2; File b-good-pilot-b-results-2020-for-wr-v3.xlsx from dataset B-GOOD Pilot B results 2020 for WR V3; File b-good-tier-1-results-2021-for-beep-v3.xlsx from dataset Tier1 Pilot A results 2021 for BEEP; File b-good-pilot-b-results-2021-for-wr-v2.xlsx from dataset Tier1 Pilot B results 2021 for WR; File b-good-tier-1-results-2022-for-beep.xlsx from dataset Tier1 Pilot A results 2022 for BEEP; File b-good-pilot-b-results-2022-for-wr.xlsx from dataset Tier1 Pilot B results 2022 for WR

Table Pilot study contains 769 records (233.42 kB), compiled from 6 files in datasets obtained from the B-GOOD Bee Health Data Portal on 2024-07-04:

File b-good-pilot-a-results-2020-for-beep-v2.xlsx from dataset B-GOOD Pilot A results 2020 for BEEP V2
File b-good-pilot-b-results-2020-for-wr-v3.xlsx from dataset B-GOOD Pilot B results 2020 for WR V3
File b-good-tier-1-results-2021-for-beep-v3.xlsx from dataset Tier1 Pilot A results 2021 for BEEP
File b-good-pilot-b-results-2021-for-wr-v2.xlsx from dataset Tier1 Pilot B results 2021 for WR
File b-good-tier-1-results-2022-for-beep.xlsx from dataset Tier1 Pilot A results 2022 for BEEP
File b-good-pilot-b-results-2022-for-wr.xlsx from dataset Tier1 Pilot B results 2022 for WR

Metadata

n/a

Table 7. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Name	Description	Data type	Descriptor	Unit
dataset	Name of the dataset on the Bee Health Data Portal from which the data was obtained.	String	Text [0.0.TEXTA315]	n/a
StudyTierLevel	Tier level of the study.	String	Text [0.0.TEXTA315]	n/a
StudyName	Name of the study.	String	Text [0.0.TEXTA315]	n/a
year	Calendar year in which the data was acquired.	Integer number	year [0.0.YEARA340]	year
organisation	Not specified by the data provider. Organisation appearing in the name of the dataset.	String	Text [0.0.TEXTA315]	n/a
V	Not specified by the data provider. V number appearing in the name of the dataset.	String	Text [0.0.TEXTA315]	n/a
SampleID	Unique identifier of the sample.	String	materialSampleID [0.0.MTRLS489]	n/a
partner	Not specified by the data provider. Presumably the name of the consortium partner that provided the data.	String	Text [0.0.TEXTA315]	n/a
season	Not specified by the data provider. Presumably the season in which the sample was collected.	String	season [0.0.SSONA466]	n/a
DWVA	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus A (DWV A).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
DWVA_Cat	Attribute referring to column DWVA, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
DWVA_Notes	Annotations referring to column DWVA. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
DWVB	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus B (DWV-B).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
DWVB_Cat	Attribute referring to column DWVB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
DWVB_Notes	Annotations referring to column DWVB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
ABPV	The Cq-value (Ct value) for the infection load with the Acute Bee Paralysis Virus (ABPV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
ABPV_Cat	Attribute referring to column ABPV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
ABPV_Notes	Annotations referring to column ABPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
CBPV	The Cq-value (Ct value) for the infection load with the Chronic Bee Paralysis Virus (CBPV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
CBPV_Cat	Attribute referring to column CBPV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
CBPV_Notes	Annotations referring to column CBPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
BQCV	The Cq-value (Ct value) for the infection load with the Black Queen Cell Virus (BQCV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
BQCV_Cat	Attribute referring to column BQCV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
BQCV_Notes	Annotations referring to column BQCV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
SBV	The Cq-value (Ct value) for the infection load with the Sackbrood Virus (SBV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
SBV_Cat	Attribute referring to column SBV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
SBV_Notes	Annotations referring to column SBV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
EFB	The Cq-value (Ct value) for the infection load with the causative agent of European Foulbrood of honey bees (EFB), Melissococcus plutonius.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
EFB_Cat	Attribute referring to column EFB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the numbers of genome copies given for the different categories also apply to this column.	String	Text [0.0.TEXTA315]	n/a
EFB_Notes	Annotations referring to column EFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
AFB	The Cq-value (Ct value) for the infection load with the causative agent of American Foulbrood (AFB), Paenibacillus larvae.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
AFB_Cat	Attribute referring to column AFB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the numbers of genome copies given for the different categories also apply to this column.	String	Text [0.0.TEXTA315]	n/a
AFB_Notes	Annotations referring to column AFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaApis	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema apis.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
NosemaApis_Notes	Annotations referring to column NosemaApis. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaCeranae	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema ceranae.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
NosemaCeranae_Notes	Annotations referring to column NosemaCeranae. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
VarroaBees	Attribute referring to column VarroaInfestation: Y (Yes) if 100 bees or more were sampled; N (No) if fewer than 100 bees were sampled.	String	Text [0.0.TEXTA315]	n/a
VarroaInfestation	Varroa infestation, measured as Varroa infestation rate of adult bees.	Decimal number	varroaInfestationOfAdultBees [0.0.VRRNF468]	mites (100 bees)^-1
Varroa_Notes	Annotations referring to column VarroaInfestation.	String	Text [0.0.TEXTA315]	n/a
AFBcfu	Not specified by the data provider. Presumably the number of colony-forming units counted in microbiological assays used to detect the causative agent of American Foulbrood (AFB), Paenibacillus larvae.	Integer number	Integer [0.0.NTGER313]	n/a
AFBcfu_Notes	Annotations referring to column AFBcfu. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaSpores	Number of the causative agents of Nosemosis of honey bees (Nosema apis, Nosema ceranae), expressed in spores per animal.	Integer number	Integer [0.0.NTGER313]	spores animal-1
NosemaSpores_Notes	Annotations referring to column NosemaSpores. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file); <25000 (less than 25000 spores per animal).	String	Text [0.0.TEXTA315]	n/a
Malpighamoeba	The Cq-value (Ct value) for the infection load with the causative agent of amoeba disease of honey bees, Malpighamoeba mellificae.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
Malpighamoeba_CT	Not specified by the data provider.	String	Text [0.0.TEXTA315]	n/a
Malpighamoeba_Notes	Annotations referring to column Malpighamoeba.	String	Text [0.0.TEXTA315]	n/a
RecordNotes	Notes added by the data provider to specific records in the raw data.	String	Text [0.0.TEXTA315]	n/a

Metadata of individual tables can be found in Annex 1.
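
The unit of column VarroaInfestation, mites (100 bees)-1, and the flag in column VarroaBees can be illustrated with a minimal Python sketch. The formula (mites divided by bees sampled, times 100) is the usual reading of this unit and is an assumption here, as the data provider does not state how the values were calculated; the function names are hypothetical.

```python
def varroa_infestation(mites: int, bees: int) -> float:
    """Return the infestation rate in mites per 100 adult bees.

    Assumption: rate = mites / bees sampled * 100.
    """
    if bees <= 0:
        raise ValueError("bees must be a positive number")
    return mites / bees * 100


def varroa_bees_flag(bees: int) -> str:
    """Return 'Y' if 100 bees or more were sampled, otherwise 'N'."""
    return "Y" if bees >= 100 else "N"


# Example: 3 mites found in a sample of 150 bees.
rate = varroa_infestation(3, 150)
flag = varroa_bees_flag(150)
```

A rate based on fewer than 100 bees (flag N) rests on a small sample and may be less reliable, which is presumably why the provider recorded the flag.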

Descriptive Measures

Table 8. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
dataset	33 - 39	n/a	B-GOOD Pilot…	n/a	n/a	n/a	Tier1 Pilot…	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 0.8% )
StudyTierLevel	6 - 6	n/a	Tier 1	n/a	n/a	n/a	Tier 1	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )
StudyName	7 - 7	n/a	Pilot A	n/a	n/a	n/a	Pilot B	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
year	4 - 4	2,021.0	2,020	2,020	2,021	2,022	2,022	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
organisation	2 - 4	n/a	BEEP	n/a	n/a	n/a	WR	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
V	0 - 2	n/a	V2	n/a	n/a	n/a	V3	769	256 ( 33.3% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
SampleID	7 - 17	n/a	ABPLTDAX	n/a	n/a	n/a	no label	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	766 ( 99.6% )
partner	2 - 5	n/a	INRAE	n/a	n/a	n/a	WR	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	10 ( 1.3% )
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
DWVA	5 - 5	30.595	10.61	28.72	31.71	34.51	39.62	769	634 ( 82.4% )	0 ( 0.0% )	0 ( 0.0% )	132 ( 17.2% )
DWVA_Cat	1 - 2	n/a	H	n/a	n/a	n/a	ND	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )
DWVA_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	135 ( 17.6% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
DWVB	4 - 5	21.900	5.95	15.2325	23.335	27.98	37.94	769	21 ( 2.7% )	0 ( 0.0% )	0 ( 0.0% )	655 ( 85.2% )
DWVB_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
DWVB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	748 ( 97.3% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
ABPV	5 - 5	34.521	13.17	33.22	35.255	37.0275	40.00	769	433 ( 56.3% )	0 ( 0.0% )	0 ( 0.0% )	288 ( 37.5% )
ABPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	769	265 ( 34.5% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )
ABPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	336 ( 43.7% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
CBPV	5 - 5	35.681	15.26	34.185	36.26	38.1125	40.00	769	493 ( 64.1% )	0 ( 0.0% )	0 ( 0.0% )	212 ( 27.6% )
CBPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	769	265 ( 34.5% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )
CBPV_Notes	0 - 13	n/a	>40,00	n/a	n/a	n/a	not availabl…	769	276 ( 35.9% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
BQCV	4 - 5	22.912	8.67	21.015	23.49	25.425	36.18	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	588 ( 76.5% )
BQCV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	M	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
BQCV_Notes	0 - 0	n/a		n/a	n/a	n/a		769	769 ( 100.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )
SBV	1 - 5	24.334	0	21.07	26.09	28.96	37.92	769	2 ( 0.3% )	1 ( 0.1% )	0 ( 0.0% )	654 ( 85.0% )
SBV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
SBV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	767 ( 99.7% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
EFB	4 - 6	31.1580	19.15	25.1	33.675	35.3	37.17	769	746 ( 97.0% )	0 ( 0.0% )	0 ( 0.0% )	24 ( 3.1% )
EFB_Cat	0 - 2	n/a	H	n/a	n/a	n/a	ND	769	485 ( 63.1% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 0.8% )
EFB_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	23 ( 3.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
AFB	2 - 6	37.5437	33.94	36.78	37.37	38.49	40	769	738 ( 96.0% )	0 ( 0.0% )	0 ( 0.0% )	29 ( 3.8% )
AFB_Cat	0 - 2	n/a	L	n/a	n/a	n/a	ND	769	485 ( 63.1% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
AFB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	516 ( 67.1% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
NosemaApis	5 - 5	27.780	27.78	n/a	n/a	n/a	27.78	769	768 ( 99.9% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
NosemaApis_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	1 ( 0.1% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
NosemaCeranae	2 - 6	25.8414	14.26	21.725	24.4	31.025	36.45	769	515 ( 67.0% )	0 ( 0.0% )	0 ( 0.0% )	245 ( 31.9% )
NosemaCeranae_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	254 ( 33.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
VarroaBees	1 - 2	n/a	N	n/a	n/a	n/a	Y	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
VarroaInfestation	1 - 11	2.2693610600	0	0	0	1.87802759	176.9230769	769	7 ( 0.9% )	418 ( 54.4% )	0 ( 0.0% )	205 ( 26.7% )
Varroa_Notes	0 - 2	n/a	ND	n/a	n/a	n/a	ND	769	762 ( 99.1% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )
AFBcfu	1 - 1	0.0	0	0	0	0	0	769	756 ( 98.3% )	13 ( 1.7% )	0 ( 0.0% )	2 ( 0.3% )
AFBcfu_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	unknown	769	13 ( 1.7% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
NosemaSpores	5 - 8	568,959.2	25,000	125,000	225,000	425,000	15,275,000	769	540 ( 70.2% )	0 ( 0.0% )	0 ( 0.0% )	53 ( 6.9% )
NosemaSpores_Notes	0 - 13	n/a		n/a	n/a	n/a	not availabl…	769	229 ( 29.8% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )
Malpighamoeba	4 - 11	34.461182790	20.94	31.24	35.2	38.16	40.27	769	738 ( 96.0% )	0 ( 0.0% )	0 ( 0.0% )	32 ( 4.2% )
Malpighamoeba_CT	0 - 1	n/a	N	n/a	n/a	n/a	Y	769	513 ( 66.7% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
Malpighamoeba_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	31 ( 4.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )
RecordNotes	0 - 0	n/a		n/a	n/a	n/a		769	769 ( 100.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )
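
The counts in the columns Missing, Zero, Blank and Distinct can be reproduced along the lines of the following Python sketch. It follows the definitions in the caption of Table 8; representing NULL as None and treating only non-empty, whitespace-only strings as blank are assumptions, and the function names are hypothetical.

```python
def percentage(count, total):
    """Share of `count` in `total`, in percent, rounded to one decimal."""
    return round(100 * count / total, 1) if total else 0.0


def content_counts(values):
    """Count Missing, Zero, Blank and Distinct values of one column.

    `values` is a list of cell values; None stands for NULL (assumption).
    Each result is a tuple (count, percentage of all records).
    """
    total = len(values)
    missing = sum(1 for v in values if v is None)
    zero = sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool) and v == 0
    )
    # Blank: strings consisting exclusively of blank characters.
    blank = sum(1 for v in values if isinstance(v, str) and v and not v.strip())
    # Distinct values include NULL, zero and blank, as stated in the caption.
    distinct = len(set(values))
    return {
        "Total": total,
        "Missing": (missing, percentage(missing, total)),
        "Zero": (zero, percentage(zero, total)),
        "Blank": (blank, percentage(blank, total)),
        "Distinct": (distinct, percentage(distinct, total)),
    }


# Small synthetic column, not taken from the dataset.
counts = content_counts([None, 0, 0, 1.5, 2.0, 2.0, None, " "])
```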

Quality Measures

Table 9. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
dataset	100.00%	0.78%	Tier1 Pilot A results 2021 for BEEP	Tier1 Pilot B results 2022 for WR
StudyTierLevel	100.00%	0.13%	Tier 1	Tier 1
StudyName	100.00%	0.26%	Pilot A	Pilot B
year	100.00%	0.39%	2021	2020
organisation	100.00%	0.26%	BEEP	WR
V	66.71%	0.39%	V3	V2
SampleID	100.00%	99.61%	no label	ATZXSFHM
partner	100.00%	1.30%	WR	INRAe
season	100.00%	0.39%	summer	spring
DWVA	17.56%	17.17%	n/a	14.07
DWVA_Cat	100.00%	0.65%	N	H
DWVA_Notes	82.44%	0.26%	negative	n/a
DWVB	97.27%	85.18%	n/a	32.65
DWVB_Cat	100.00%	0.52%	H	N
DWVB_Notes	2.73%	0.26%	n/a	negative
ABPV	43.69%	37.45%	n/a	38.72
ABPV_Cat	65.54%	0.65%	n/a	H
ABPV_Notes	56.31%	0.39%	n/a	negative
CBPV	35.89%	27.57%	n/a	31.85
CBPV_Cat	65.54%	0.65%	n/a	H
CBPV_Notes	64.11%	0.52%	n/a	>40,00
BQCV	100.00%	76.46%	25.23	11.49
BQCV_Cat	100.00%	0.39%	H	L
BQCV_Notes	0.00%	0.13%	n/a	n/a
SBV	99.74%	85.05%	26.1	28.81
SBV_Cat	100.00%	0.52%	M	N
SBV_Notes	0.26%	0.26%	n/a	negative
EFB	2.99%	3.12%	n/a	35.07
EFB_Cat	36.93%	0.78%	n/a	H
EFB_Notes	97.01%	0.39%	not available	n/a
AFB	4.03%	3.77%	n/a	33.94
AFB_Cat	36.93%	0.52%	n/a	ND
AFB_Notes	32.90%	0.26%	n/a	negative
NosemaApis	0.13%	0.26%	n/a	27.78
NosemaApis_Notes	99.87%	0.39%	negative	n/a
NosemaCeranae	33.03%	31.86%	n/a	15.99
NosemaCeranae_Notes	66.97%	0.39%	not available	negative
VarroaBees	100.00%	0.39%	Y	ND
VarroaInfestation	99.09%	26.66%	0	4.226804124
Varroa_Notes	0.91%	0.26%	n/a	ND
AFBcfu	1.69%	0.26%	0	0
AFBcfu_Notes	98.31%	0.52%	not available	unknown
NosemaSpores	29.78%	6.89%	150000	9000000
NosemaSpores_Notes	70.22%	0.52%	not available	<25000
Malpighamoeba	4.03%	4.16%	n/a	37.41
Malpighamoeba_CT	33.29%	0.39%	n/a	Y
Malpighamoeba_Notes	95.97%	0.39%	not available	n/a
RecordNotes	0.00%	0.13%	n/a	n/a
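
The report does not state how Completeness and Uniqueness are calculated. The figures in Tables 8 and 9 are consistent with the following Python sketch, in which completeness is the share of non-missing values and uniqueness is the share of distinct values (including NULL) among all records. Both formulas are inferred from the reported numbers and should be read as an assumption.

```python
def completeness(missing: int, total: int) -> float:
    """Percentage of records with a non-missing value (assumed formula)."""
    return round(100 * (total - missing) / total, 2)


def uniqueness(distinct: int, total: int) -> float:
    """Percentage of distinct values among all records (assumed formula)."""
    return round(100 * distinct / total, 2)


# Column DWVA: 769 records, 634 missing and 132 distinct values (Table 8).
dwva_completeness = completeness(634, 769)
dwva_uniqueness = uniqueness(132, 769)
```

For column DWVA this yields the values reported in Table 9 (17.56% and 17.17%).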

Changes made to preparatory file

In column partner {UGent} was changed to the more commonly used acronym {UGENT} in 22 records in order to avoid potential problems with automated data analysis.
Column sample ID was renamed SampleID to avoid blank spaces in table headers, which might cause problems in some database systems.
Column DWV A was renamed DWVA to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVA was renamed DWVA_Cat to assure an assignment of unique names to column headers.
Column DWVA_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column DWV B was renamed DWVB to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVB was renamed DWVB_Cat to assure an assignment of unique names to column headers.
Column DWVB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column ABPV was renamed ABPV_Cat to assure an assignment of unique names to column headers.
Column ABPV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column CBPV was renamed CBPV_Cat to assure an assignment of unique names to column headers.
Column CBPV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column BQCV was renamed BQCV_Cat to assure an assignment of unique names to column headers.
Column BQCV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column SBV was renamed SBV_Cat to assure an assignment of unique names to column headers.
Column SBV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column EFB was renamed EFB_Cat to assure an assignment of unique names to column headers.
Column EFB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column AFB was renamed AFB_Cat to assure an assignment of unique names to column headers.
Column AFB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. apis was renamed NosemaApis to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaApis_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. ceranae was renamed NosemaCeranae to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaCeranae_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column > 100 bees was renamed VarroaBees to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa/100 bees was renamed VarroaInfestation to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column AFB (cfu) was renamed AFBcfu to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column AFBcfu_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. spores was renamed NosemaSpores to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column NosemaSpores_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column CT < 36,00 was renamed Malpighamoeba_CT to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Malpighamoeba_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
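
The renaming steps listed above can be sketched in Python as follows. The mapping reproduces the renames named in the list; the assumption that every Cat. column directly follows the data column it relates to in the raw file is not confirmed by the data provider, and the function name is hypothetical.

```python
# Raw header -> new header, as listed in the changes above.
RENAMES = {
    "sample ID": "SampleID",
    "DWV A": "DWVA",
    "DWV B": "DWVB",
    "N. apis": "NosemaApis",
    "N. ceranae": "NosemaCeranae",
    "> 100 bees": "VarroaBees",
    "Varroa/100 bees": "VarroaInfestation",
    "AFB (cfu)": "AFBcfu",
    "N. spores": "NosemaSpores",
    "CT < 36,00": "Malpighamoeba_CT",
}


def rename_headers(raw_headers):
    """Return unique headers without blanks for a list of raw headers."""
    renamed = []
    previous = None  # last data column seen, used to name "Cat." columns
    for header in raw_headers:
        if header == "Cat.":
            # Assumption: "Cat." refers to the data column directly before it.
            if previous is None:
                raise ValueError("'Cat.' column without a preceding data column")
            renamed.append(f"{previous}_Cat")
        else:
            new_name = RENAMES.get(header, header)
            renamed.append(new_name)
            previous = new_name
    return renamed


headers = rename_headers(["sample ID", "DWV A", "Cat.", "DWV B", "Cat.", "ABPV", "Cat."])
```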

Changes made to data

All occurrences of the character {-} in column DWVA_Cat were replaced by {N} (612 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVA, column DWVA_Notes was replaced by {negative} (634 records) and {negative} in column DWVA was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVB_Cat were replaced by {N} (21 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVB, column DWVB_Notes was replaced by {negative} (21 records) and the {negative} in column DWVB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column ABPV_Cat were replaced by {N} (169 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column ABPV, column ABPV_Notes was replaced by {negative} (168 records) and the {negative} in column ABPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column ABPV, column ABPV_Notes was replaced by {not available} (265 records) and the blanks in column ABPV and column ABPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
All occurrences of the character {-} and one occurrence of a blank value in column CBPV_Cat were replaced by {N} (227 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column CBPV, column CBPV_Notes was replaced by {not available} (265 records) and the blanks in column CBPV and column CBPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
In records, which contained the string {negative} in column CBPV, column CBPV_Notes was replaced by {negative} (227 records) and the {negative} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain the value {>40} in column CBPV (1 record), column CBPV_Notes was replaced by {>40} and the {>40} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column SBV_Cat were replaced by {N} (2 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column SBV, column SBV_Notes was replaced by {negative} (2 records) and the {negative} in column SBV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column EFB_Cat were replaced by {N} (239 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column EFB, column EFB_Notes was replaced by {not available} (485 records) and the blanks in column EFB and column EFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column EFB, column EFB_Notes was replaced by {negative} (261 records) and the {negative} in column EFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column AFB, column AFB_Notes was replaced by {negative} (253 records) and the {negative} in column AFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column AFB_Cat were replaced by {N} (232 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column AFB, column AFB_Notes was replaced by {not available} (485 records) and the blanks in column AFB and column AFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaApis, column NosemaApis_Notes was replaced by {not available} (262 records) and the blanks in column NosemaApis were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaApis, column NosemaApis_Notes was replaced by {negative} (506 records) and the {negative} in column NosemaApis was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {not available} (262 records) and the blanks in column NosemaCeranae were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {negative} (253 records) and the {negative} in column NosemaCeranae was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column VarroaInfestation, column Varroa_Notes was replaced by {ND} (7 records) and {ND} in column VarroaInfestation was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column AFBcfu, column AFBcfu_Notes was replaced by {ND} (270 records) and {ND} in column AFBcfu was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column AFBcfu, column AFBcfu_Notes was replaced by {not available} (485 records) and the blanks in column AFBcfu were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In 1 record, in which column AFBcfu contained the special character {-}, where dataset = {B-GOOD Pilot A results 2020 for BEEP V2 } and SampleID = {UTYBTKUM}, column AFBcfu was replaced by {NULL} and column AFBcfu_Notes was replaced by {unknown} to avoid having special characters in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column NosemaSpores, column NosemaSpores_Notes was replaced by {ND} (252 records) and {ND} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaSpores, column NosemaSpores_Notes was replaced by {not available} (262 records) and the blanks in column NosemaSpores were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {< 25000; <25000} in column NosemaSpores, column NosemaSpores_Notes was replaced by {<25000} (26 records) and {< 25000; <25000} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {negative} (225 records) and the {negative} in column Malpighamoeba was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained no values in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {not available} (513 records) and the blanks in column Malpighamoeba were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
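
The pattern behind most of the changes above (moving text out of a numeric column into the related notes column and leaving NULL behind) can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure actually used: None stands for NULL, a decimal comma in raw values is accepted, and the function names are hypothetical.

```python
def split_cell(raw):
    """Split one raw cell into (numeric value or None, note or None)."""
    text = "" if raw is None else str(raw).strip()
    if text == "":
        # Blank cells: no data was provided in the raw data file.
        return None, "not available"
    if text in ("negative", "ND"):
        return None, text
    if text[0] in "<>":
        # Censored values such as "< 25000" or ">40" are kept as notes,
        # with inner blanks removed ("< 25000" becomes "<25000").
        return None, text.replace(" ", "")
    try:
        # Assumption: raw files may use a decimal comma.
        return float(text.replace(",", ".")), None
    except ValueError:
        # Anything else, e.g. a lone "-", is flagged for clarification.
        return None, "unknown"


def normalise_category(raw):
    """Replace the character '-' in a category column by 'N'."""
    return "N" if raw == "-" else raw


examples = [split_cell(v) for v in ("negative", "", "< 25000", "31,71", "-", ">40")]
```

Keeping the original text in the notes column preserves the information carried by the raw value, while the data column can be stored with a numeric data type.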

Unresolved issues

Overall, the B-GOOD datasets ingested into this EUPH dataset lack sufficient metadata, and there is a range of other issues that limit compliance with the FAIR principles.
In general, columns are not sufficiently well described (e.g. it is unclear which information is contained in columns Malpighamoeba_CT and Malpighamoeba_Notes related to column Malpighamoeba; it is unclear whether the definitions provided for the attributes L, M, H (number of genome copies per bee), which are used to define the dilution of DNA plasmids, refer only to viral pathogens or to all pathogens). The provider should provide all information necessary to allow reuse of the data within the dataset.
For some columns no units are provided (e.g. AFB cfu); for other columns, the unit in which data is expressed is not explicitly stated and can only be inferred by exclusion. The provider should explicitly state the units in columns containing data in order to avoid misunderstandings.
Some of the attributes used in the dataset are not explained (e.g. ND). The provider should define the meaning of all attributes used in the dataset.
Data comes in Microsoft Excel files, which occasionally contain nested comments or uncommented annotations (e.g. different background colour of cells) in single cells, which makes storage in relational databases difficult and automated processing and analysis impossible.
The table structure does not facilitate data standardisation, as standardisation would require all values measured with the same method to be stored in a single column and transformed to the same unit.
The significance of the string {ND} is unclear:

In columns DWVA_Cat, EFB_Cat, AFB_Cat (22 records), where column dataset = {B-GOOD Pilot B results 2020 for WR V3} and SampleID = {BKCSYJMR; JKSXXKSC; LSPSHGZL; MDHLMFYT; NMYRPHNJ; YUXRGDZR; ANYXAYUZ; GNBMNLPM; HFXGDUAU; YPCTFUUU; LJSUAPFC; ZCUPCPFF; KNBULSMH; DUSFMRXB; JTYTGYDP; HXNGCDSH; KSNLKPRT; DRAZTCGC; LYZHDDGB; DXSPRLUC; CYPCTUGX; MJUYMLSH};
In columns VarroaBees and Varroa_Notes (7 records), where column dataset = {B-GOOD Pilot A results 2020 for BEEP V2} and SampleID = {CKTSXSMR; CYXUCRUN; FBYMAGCT; HCHPHKFL; RRBYHJBU; STZPSJHR; UAKMCLMN};
In column NosemaSpores_Notes (252 records).

The significance of the special character {-} in column AFBcfu, where dataset = {B-GOOD Pilot A results 2020 for BEEP V2 } and SampleID = {UTYBTKUM}, is unclear. Column AFBcfu_Notes was replaced by {unknown} until the issue is resolved.

Field study

Table 10. Standardised metadata of the table. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
UID	BGDHL178.FLDST364.0
Name	Field study
IRI	https://app.pollinatorhub.eu/dataset-discovery/parts/BGDHL178.FLDST364.0
Type	File
Licence	CC BY-NC-ND 4.0
Description	Table Field study contains 1029 records (333,43 kB), compiled from 6 files in datasets obtained from the B-GOOD Bee Health Data Portal on 2024-07-04: File b-good-tier-2-results-2021-for-beep-v3.xlsx Tier2 Field study results 2021 for BEEP File b-good-tier-2-results-2021-for-bsour-v2.xlsx Tier2 Field study results 2021 for BSOUR File b-good-tier-2-results-2021-for-mlu-v2.xlsx Tier2 Field study results 2021 for MLU File b-good-tier-2-results-2021-for-sml-v2.xlsx Tier2 Field study results 2021 for SML File b-good-tier-2-results-2021-for-ubern-v3.xlsx Tier2 Field study results 2021 for UBERN File b-good-tier-2-results-2021-for-wr-v2.xlsx Tier2 Field study results 2021 for WR File b-good-tier-2-results-2022-for-beep-v2.xlsx Tier2 Field study A results 2022 for BEEP File b-good-tier-3-results-2022-for-beep.xlsx Tier3 Field study B results 2022 for BEEP

Table Field study contains 1029 records (333,43 kB), compiled from 6 files in datasets obtained from the B-GOOD Bee Health Data Portal on 2024-07-04:

File b-good-tier-2-results-2021-for-beep-v3.xlsx Tier2 Field study results 2021 for BEEP
File b-good-tier-2-results-2021-for-bsour-v2.xlsx Tier2 Field study results 2021 for BSOUR
File b-good-tier-2-results-2021-for-mlu-v2.xlsx Tier2 Field study results 2021 for MLU
File b-good-tier-2-results-2021-for-sml-v2.xlsx Tier2 Field study results 2021 for SML
File b-good-tier-2-results-2021-for-ubern-v3.xlsx Tier2 Field study results 2021 for UBERN
File b-good-tier-2-results-2021-for-wr-v2.xlsx Tier2 Field study results 2021 for WR
File b-good-tier-2-results-2022-for-beep-v2.xlsx Tier2 Field study A results 2022 for BEEP
File b-good-tier-3-results-2022-for-beep.xlsx Tier3 Field study B results 2022 for BEEP

Metadata

n/a

Table 11. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Name	Description	Data type	Descriptor	Unit
dataset	Name of the dataset on the Bee Health Data Portal from which the data was obtained.	String	Text [0.0.TEXTA315]	n/a
StudyTierLevel	Tier level of the study.	String	Text [0.0.TEXTA315]	n/a
StudyName	Name of the study.	String	Text [0.0.TEXTA315]	n/a
year	Calendar year in which the data was acquired.	Integer number	year [0.0.YEARA340]	year
organisation	Not specified by the data provider. Organisation appearing in the name of the dataset.	String	Text [0.0.TEXTA315]	n/a
V	Not specified by the data provider. V number appearing in the name of the dataset.	String	Text [0.0.TEXTA315]	n/a
SampleID	Unique identifier of the sample.	String	materialSampleID [0.0.MTRLS489]	n/a
partner	Not specified by the data provider. Presumably the name of the consortium partner who provided the data.	String	Text [0.0.TEXTA315]	n/a
season	Not specified by the data provider. Presumably the season in which the sample was collected.	String	season [0.0.SSONA466]	n/a
DWVA	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus A (DWV-A).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
DWVA_Cat	Attribute referring to column DWVA, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
DWVA_Notes	Annotations referring to column DWVA. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
DWVB	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus B (DWV-B).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
DWVB_Cat	Attribute referring to column DWVB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
DWVB_Notes	Annotations referring to column DWVB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
ABPV	The Cq-value (Ct value) for the infection load with the Acute Bee Paralysis Virus (ABPV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
ABPV_Cat	Attribute referring to column ABPV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
ABPV_Notes	Annotations referring to column ABPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
CBPV	The Cq-value (Ct value) for the infection load with the Chronic Bee Paralysis Virus (CBPV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
CBPV_Cat	Attribute referring to column CBPV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
CBPV_Notes	Annotations referring to column CBPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
BQCV	The Cq-value (Ct value) for the infection load with the Black Queen Cell Virus (BQCV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
BQCV_Cat	Attribute referring to column BQCV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
BQCV_Notes	Annotations referring to column BQCV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
SBV	The Cq-value (Ct value) for the infection load with the Sacbrood Virus (SBV).	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
SBV_Cat	Attribute referring to column SBV, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.	String	Text [0.0.TEXTA315]	n/a
SBV_Notes	Annotations referring to column SBV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
EFB	The Cq-value (Ct value) for the infection load with the causative agent of European Foulbrood of honey bees (EFB), Melissococcus plutonius.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
EFB_Cat	Attribute referring to column EFB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the numbers of genome copies given for the different categories also apply to this column.	String	Text [0.0.TEXTA315]	n/a
EFB_Notes	Annotations referring to column EFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
AFB	The Cq-value (Ct value) for the infection load with the causative agent of American Foulbrood (AFB), Paenibacillus larvae.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
AFB_Cat	Attribute referring to column AFB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the numbers of genome copies given for the different categories also apply to this column.	String	Text [0.0.TEXTA315]	n/a
AFB_Notes	Annotations referring to column AFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaApis	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema apis.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
NosemaApis_Notes	Annotations referring to column NosemaApis. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaCeranae	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema ceranae.	Decimal number	quantificationCycle [0.0.QNTFC467]	no.
NosemaCeranae_Notes	Annotations referring to column NosemaCeranae. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
VarroaBees	Attribute referring to column VarroaInfestation: Y (Yes) if 100 bees or more were sampled; N (No) if fewer than 100 bees were sampled.	String	Text [0.0.TEXTA315]	n/a
VarroaInfestation	Varroa infestation, measured as Varroa infestation rate of adult bees.	Decimal number	varroaInfestationOfAdultBees [0.0.VRRNF468]	mites (100 bees)-1
Varroa_Notes	Annotations referring to column VarroaInfestation.	String	Text [0.0.TEXTA315]	n/a
AFBcfu	Not specified by the data provider. Presumably the number of colony-forming units (CFU) counted in microbiological assays used to detect the causative agent of American Foulbrood (AFB), Paenibacillus larvae.	Integer number	Integer [0.0.NTGER313]	n/a
AFBcfu_Notes	Annotations referring to column AFBcfu. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file).	String	Text [0.0.TEXTA315]	n/a
NosemaSpores	Number of spores of the causative agents of Nosemosis of honey bees (Nosema apis, Nosema ceranae), expressed in spores per animal.	Integer number	Integer [0.0.NTGER313]	spores animal-1
NosemaSpores_Notes	Annotations referring to column NosemaSpores. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file); <25000 (less than 25000 spores per animal).	String	Text [0.0.TEXTA315]	n/a
Malpighamoeba	The Cq-value (Ct value) for the infection load with the causative agent of amoeba disease of honey bees, Malpighamoeba mellificae.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
Malpighamoeba_CT	Not specified by the data provider.	String	Text [0.0.TEXTA315]	n/a
Malpighamoeba_Notes	Annotations referring to column Malpighamoeba.	String	Text [0.0.TEXTA315]	n/a
RecordNotes	Notes added by the data provider to specific records in the raw data.	String	Text [0.0.TEXTA315]	n/a

Metadata of individual tables can be found in Annex 1.
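
The category scheme described for the virus-related *_Cat columns can be illustrated with a short sketch. The function name viral_load_category is hypothetical. The treatment of the boundary values (exactly 10^4 and 10^7 genome copies fall into M) and the rule that N means no detectable genome copies are assumptions, as the data provider does not state them.

```python
from typing import Optional

LOW_LIMIT = 10**4   # below this value: L (Low), as given in the column description
HIGH_LIMIT = 10**7  # above this value: H (High), as given in the column description


def viral_load_category(genome_copies_per_bee: Optional[float]) -> str:
    """Map genome copies per bee to one of L, M, H or N (hypothetical helper)."""
    # Assumption: no measurable signal (None or 0) corresponds to N (None).
    if genome_copies_per_bee is None or genome_copies_per_bee <= 0:
        return "N"
    if genome_copies_per_bee < LOW_LIMIT:
        return "L"
    if genome_copies_per_bee > HIGH_LIMIT:
        return "H"
    # Assumption: the boundary values 10^4 and 10^7 themselves belong to M.
    return "M"
```

Whether the same thresholds apply to the EFB_Cat and AFB_Cat columns is not specified by the data provider, so the sketch should only be read as covering the virus columns.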

Descriptive Measures

Table 12. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
dataset	37 - 41	n/a	Tier2 Field…	n/a	n/a	n/a	Tier3 Field…	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	8 ( 0.8% )
StudyTierLevel	6 - 6	n/a	Tier 2	n/a	n/a	n/a	Tier 3	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
StudyName	13 - 13	n/a	Field study…	n/a	n/a	n/a	Field study…	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
year	4 - 4	2,021.7	2,021	2,021	2,022	2,022	2,022	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
organisation	4 - 4	n/a	BEEP	n/a	n/a	n/a	BEEP	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )
V	0 - 2	n/a	V2	n/a	n/a	n/a	V3	1,029	418 ( 40.6% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
SampleID	1 - 14	n/a	11D2	n/a	n/a	n/a	yhgpjdya	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1,025 ( 99.6% )
partner	2 - 13	n/a	BSOUR	n/a	n/a	n/a	WR	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	16 ( 1.6% )
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
DWVA	4 - 5	29.282	6.84	26.225	31.2	35.315	38.78	1,029	728 ( 70.7% )	0 ( 0.0% )	0 ( 0.0% )	270 ( 26.2% )
DWVA_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
DWVA_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	301 ( 29.3% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
DWVB	1 - 11	23.782359580	5.5	17.26	25.7	30.68	38.45	1,029	26 ( 2.5% )	0 ( 0.0% )	0 ( 0.0% )	835 ( 81.1% )
DWVB_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
DWVB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	1,003 ( 97.5% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
ABPV	2 - 5	35.598	12	35.0475	37.025	38.4175	40	1,029	745 ( 72.4% )	0 ( 0.0% )	0 ( 0.0% )	213 ( 20.7% )
ABPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	1,029	336 ( 32.7% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.5% )
ABPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	284 ( 27.6% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
CBPV	2 - 5	36.181	13.31	35.04	36.985	38.4225	40	1,029	737 ( 71.6% )	0 ( 0.0% )	0 ( 0.0% )	221 ( 21.5% )
CBPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	1,029	337 ( 32.8% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.5% )
CBPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	292 ( 28.4% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
BQCV	2 - 6	24.8266	9.7	22.41	24.84	27.46	37.8	1,029	2 ( 0.2% )	0 ( 0.0% )	0 ( 0.0% )	743 ( 72.2% )
BQCV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
BQCV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	1,027 ( 99.8% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
SBV	1 - 11	26.404416140	0	24.2175	28.12	30.81	37.51	1,029	77 ( 7.5% )	4 ( 0.4% )	0 ( 0.0% )	736 ( 71.5% )
SBV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
SBV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	952 ( 92.5% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
EFB	2 - 7	34.43597	28.42	32.4025	35.3	36.3525	37.38	1,029	1,011 ( 98.3% )	0 ( 0.0% )	0 ( 0.0% )	19 ( 1.8% )
EFB_Cat	0 - 1	n/a	L	n/a	n/a	n/a	N	1,029	654 ( 63.6% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
EFB_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	not availabl…	1,029	18 ( 1.7% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
AFB	2 - 11	36.856439390	29.61	36.17	37.13	37.48375	40	1,029	996 ( 96.8% )	0 ( 0.0% )	0 ( 0.0% )	33 ( 3.2% )
AFB_Cat	0 - 1	n/a	L	n/a	n/a	n/a	N	1,029	696 ( 67.6% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
AFB_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	33 ( 3.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
NosemaApis	5 - 5	22.846	17.75	19.35	21.58	26.69	32.03	1,029	1,020 ( 99.1% )	0 ( 0.0% )	0 ( 0.0% )	10 ( 1.0% )
NosemaApis_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	9 ( 0.9% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
NosemaCeranae	2 - 6	25.0779	14.42	20.52	23.26	30.445	37.23	1,029	768 ( 74.6% )	0 ( 0.0% )	0 ( 0.0% )	245 ( 23.8% )
NosemaCeranae_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	261 ( 25.4% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
VarroaBees	0 - 1	n/a	N	n/a	n/a	n/a	Y	1,029	2 ( 0.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
VarroaInfestation	1 - 11	1.9824686100	0	0	0	1.61290322	46.90265487	1,029	2 ( 0.2% )	540 ( 52.5% )	0 ( 0.0% )	300 ( 29.2% )
Varroa_Notes	0 - 2	n/a	ND	n/a	n/a	n/a	ND	1,029	1,027 ( 99.8% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
AFBcfu	1 - 1	0.1	0	0	0	0	1	1,029	1,008 ( 98.0% )	19 ( 1.8% )	0 ( 0.0% )	3 ( 0.3% )
AFBcfu_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	not availabl…	1,029	21 ( 2.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
NosemaSpores	5 - 7	339,712.4	25,000	68,750	150,000	375,000	4,150,000	1,029	803 ( 78.0% )	0 ( 0.0% )	0 ( 0.0% )	47 ( 4.6% )
NosemaSpores_Notes	0 - 13	n/a		n/a	n/a	n/a	not availabl…	1,029	226 ( 22.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )
Malpighamoeba	4 - 7	29.64933	9.91	23.9275	31.63375	35.2275	42.5	1,029	947 ( 92.0% )	0 ( 0.0% )	0 ( 0.0% )	79 ( 7.7% )
Malpighamoeba_CT	0 - 1	n/a	N	n/a	n/a	n/a	Y	1,029	311 ( 30.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )
Malpighamoeba_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	393 ( 38.2% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
RecordNotes	0 - 122	n/a	these codes…	n/a	n/a	n/a	these codes…	1,029	966 ( 93.9% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )
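
The counts reported in Table 12 (Total, Missing, Zero, Blank, Distinct) can be reproduced for a single column along the following lines. This is a minimal sketch, not the tool used to generate the report: the function names are hypothetical, NULL is represented by Python's None, and it is assumed that an empty string counts as blank.

```python
from typing import Hashable, Optional, Sequence


def percentage(count: int, total: int) -> float:
    """Percentage with one decimal place, as shown between brackets in the table."""
    return round(100.0 * count / total, 1)


def content_counts(values: Sequence[Optional[Hashable]]) -> dict:
    """Count total, missing, zero, blank and distinct values of one column."""
    total = len(values)
    missing = sum(1 for v in values if v is None)
    # Only numeric zeros are counted; booleans are excluded on purpose.
    zero = sum(
        1 for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool) and v == 0
    )
    # Assumption: strings that are empty or contain only whitespace are blank.
    blank = sum(1 for v in values if isinstance(v, str) and v.strip() == "")
    # Distinct values include NULL, zero and blank, as stated in the caption.
    distinct = len(set(values))
    return {
        "Total": total,
        "Missing": missing,
        "Zero": zero,
        "Blank": blank,
        "Distinct": distinct,
    }
```

For example, percentage(301, 1029) gives the 29.3 % reported as Missing for column DWVA_Notes.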

Quality Measures

Table 13. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
dataset	100.00%	0.78%	Tier3 Field study B results 2022 for BEEP	Tier2 Field study results 2021 for WR
StudyTierLevel	100.00%	0.19%	Tier 2	Tier 3
StudyName	100.00%	0.19%	Field study A	Field study B
year	100.00%	0.19%	2022	2021
organisation	100.00%	0.10%	BEEP	BEEP
V	59.38%	0.29%	n/a	V2
SampleID	100.00%	99.61%	GB_1	_467
partner	100.00%	1.55%	T3Netherlands	T3 Portugal
season	100.00%	0.29%	spring	autumn
DWVA	29.25%	26.24%	n/a	19.93
DWVA_Cat	100.00%	0.39%	N	H
DWVA_Notes	70.75%	0.19%	negative	n/a
DWVB	97.47%	81.15%	n/a	34.57
DWVB_Cat	100.00%	0.39%	M	N
DWVB_Notes	2.53%	0.19%	n/a	negative
ABPV	27.60%	20.70%	n/a	35.73
ABPV_Cat	67.35%	0.49%	N	H
ABPV_Notes	72.40%	0.29%	negative	n/a
CBPV	28.38%	21.48%	n/a	37.61
CBPV_Cat	67.25%	0.49%	N	H
CBPV_Notes	71.62%	0.29%	negative	n/a
BQCV	99.81%	72.21%	26.49	24.04
BQCV_Cat	100.00%	0.39%	M	N
BQCV_Notes	0.19%	0.19%	n/a	negative
SBV	92.52%	71.53%	n/a	33.58
SBV_Cat	100.00%	0.39%	M	L
SBV_Notes	7.48%	0.19%	n/a	negative
EFB	1.75%	1.85%	n/a	37.38
EFB_Cat	36.44%	0.39%	n/a	M
EFB_Notes	98.25%	0.39%	not available	ND
AFB	3.21%	3.21%	n/a	36.945
AFB_Cat	32.36%	0.39%	n/a	M
AFB_Notes	96.79%	0.29%	not available	n/a
NosemaApis	0.87%	0.97%	n/a	21.58
NosemaApis_Notes	99.13%	0.29%	negative	n/a
NosemaCeranae	25.36%	23.81%	n/a	21.88
NosemaCeranae_Notes	74.64%	0.29%	negative	n/a
VarroaBees	99.81%	0.29%	Y	n/a
VarroaInfestation	99.81%	29.15%	0	30.86419753
Varroa_Notes	0.19%	0.19%	n/a	ND
AFBcfu	2.04%	0.29%	0	1
AFBcfu_Notes	97.96%	0.29%	not available	n/a
NosemaSpores	21.96%	4.57%	50000	825000
NosemaSpores_Notes	78.04%	0.39%	ND	<25000
Malpighamoeba	7.97%	7.68%	n/a	39.93
Malpighamoeba_CT	69.78%	0.29%	N	Y
Malpighamoeba_Notes	61.81%	0.19%	negative	n/a
RecordNotes	6.12%	0.19%	n/a	these codes [column SampleID] might be wrong, as they could not be uploaded onto the BEEP app, or there was no code at all
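
The report does not state how completeness and uniqueness are calculated. The values in Table 13 are, however, consistent with the following definitions, which are therefore an inference: completeness is the share of records that are not NULL, and uniqueness is the number of distinct values (including NULL) divided by the number of records. The function names are hypothetical.

```python
def completeness(total: int, missing: int) -> float:
    """Share of non-NULL records in percent (inferred definition)."""
    return 100.0 * (total - missing) / total


def uniqueness(total: int, distinct: int) -> float:
    """Share of distinct values, including NULL, in percent (inferred definition)."""
    return 100.0 * distinct / total


# Column DWVA in Table 12: 1,029 records, 728 missing, 270 distinct values.
dwva_completeness = round(completeness(1029, 728), 2)
dwva_uniqueness = round(uniqueness(1029, 270), 2)
```

With the figures for column DWVA from Table 12, this yields 29.25 and 26.24, which matches the entries for DWVA in Table 13.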

Changes made to preparatory file

Column sample ID was renamed SampleID to avoid blank spaces in table headers, which might cause problems in some database systems.
Column DWV A was renamed DWVA to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVA was renamed DWVA_Cat to assure an assignment of unique names to column headers.
Column DWVA_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column DWV B was renamed DWVB to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVB was renamed DWVB_Cat to assure an assignment of unique names to column headers.
Column DWVB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column ABPV was renamed ABPV_Cat to assure an assignment of unique names to column headers.
Column ABPV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column CBPV was renamed CBPV_Cat to assure an assignment of unique names to column headers.
Column CBPV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column BQCV was renamed BQCV_Cat to assure an assignment of unique names to column headers.
Column BQCV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column SBV was renamed SBV_Cat to assure an assignment of unique names to column headers.
Column SBV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column EFB was renamed EFB_Cat to assure an assignment of unique names to column headers.
Column EFB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column AFB was renamed AFB_Cat to assure an assignment of unique names to column headers.
Column AFB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. apis was renamed NosemaApis to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaApis_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. ceranae was renamed NosemaCeranae to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaCeranae_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column > 100 bees was renamed VarroaBees to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa/100 bees was renamed VarroaInfestation to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column AFB (cfu) was renamed AFBcfu to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column AFBcfu_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. spores was renamed NosemaSpores to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column NosemaSpores_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column CT < 36,00 was renamed Malpighamoeba_CT to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Malpighamoeba_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
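
The renaming steps listed above can be sketched as follows. The original header Cat. occurs several times, so it cannot be renamed through a simple lookup. The sketch assumes that each Cat. column directly follows the column it relates to, which the report does not state explicitly. The function name rename_headers is hypothetical.

```python
from typing import List

# Original header -> new header, taken from the list of changes above.
RENAMES = {
    "sample ID": "SampleID",
    "DWV A": "DWVA",
    "DWV B": "DWVB",
    "N. apis": "NosemaApis",
    "N. ceranae": "NosemaCeranae",
    "> 100 bees": "VarroaBees",
    "Varroa/100 bees": "VarroaInfestation",
    "AFB (cfu)": "AFBcfu",
    "N. spores": "NosemaSpores",
    "CT < 36,00": "Malpighamoeba_CT",
}


def rename_headers(headers: List[str]) -> List[str]:
    """Return headers without blanks or special characters and with unique names."""
    renamed: List[str] = []
    previous_data_column = None
    for header in headers:
        if header == "Cat.":
            # Assumption: a Cat. column belongs to the data column before it.
            if previous_data_column is None:
                raise ValueError("Cat. column without a preceding data column")
            renamed.append(previous_data_column + "_Cat")
        else:
            new_name = RENAMES.get(header, header)
            renamed.append(new_name)
            previous_data_column = new_name
    return renamed
```

The *_Notes columns are not covered here, because they were newly created rather than renamed.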

Changes made to data

In 27 records obtained from the file of dataset Tier2 Field study A results 2022 for BEEP (b-good-tier-2-results-2022-for-beep-v2.xlsx) and 36 records obtained from the file of dataset Tier3 Field study B results 2022 for BEEP (b-good-tier-3-results-2022-for-beep.xlsx), the leading asterisk {*} was removed from the data in column SampleID. The comment referring to those SampleID values ("these codes might be wrong, as they could not be uploaded onto the BEEP app, or there was no code at all"), supplemented by the explanatory text in square brackets ("[column SampleID]"), was added to column RecordNotes of the same record. Annotations must be removed from a datum so that the records can be linked with records in other tables in relational databases.
In records, which contained the string {negative} in column DWVA, column DWVA_Notes was replaced by {negative} (728 records) and {negative} in column DWVA was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVA_Cat were replaced by {N} (727 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVB, column DWVB_Notes was replaced by {negative} (26 records) and the {negative} in column DWVB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVB_Cat were replaced by {N} (26 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
All occurrences of the character {-} in column ABPV_Cat were replaced by {NULL} (409 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column ABPV, column ABPV_Notes was replaced by {negative} (410 records) and the {negative} in column ABPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column ABPV, column ABPV_Notes was replaced by {not available} (335 records) and the blanks in column ABPV and column ABPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} and one occurrence of a blank value in column CBPV_Cat were replaced by {N} (400 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column CBPV, column CBPV_Notes was replaced by {negative} (402 records) and the {negative} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column CBPV, column CBPV_Notes was replaced by {not available} (335 records) and the blanks in column CBPV and column CBPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} and one occurrence of a blank value in column BQCV_Cat were replaced by {N} (2 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column BQCV, column BQCV_Notes was replaced by {negative} (2 records) and the {negative} in column BQCV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column SBV, column SBV_Notes was replaced by {negative} (77 records) and the {negative} in column SBV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column SBV_Cat were replaced by {N} (77 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
All occurrences of the character {-} in column EFB_Cat were replaced by {N} (357 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column EFB, column EFB_Notes was replaced by {negative} (357 records) and the {negative} in column EFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column EFB, column EFB_Notes was replaced by {not available} (651 records) and the blanks in column EFB and column EFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column EFB, column EFB_Notes was replaced by {ND} (3 records) and {ND} in column EFB as well as blank values in column EFB_Cat were replaced by {NULL} to avoid having string or blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} in column AFB_Cat were replaced by {NULL} (300 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column AFB, column AFB_Notes was replaced by {negative} (300 records) and the {negative} in column AFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column AFB, column AFB_Notes was replaced by {not available} (696 records) and the blanks in column AFB and column AFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
In records, which contained the string {negative} in column NosemaApis, column NosemaApis_Notes was replaced by {negative} (687 records) and the {negative} in column NosemaApis was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaApis, column NosemaApis_Notes was replaced by {not available} (333 records) and the blanks in column NosemaApis were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {negative} (435 records) and the {negative} in column NosemaCeranae was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {not available} (333 records) and the blanks in column NosemaCeranae were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in columns VarroaBees and VarroaInfestation, column Varroa_Notes was replaced by {ND} (2 records) and {ND} in columns VarroaBees and VarroaInfestation was replaced by {NULL} to avoid having annotations in a data column that is supposed to contain real numbers or categories.
In records, which contain {ND} in column AFBcfu (312 records), column AFBcfu_Notes was replaced by {ND} and {ND} in column AFBcfu was replaced by {NULL} to avoid having text in a data column that is supposed to contain numbers.
In records, which contain blank values in column AFBcfu, column AFBcfu_Notes was replaced by {not available} (696 records) and the blanks in column AFBcfu were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column NosemaSpores, column NosemaSpores_Notes was replaced by {ND} (426 records) and {ND} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain numbers.
In records, which contain blank values in column NosemaSpores, column NosemaSpores_Notes was replaced by {not available} (333 records) and the blanks in column NosemaSpores were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain numbers.
In records, which contain {< 25000; <25000} in column NosemaSpores, column NosemaSpores_Notes was replaced by {<25000} (44 records) and {< 25000; <25000} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain numbers.
In records, which contained the string {negative} in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {negative} (636 records) and the {negative} in column Malpighamoeba was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
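
The recurring pattern in the changes above (moving an annotation such as {negative} or {ND} from a data column into the related *_Notes column and leaving {NULL} behind) can be sketched as follows, together with the removal of the leading asterisk from SampleID. This is an illustration under assumptions, not the procedure actually used: the function names are hypothetical, NULL is represented by None, and raw values are assumed to arrive as strings with a decimal point.

```python
from typing import Optional, Tuple

# Annotations found in data columns, as listed in the changes above.
ANNOTATIONS = {"negative", "ND"}

SAMPLE_ID_NOTE = (
    "these codes [column SampleID] might be wrong, as they could not be "
    "uploaded onto the BEEP app, or there was no code at all"
)


def split_value_and_note(raw: Optional[str]) -> Tuple[Optional[float], Optional[str]]:
    """Return (numeric value or None, note or None) for one raw cell."""
    if raw is None or raw.strip() == "":
        # Blank cells become NULL with the note "not available".
        return None, "not available"
    text = raw.strip()
    if text in ANNOTATIONS:
        # The annotation moves to the notes column; the datum becomes NULL.
        return None, text
    return float(text), None


def clean_sample_id(sample_id: str) -> Tuple[str, Optional[str]]:
    """Strip a leading asterisk and return the note for column RecordNotes."""
    if sample_id.startswith("*"):
        return sample_id[1:], SAMPLE_ID_NOTE
    return sample_id, None
```

Values such as {<25000} in column NosemaSpores and the {-} entries in the *_Cat columns would need additional rules, which are left out to keep the sketch short.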

Unresolved issues

Overall, the B-GOOD datasets ingested into this EUPH dataset lack sufficient metadata, and there is a range of other issues that limit compliance with the FAIR principles.
In general, columns are not sufficiently well described. For example, it is unclear which information is contained in columns Malpighamoeba_CT and Malpighamoeba_Notes, which relate to column Malpighamoeba. It is also unclear whether the definitions provided for the attributes L, M and H (number of genome copies per bee), which are used to define the dilution of DNA plasmids, refer only to viral pathogens or to all pathogens. The provider should provide all information necessary to allow reuse of the data within the dataset.
For some columns no units are provided (e.g. AFB cfu), for other columns, the unit in which data is expressed is not explicitly stated and can only be assumed based on exclusion. The provider should explicitly state the units in columns containing data in order to avoid misunderstandings.
Some of the attributes used in the dataset are not explained (e.g. ND). The provider should define the meaning of all attributes used in the dataset.
Data comes in Microsoft Excel files, which occasionally contain nested comments or uncommented annotations (e.g. different background colour of cells) in single cells, which makes storage in relational databases difficult and automated processing and analysis impossible.
The table structure does not facilitate data standardisation, as standardisation would require all values measured with the same method to be stored in one single column and transformed to the same unit.
In column SampleID the values {GB_1; GB2; GB_3} are not unique. Each of them exists twice.
The significance of the string {ND} is unclear:

In column EFB_Cat (3 records), where column dataset = {Tier2 Field study A results 2022 for BEEP} and SampleID = {CDYTBDHK; DTUDJNAG; RYAYUTUG};
In column Varroa_Notes, where column dataset = {Varroa_Tier2 Field study results 2021 for WR} and SampleID = {ZUXHUFZP, BXCTGZBN CFO};
In columns AFBcfu_Notes and NosemaSpores_Notes;
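
Non-unique identifiers such as those reported for column SampleID can be located with a simple frequency count. The following is a minimal sketch; the function name duplicated_values is hypothetical, and NULL values (None) are ignored because they do not identify a sample.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def duplicated_values(values: Sequence[Optional[str]]) -> Dict[str, int]:
    """Return each value that occurs more than once with its number of occurrences."""
    counts = Counter(v for v in values if v is not None)
    return {value: n for value, n in counts.items() if n > 1}
```

Applied to column SampleID, the result would list every identifier that needs to be clarified with the data provider before the table can be linked to other tables.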

References

Schäfer MO. 2023 Bee Health Data Portal - Dataset. [2024-10-8] beehealthdata.org
GFISCO 2024 GO FAIR initiative: Make your data & services FAIR. GO FAIR. [2024-10-1] www.go-fair.org

Annex 1: Table column reports

Pilot study

dataset

Table 14. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	dataset
Description	Name of the dataset on the Bee Health Data Portal from which the data was obtained.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 15. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
dataset	33 - 39	n/a	B-GOOD Pilot…	n/a	n/a	n/a	Tier1 Pilot…	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 0.8% )

Table 16. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
dataset	100.00%	0.78%	Tier1 Pilot A results 2021 for BEEP	Tier1 Pilot B results 2022 for WR

Data Distribution Top 20

Figure 1. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 2. Visualization of completeness of the data in the column.

Uniqueness

Figure 3. Visualization of uniqueness of the data in the column.

StudyTierLevel

Table 17. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	StudyTierLevel
Description	Tier level of the study.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 18. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
StudyTierLevel	6 - 6	n/a	Tier 1	n/a	n/a	n/a	Tier 1	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )

Table 19. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
StudyTierLevel	100.00%	0.13%	Tier 1	Tier 1
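
The two quality figures above are consistent with simple ratios over the 769 records: completeness as the share of non-missing records, and uniqueness as the share of distinct values (NULL counted as one value). The following is a minimal sketch under that assumption; the function names are illustrative and not taken from the EU Pollinator Hub tooling.

```python
def completeness(values):
    """Percentage of records that are not missing (None)."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None)
    return 100.0 * present / len(values)


def uniqueness(values):
    """Percentage of distinct values among all records (None counts as one value)."""
    if not values:
        return 0.0
    return 100.0 * len(set(values)) / len(values)


# 769 records that all hold "Tier 1", as reported for StudyTierLevel
column = ["Tier 1"] * 769
print(f"{completeness(column):.2f}%")  # 100.00%
print(f"{uniqueness(column):.2f}%")    # 0.13%
```

With one distinct value in 769 records, the sketch reproduces the reported uniqueness of 0.13%.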

Data Distribution Top 20

Figure 4. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 5. Visualization of completeness of the data in the column.

Uniqueness

Figure 6. Visualization of uniqueness of the data in the column.

StudyName

Table 20. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	StudyName
Description	Name of the study.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 21. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
StudyName	7 - 7	n/a	Pilot A	n/a	n/a	n/a	Pilot B	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 22. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
StudyName	100.00%	0.26%	Pilot A	Pilot B

Data Distribution Top 20

Figure 7. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 8. Visualization of completeness of the data in the column.

Uniqueness

Figure 9. Visualization of uniqueness of the data in the column.

year

Table 23. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	year
Description	Calendar year in which the data was acquired.
Data type	Integer number
Descriptor	dwc:year [UID:0.0.YEARA340]
Descriptor description	A term from the Darwin Core standard: The four-digit year in which the dwc:Event occurred, according to the Common Era Calendar.
IRI	http://rs.tdwg.org/dwc/terms/year
Unit	year

Table 24. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
year	4 - 4	2021.0	2020	2020	2021	2022	2022	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 25. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
year	100.00%	0.39%	2021	2020

Data Distribution Top 20

Figure 10. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 11. Distribution of values in the column.

Outliers

Figure 12. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 13. Visualization of completeness of the data in the column.

Uniqueness

Figure 14. Visualization of uniqueness of the data in the column.

organisation

Table 26. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	organisation
Description	Not specified by the data provider. Organisation appearing in the name of the dataset.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 27. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
organisation	2 - 4	n/a	BEEP	n/a	n/a	n/a	WR	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 28. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
organisation	100.00%	0.26%	BEEP	WR

Data Distribution Top 20

Figure 15. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 16. Visualization of completeness of the data in the column.

Uniqueness

Figure 17. Visualization of uniqueness of the data in the column.

V

Table 29. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	V
Description	Not specified by the data provider. V number appearing in the name of the dataset.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 30. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
V	0 - 2	n/a	V2	n/a	n/a	n/a	V3	769	256 ( 33.3% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 31. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
V	66.71%	0.39%	V3	V2

Data Distribution Top 20

Figure 18. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 19. Visualization of completeness of the data in the column.

Uniqueness

Figure 20. Visualization of uniqueness of the data in the column.

SampleID

Table 32. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SampleID
Description	Unique identifier of the sample.
Data type	String
Descriptor	dwc:materialSampleID [UID:0.0.MTRLS489]
Descriptor description	A term from the Darwin Core standard: An identifier for the dwc:MaterialSample (as opposed to a particular digital record of the dwc:MaterialSample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the dwc:materialSampleID globally unique.
IRI	http://rs.tdwg.org/dwc/terms/materialSampleID
Unit	n/a

Table 33. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SampleID	7 - 17	n/a	ABPLTDAX	n/a	n/a	n/a	no label	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	766 ( 99.6% )

Table 34. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SampleID	100.00%	99.61%	no label	ATZXSFHM
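
Since only 766 of the 769 records carry a distinct SampleID and "no label" is the most common value, some identifiers must be repeated. A minimal sketch for listing such repeats is given below; the function name and the example counts are illustrative and do not reflect the actual frequencies in the dataset.

```python
from collections import Counter


def duplicated_ids(sample_ids):
    """Return each identifier that occurs more than once, with its count."""
    counts = Counter(sample_ids)
    return {sid: n for sid, n in counts.items() if n > 1}


# Identifiers taken from the tables above; the repetition is illustrative only
ids = ["ABPLTDAX", "ATZXSFHM", "no label", "no label"]
print(duplicated_ids(ids))  # {'no label': 2}
```

Records flagged in this way cannot be addressed unambiguously by SampleID alone and may need a composite key, for example dataset name plus SampleID.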

Completeness

Figure 21. Visualization of completeness of the data in the column.

Uniqueness

Figure 22. Visualization of uniqueness of the data in the column.

partner

Table 35. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	partner
Description	Not specified by the data provider. Presumably the name of the consortium partner who provided the data.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 36. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
partner	2 - 5	n/a	INRAE	n/a	n/a	n/a	WR	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	10 ( 1.3% )

Table 37. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
partner	100.00%	1.30%	WR	INRAe
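
The tables above list both "INRAE" and "INRAe", which may indicate that the same partner appears under two spellings and inflates the distinct count. The following sketch groups values that differ only in letter case; whether such variants really denote the same partner is an assumption that would need to be confirmed with the data provider.

```python
from collections import defaultdict


def case_variants(values):
    """Group values that are identical except for letter case."""
    groups = defaultdict(set)
    for value in values:
        if value is not None:
            groups[value.casefold()].add(value)
    # Keep only groups with more than one spelling
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


partners = ["WR", "INRAE", "INRAe", "WR"]
print(case_variants(partners))  # {'inrae': ['INRAE', 'INRAe']}
```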

Data Distribution Top 20

Figure 23. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 24. Visualization of completeness of the data in the column.

Uniqueness

Figure 25. Visualization of uniqueness of the data in the column.

season

Table 38. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	season
Description	Not specified by the data provider. Presumably the season in which the sample was collected.
Data type	String
Descriptor	pms:season [UID:0.0.SSONA466]
Descriptor description	[...] any of the four arbitrary divisions of the year, characterized chiefly by differences in temperature, precipitation, amount of daylight, and plant growth [...]
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.SSONA466
Unit	n/a

Table 39. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 40. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
season	100.00%	0.39%	summer	spring

Data Distribution Top 20

Figure 26. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 27. Visualization of completeness of the data in the column.

Uniqueness

Figure 28. Visualization of uniqueness of the data in the column.

DWVA

Table 41. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA
Description	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus A (DWV A).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 42. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA	5 - 5	30.595	10.61	28.72	31.71	34.51	39.62	769	634 ( 82.4% )	0 ( 0.0% )	0 ( 0.0% )	132 ( 17.2% )

Table 43. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA	17.56%	17.17%	n/a	14.07

Continuous Data Distribution

Figure 29. Distribution of values in the column.

Outliers

Figure 30. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 31. Visualization of completeness of the data in the column.

Uniqueness

Figure 32. Visualization of uniqueness of the data in the column.

DWVA_Cat

Table 44. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA_Cat
Description	Attribute referring to column DWVA. It is given if different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
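
The category thresholds in the description can be written as a small mapping from genome copies per bee to a category code. This is a sketch only: it assumes that the boundary values 10^4 and 10^7 fall into M and that an absent or non-positive estimate maps to N. The value ND, which also occurs in the column, is not defined in the description and is therefore not handled.

```python
LOW_UPPER_BOUND = 10**4   # below this: L
HIGH_LOWER_BOUND = 10**7  # above this: H


def load_category(copies_per_bee):
    """Map an estimated number of genome copies per bee to N, L, M or H."""
    if copies_per_bee is None or copies_per_bee <= 0:
        return "N"  # assumption: no estimate means no detected load
    if copies_per_bee < LOW_UPPER_BOUND:
        return "L"
    if copies_per_bee <= HIGH_LOWER_BOUND:
        return "M"  # assumption: both boundary values belong to M
    return "H"


for copies in (None, 500, 10**4, 5 * 10**6, 10**8):
    print(copies, load_category(copies))
```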

Table 45. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA_Cat	1 - 2	n/a	H	n/a	n/a	n/a	ND	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )

Table 46. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA_Cat	100.00%	0.65%	N	H

Data Distribution Top 20

Figure 33. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 34. Visualization of completeness of the data in the column.

Uniqueness

Figure 35. Visualization of uniqueness of the data in the column.

DWVA_Notes

Table 47. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA_Notes
Description	Annotations referring to column DWVA. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 48. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	135 ( 17.6% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 49. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA_Notes	82.44%	0.26%	negative	n/a
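
The missing counts reported for DWVA (634) and DWVA_Notes (135) add up to the 769 records, which suggests that a note is present exactly where the Cq value is absent. A minimal sketch for checking this on a pair of columns follows; the function name and example values are illustrative.

```python
def notes_complement_values(values, notes):
    """True if every record has either a value or a note, but never both or neither."""
    if len(values) != len(notes):
        raise ValueError("columns must have the same number of records")
    return all((v is None) != (n is None) for v, n in zip(values, notes))


# Illustrative records: a Cq value where detected, otherwise the note "negative"
cq_values = [30.5, None, 14.07, None]
cq_notes = [None, "negative", None, "negative"]
print(notes_complement_values(cq_values, cq_notes))  # True
```

If the check fails on the real data, the affected records have either both a Cq value and a note, or neither.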

Data Distribution Top 20

Figure 36. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 37. Visualization of completeness of the data in the column.

Uniqueness

Figure 38. Visualization of uniqueness of the data in the column.

DWVB

Table 50. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB
Description	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus B (DWV B).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 51. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB	4 - 5	21.900	5.95	15.2325	23.335	27.98	37.94	769	21 ( 2.7% )	0 ( 0.0% )	0 ( 0.0% )	655 ( 85.2% )
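
Fractional quartiles such as Q1 = 15.2325 suggest that the summary statistics are interpolated between neighbouring data points. The sketch below computes the same five figures with linear interpolation on the non-missing values. The interpolation method is an assumption and the report does not state which one was used.

```python
import statistics


def five_number_summary(values):
    """Min, Q1, median, Q3 and max of the non-missing values (linear interpolation)."""
    present = sorted(v for v in values if v is not None)
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {"min": present[0], "q1": q1, "median": median, "q3": q3, "max": present[-1]}


# Small illustrative column with one missing record
print(five_number_summary([4, None, 1, 3, 2]))
```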

Table 52. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB	97.27%	85.18%	n/a	32.65

Continuous Data Distribution

Figure 39. Distribution of values in the column.

Outliers

Figure 40. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 41. Visualization of completeness of the data in the column.

Uniqueness

Figure 42. Visualization of uniqueness of the data in the column.

DWVB_Cat

Table 53. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB_Cat
Description	Attribute referring to column DWVB. It is given if different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
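
A plasmid dilution series of the kind mentioned in the description is commonly used as a standard curve, in which Cq falls linearly with the log10 of the template amount. The sketch below fits such a curve and inverts it to estimate copies from a Cq value. The dilution series, slope and intercept are hypothetical; the report does not provide the calibration data used in the B-GOOD studies.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the number of template copies."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical dilution series: 10^3 to 10^7 copies and their measured Cq values
log10_dilutions = [3, 4, 5, 6, 7]
measured_cq = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(log10_dilutions, measured_cq)
print(round(slope, 2), round(intercept, 2))
print(f"{copies_from_cq(23.4, slope, intercept):.3g}")
```

An estimate obtained in this way still needs to be scaled to copies per bee before it can be compared with the L, M and H thresholds; that scaling depends on the extraction protocol and is not covered here.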

Table 54. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 55. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB_Cat	100.00%	0.52%	H	N

Data Distribution Top 20

Figure 43. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 44. Visualization of completeness of the data in the column.

Uniqueness

Figure 45. Visualization of uniqueness of the data in the column.

DWVB_Notes

Table 56. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB_Notes
Description	Annotations referring to column DWVB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 57. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	748 ( 97.3% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 58. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB_Notes	2.73%	0.26%	n/a	negative

Data Distribution Top 20

Figure 46. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 47. Visualization of completeness of the data in the column.

Uniqueness

Figure 48. Visualization of uniqueness of the data in the column.

ABPV

Table 59. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV
Description	The Cq-value (Ct value) for the infection load with the Acute Bee Paralysis Virus (ABPV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 60. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV	5 - 5	34.521	13.17	33.22	35.255	37.0275	40.00	769	433 ( 56.3% )	0 ( 0.0% )	0 ( 0.0% )	288 ( 37.5% )

Table 61. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV	43.69%	37.45%	n/a	38.72
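
The completeness and uniqueness percentages can be cross-checked against the counts in the content analysis table. The sketch below assumes that completeness is the share of non-NULL records and that uniqueness is the number of distinct values (NULL counted as one value, as stated in the caption of Table 60) divided by the number of records. These formulas are inferred from the reported figures and are not stated in the report itself; the function names are illustrative.

```python
def completeness(total: int, missing: int) -> float:
    """Percentage of records that are not NULL (assumed definition)."""
    return 100.0 * (total - missing) / total


def uniqueness(total: int, distinct: int) -> float:
    """Percentage of distinct values, NULL counted as one value (assumed definition)."""
    return 100.0 * distinct / total


# Counts for column ABPV, taken from Table 60.
abpv_completeness = round(completeness(total=769, missing=433), 2)
abpv_uniqueness = round(uniqueness(total=769, distinct=288), 2)

print(abpv_completeness, abpv_uniqueness)
```

Under these assumptions the results agree with the 43.69% and 37.45% reported in Table 61.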

Continuous Data Distribution

Figure 49. Distribution of values in the column.

Outliers

Figure 50. Visualization of median, min, max, and outliers in the column.
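
The report does not state which rule the outlier figure uses. As a minimal sketch, assuming the common Tukey convention (values further than 1.5 interquartile ranges beyond Q1 or Q3 count as outliers), the fences for column ABPV can be derived from the quartiles in Table 60. The function name and the factor 1.5 are assumptions.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple:
    """Return (lower, upper) outlier fences from the quartiles.

    k = 1.5 is the conventional Tukey factor, assumed here;
    the report does not name the rule behind its outlier figure.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles for column ABPV, taken from Table 60.
lower, upper = tukey_fences(q1=33.22, q3=37.0275)

# Reported minimum and maximum of the column.
min_is_outlier = 13.17 < lower
max_is_outlier = 40.00 > upper
print(lower, upper, min_is_outlier, max_is_outlier)
```

With this rule, the reported minimum of 13.17 falls below the lower fence, while the maximum of 40.00 lies within the upper fence.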

Completeness

Figure 51. Visualization of completeness of the data in the column.

Uniqueness

Figure 52. Visualization of uniqueness of the data in the column.

ABPV_Cat

Table 62. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV_Cat
Description	Category attribute for column ABPV, assigned either when dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or when no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses, L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
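
The category thresholds in the description can be expressed as a small lookup. This is a sketch, not the procedure used by the data provider: the function name is hypothetical, the treatment of the exact boundary values (10^4 and 10^7 are placed in M) is an assumption, and so is the mapping of a missing or non-positive copy number to N.

```python
from typing import Optional

LOW_LIMIT = 10**4   # below this: L (from the column description)
HIGH_LIMIT = 10**7  # above this: H (from the column description)


def viral_load_category(copies_per_bee: Optional[float]) -> str:
    """Map genome copies per bee to L, M, H or N."""
    # Assumption: no detectable copies corresponds to N (None).
    if copies_per_bee is None or copies_per_bee <= 0:
        return "N"
    if copies_per_bee < LOW_LIMIT:
        return "L"
    # Assumption: both boundary values belong to M.
    if copies_per_bee <= HIGH_LIMIT:
        return "M"
    return "H"


print([viral_load_category(c) for c in (None, 500, 10**4, 10**7, 2 * 10**7)])
```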

Table 63. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	769	265 ( 34.5% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )

Table 64. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV_Cat	65.54%	0.65%	n/a	H

Data Distribution Top 20

Figure 53. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 54. Visualization of completeness of the data in the column.

Uniqueness

Figure 55. Visualization of uniqueness of the data in the column.

ABPV_Notes

Table 65. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV_Notes
Description	Annotations referring to column ABPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
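
The number of records with a note (433) equals the number of records without a Cq value in column ABPV, which is consistent with the note explaining why a Cq value is absent. The report does not confirm this row by row, so the sketch below is an assumption about how the two columns could be combined; the function name and the labels "detected" and "unknown" are hypothetical.

```python
from typing import Optional


def pathogen_status(cq: Optional[float], note: Optional[str]) -> str:
    """Combine a Cq value and its annotation into a single status label."""
    if cq is not None:
        return "detected"  # hypothetical label: a Cq value was reported
    if note is None or note.strip() == "":
        return "unknown"   # hypothetical label: neither Cq nor note present
    # Otherwise pass the annotation through, e.g. "negative" or "not available".
    return note.strip().lower()


print(pathogen_status(34.52, None))
print(pathogen_status(None, "negative"))
print(pathogen_status(None, "not available"))
```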

Table 66. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	336 ( 43.7% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 67. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV_Notes	56.31%	0.39%	n/a	negative

Data Distribution Top 20

Figure 56. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 57. Visualization of completeness of the data in the column.

Uniqueness

Figure 58. Visualization of uniqueness of the data in the column.

CBPV

Table 68. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV
Description	The Cq-value (Ct value) for the infection load with the Chronic Bee Paralysis Virus (CBPV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, threshold cycle (Ct), crossing point (Cp), or take-off point (Top) is used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. The term quantification cycle (Cq) was proposed in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 69. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV	5 - 5	35.681	15.26	34.185	36.26	38.1125	40.00	769	493 ( 64.1% )	0 ( 0.0% )	0 ( 0.0% )	212 ( 27.6% )

Table 70. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV	35.89%	27.57%	n/a	31.85

Continuous Data Distribution

Figure 59. Distribution of values in the column.

Outliers

Figure 60. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 61. Visualization of completeness of the data in the column.

Uniqueness

Figure 62. Visualization of uniqueness of the data in the column.

CBPV_Cat

Table 71. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV_Cat
Description	Category attribute for column CBPV, assigned either when dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or when no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses, L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 72. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	769	265 ( 34.5% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.7% )

Table 73. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV_Cat	65.54%	0.65%	n/a	H

Data Distribution Top 20

Figure 63. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 64. Visualization of completeness of the data in the column.

Uniqueness

Figure 65. Visualization of uniqueness of the data in the column.

CBPV_Notes

Table 74. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV_Notes
Description	Annotations referring to column CBPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 75. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV_Notes	0 - 13	n/a	>40,00	n/a	n/a	n/a	not availabl…	769	276 ( 35.9% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 76. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV_Notes	64.11%	0.52%	n/a	>40,00
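
Besides the annotations listed in the column description, Tables 75 and 76 show the value ">40,00", written with a decimal comma. Its meaning is not documented; reading it as a censored Cq value (above 40 cycles) is an assumption. A minimal sketch of how such an entry could be separated into a comparison operator and a number, with a hypothetical function name:

```python
import re
from typing import Optional, Tuple

# Matches entries such as ">40,00" or "< 35.5" (decimal comma or point).
_CENSORED = re.compile(r"^\s*([<>])\s*(\d+(?:[.,]\d+)?)\s*$")


def parse_censored_note(note: str) -> Optional[Tuple[str, float]]:
    """Return (operator, value) for a censored entry, or None for other notes."""
    match = _CENSORED.match(note)
    if match is None:
        return None
    operator, number = match.groups()
    # Normalise the decimal comma before converting to float.
    return operator, float(number.replace(",", "."))


print(parse_censored_note(">40,00"))
print(parse_censored_note("negative"))
```

Annotations such as "negative" or "not available" are left untouched and return None.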

Data Distribution Top 20

Figure 66. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 67. Visualization of completeness of the data in the column.

Uniqueness

Figure 68. Visualization of uniqueness of the data in the column.

BQCV

Table 77. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV
Description	The Cq-value (Ct value) for the infection load with the Black Queen Cell Virus (BQCV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, threshold cycle (Ct), crossing point (Cp), or take-off point (Top) is used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. The term quantification cycle (Cq) was proposed in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 78. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV	4 - 5	22.912	8.67	21.015	23.49	25.425	36.18	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	588 ( 76.5% )

Table 79. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV	100.00%	76.46%	25.23	11.49

Continuous Data Distribution

Figure 69. Distribution of values in the column.

Outliers

Figure 70. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 71. Visualization of completeness of the data in the column.

Uniqueness

Figure 72. Visualization of uniqueness of the data in the column.

BQCV_Cat

Table 80. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV_Cat
Description	Category attribute for column BQCV, assigned either when dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or when no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses, L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 81. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	M	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 82. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV_Cat	100.00%	0.39%	H	L

Data Distribution Top 20

Figure 73. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 74. Visualization of completeness of the data in the column.

Uniqueness

Figure 75. Visualization of uniqueness of the data in the column.

BQCV_Notes

Table 83. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV_Notes
Description	Annotations referring to column BQCV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 84. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV_Notes	0 - 0	n/a		n/a	n/a	n/a		769	769 ( 100.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )

Table 85. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV_Notes	0.00%	0.13%	n/a	n/a

Data Distribution Top 20

Figure 76. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 77. Visualization of completeness of the data in the column.

Uniqueness

Figure 78. Visualization of uniqueness of the data in the column.

SBV

Table 86. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV
Description	The Cq-value (Ct value) for the infection load with the Sacbrood Virus (SBV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, threshold cycle (Ct), crossing point (Cp), or take-off point (Top) is used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. The term quantification cycle (Cq) was proposed in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 87. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV	1 - 5	24.334	0	21.07	26.09	28.96	37.92	769	2 ( 0.3% )	1 ( 0.1% )	0 ( 0.0% )	654 ( 85.0% )

Table 88. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV	99.74%	85.05%	26.1	28.81

Continuous Data Distribution

Figure 79. Distribution of values in the column.

Outliers

Figure 80. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 81. Visualization of completeness of the data in the column.

Uniqueness

Figure 82. Visualization of uniqueness of the data in the column.

SBV_Cat

Table 89. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV_Cat
Description	Category attribute for column SBV, assigned either when dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or when no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses, L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 90. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 91. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV_Cat	100.00%	0.52%	M	N

Data Distribution Top 20

Figure 83. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 84. Visualization of completeness of the data in the column.

Uniqueness

Figure 85. Visualization of uniqueness of the data in the column.

SBV_Notes

Table 92. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV_Notes
Description	Annotations referring to column SBV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 93. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	767 ( 99.7% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 94. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV_Notes	0.26%	0.26%	n/a	negative

Data Distribution Top 20

Figure 86. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 87. Visualization of completeness of the data in the column.

Uniqueness

Figure 88. Visualization of uniqueness of the data in the column.

EFB

Table 95. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB
Description	The Cq-value (Ct value) for the infection load with the causative agent of European Foulbrood of honey bees (EFB), Melissococcus plutonius.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, threshold cycle (Ct), crossing point (Cp), or take-off point (Top) is used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. The term quantification cycle (Cq) was proposed in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 96. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB	4 - 6	31.1580	19.15	25.1	33.675	35.3	37.17	769	746 ( 97.0% )	0 ( 0.0% )	0 ( 0.0% )	24 ( 3.1% )

Table 97. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB	2.99%	3.12%	n/a	35.07

Data Distribution Top 20

Figure 89. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 90. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 91. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 92. Visualization of completeness of the data in the column.

Uniqueness

Figure 93. Visualization of uniqueness of the data in the column.

EFB_Cat

Table 98. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB_Cat
Description	Category attribute for column EFB, assigned either when dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or when no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the genome copy numbers given for the virus categories also apply to this column.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 99. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB_Cat	0 - 2	n/a	H	n/a	n/a	n/a	ND	769	485 ( 63.1% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 0.8% )

Table 100. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB_Cat	36.93%	0.78%	n/a	H

Data Distribution Top 20

Figure 94. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 95. Visualization of completeness of the data in the column.

Uniqueness

Figure 96. Visualization of uniqueness of the data in the column.

EFB_Notes

Table 101. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB_Notes
Description	Annotations referring to column EFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 102. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	23 ( 3.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 103. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB_Notes	97.01%	0.39%	not available	n/a

Data Distribution Top 20

Figure 97. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 98. Visualization of completeness of the data in the column.

Uniqueness

Figure 99. Visualization of uniqueness of the data in the column.

AFB

Table 104. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB
Description	The Cq-value (Ct value) for the infection load with the causative agent of American Foulbrood (AFB), Paenibacillus larvae.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.
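
Because Cq is a cycle count, a lower value indicates more target template in the sample. As a rough guide, a difference in Cq can be converted to an approximate fold difference in starting template. The Python sketch below assumes an ideal amplification in which the target doubles every cycle; this is an assumption and is not stated by the data provider. The function name and the efficiency parameter are illustrative.

```python
def fold_difference(cq_low: float, cq_high: float, efficiency: float = 1.0) -> float:
    """Approximate ratio of starting template between two samples.

    efficiency = 1.0 means perfect doubling per cycle (assumed);
    real assays are usually somewhat less efficient.
    """
    return (1.0 + efficiency) ** (cq_high - cq_low)


# A sample at Cq 34 versus one at Cq 37: about 8 times more template
# under the perfect-doubling assumption.
print(fold_difference(34.0, 37.0))  # 8.0
```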

Table 105. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB	2 - 6	37.5437	33.94	36.78	37.37	38.49	40	769	738 ( 96.0% )	0 ( 0.0% )	0 ( 0.0% )	29 ( 3.8% )

Table 106. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB	4.03%	3.77%	n/a	33.94

Continuous Data Distribution

Figure 100. Distribution of values in the column.

Outliers

Figure 101. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 102. Visualization of completeness of the data in the column.

Uniqueness

Figure 103. Visualization of uniqueness of the data in the column.

AFB_Cat

Table 107. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB_Cat
Description	Attribute referring to column AFB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified if the number of genome copies for the different categories also refers to this column.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 108. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB_Cat	0 - 2	n/a	L	n/a	n/a	n/a	ND	769	485 ( 63.1% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 109. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB_Cat	36.93%	0.52%	n/a	ND

Data Distribution Top 20

Figure 104. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 105. Visualization of completeness of the data in the column.

Uniqueness

Figure 106. Visualization of uniqueness of the data in the column.

AFB_Notes

Table 110. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB_Notes
Description	Annotations referring to column AFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 111. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	769	516 ( 67.1% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 112. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB_Notes	32.90%	0.26%	n/a	negative

Data Distribution Top 20

Figure 107. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 108. Visualization of completeness of the data in the column.

Uniqueness

Figure 109. Visualization of uniqueness of the data in the column.

NosemaApis

Table 113. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaApis
Description	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema apis.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 114. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaApis	5 - 5	27.780	27.78	n/a	n/a	n/a	27.78	769	768 ( 99.9% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 115. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaApis	0.13%	0.26%	n/a	27.78

Data Distribution Top 20

Figure 110. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 111. Distribution of values in the column.

Completeness

Figure 112. Visualization of completeness of the data in the column.

Uniqueness

Figure 113. Visualization of uniqueness of the data in the column.

NosemaApis_Notes

Table 116. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaApis_Notes
Description	Annotations referring to column NosemaApis. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 117. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaApis_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	1 ( 0.1% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 118. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaApis_Notes	99.87%	0.39%	negative	n/a

Data Distribution Top 20

Figure 114. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 115. Visualization of completeness of the data in the column.

Uniqueness

Figure 116. Visualization of uniqueness of the data in the column.

NosemaCeranae

Table 119. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaCeranae
Description	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema ceranae.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 120. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaCeranae	2 - 6	25.8414	14.26	21.725	24.4	31.025	36.45	769	515 ( 67.0% )	0 ( 0.0% )	0 ( 0.0% )	245 ( 31.9% )

Table 121. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaCeranae	33.03%	31.86%	n/a	15.99

Continuous Data Distribution

Figure 117. Distribution of values in the column.

Outliers

Figure 118. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 119. Visualization of completeness of the data in the column.

Uniqueness

Figure 120. Visualization of uniqueness of the data in the column.

NosemaCeranae_Notes

Table 122. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaCeranae_Notes
Description	Annotations referring to column NosemaCeranae. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 123. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaCeranae_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	254 ( 33.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 124. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaCeranae_Notes	66.97%	0.39%	not available	negative
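
The counts in Tables 120 and 123 suggest that the value column and its notes column complement each other: the 515 records without a Cq value equal the number of records carrying a note (769 minus 254 missing). The Python sketch below checks this from the reported counts only; it cannot confirm that the two sets of records coincide row by row, and the helper name is illustrative.

```python
def notes_cover_missing_values(total: int, value_missing: int, notes_missing: int) -> bool:
    """True if the number of annotated records equals the number of records without a value."""
    notes_present = total - notes_missing
    return notes_present == value_missing


# Reported counts: (Total, Missing in value column, Missing in notes column).
reported = {
    "NosemaCeranae": (769, 515, 254),
    "NosemaApis": (769, 768, 1),
    "AFB": (769, 738, 516),
}

for column, counts in reported.items():
    print(column, notes_cover_missing_values(*counts))
```

For AFB the check fails: 253 records carry a note while 738 lack a value, so at least 485 records have neither, the same number reported as missing for AFB_Cat in Table 108.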

Data Distribution Top 20

Figure 121. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 122. Visualization of completeness of the data in the column.

Uniqueness

Figure 123. Visualization of uniqueness of the data in the column.

VarroaBees

Table 125. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	VarroaBees
Description	Attribute referring to column VarroaInfestation: Y (Yes) if 100 bees or more were sampled; N (No) if fewer than 100 bees were sampled.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 126. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
VarroaBees	1 - 2	n/a	N	n/a	n/a	n/a	Y	769	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 127. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VarroaBees	100.00%	0.39%	Y	ND

Data Distribution Top 20

Figure 124. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 125. Visualization of completeness of the data in the column.

Uniqueness

Figure 126. Visualization of uniqueness of the data in the column.

VarroaInfestation

Table 128. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	VarroaInfestation
Description	Varroa infestation, measured as Varroa infestation rate of adult bees.
Data type	Decimal number
Descriptor	pms:varroaInfestationOfAdultBees [UID:0.0.VRRNF468]
Descriptor description	The quantity infestation rate of adult honey bee colonies with Varroa mites (Varroa destructor), measured by dislodging Varroa mites from adult honey bees, expressed in number of Varroa mites per unit of honey bees.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.VRRNF468
Unit	mites (100 bees)⁻¹
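
The unit, mites per 100 bees, implies that a raw mite count is scaled by the number of bees sampled. A minimal Python sketch, assuming the rate is computed as mites divided by bees times 100 and that the VarroaBees flag follows the 100-bee threshold described for that column; the function names and the example counts are hypothetical and not taken from the dataset.

```python
def infestation_rate(mites: int, bees: int) -> float:
    """Varroa mites per 100 adult bees (assumed formula)."""
    if bees <= 0:
        raise ValueError("number of sampled bees must be positive")
    return 100.0 * mites / bees


def varroa_bees_flag(bees: int) -> str:
    """'Y' if 100 bees or more were sampled, otherwise 'N'."""
    return "Y" if bees >= 100 else "N"


# Hypothetical sample: 6 mites dislodged from 300 bees.
print(infestation_rate(6, 300), varroa_bees_flag(300))  # 2.0 Y
```

Under this formula, rates above 100 are possible whenever more mites than bees are counted in a sample.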

Table 129. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
VarroaInfestation	1 - 11	2.2693610600	0	0	0	1.87802759	176.9230769	769	7 ( 0.9% )	418 ( 54.4% )	0 ( 0.0% )	205 ( 26.7% )

Table 130. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VarroaInfestation	99.09%	26.66%	0	4.226804124

Continuous Data Distribution

Figure 127. Distribution of values in the column.

Completeness

Figure 128. Visualization of completeness of the data in the column.

Uniqueness

Figure 129. Visualization of uniqueness of the data in the column.

Varroa_Notes

Table 131. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Varroa_Notes
Description	Annotations referring to column VarroaInfestation.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 132. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Varroa_Notes	0 - 2	n/a	ND	n/a	n/a	n/a	ND	769	762 ( 99.1% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.3% )

Table 133. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Varroa_Notes	0.91%	0.26%	n/a	ND

Data Distribution Top 20

Figure 130. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 131. Visualization of completeness of the data in the column.

Uniqueness

Figure 132. Visualization of uniqueness of the data in the column.

AFBcfu

Table 134. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFBcfu
Description	Not specified by the data provider. Presumably the number of colony-forming units counted in microbiological assays used to detect the causative agent of American Foulbrood (AFB), Paenibacillus larvae.
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	n/a

Table 135. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFBcfu	1 - 1	0.0	0	0	0	0	0	769	756 ( 98.3% )	13 ( 1.7% )	0 ( 0.0% )	2 ( 0.3% )

Table 136. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFBcfu	1.69%	0.26%	0	0

Data Distribution Top 20

Figure 133. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 134. Distribution of values in the column.

Completeness

Figure 135. Visualization of completeness of the data in the column.

Uniqueness

Figure 136. Visualization of uniqueness of the data in the column.

AFBcfu_Notes

Table 137. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFBcfu_Notes
Description	Annotations referring to column AFBcfu. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 138. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFBcfu_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	unknown	769	13 ( 1.7% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 139. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFBcfu_Notes	98.31%	0.52%	not available	unknown

Data Distribution Top 20

Figure 137. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 138. Visualization of completeness of the data in the column.

Uniqueness

Figure 139. Visualization of uniqueness of the data in the column.

NosemaSpores

Table 140. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaSpores
Description	Number of spores of the causative agents of Nosemosis of honey bees (Nosema apis, Nosema ceranae), expressed in spores per animal.
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	spores animal⁻¹

Table 141. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaSpores	5 - 8	568,959.2	25,000	125,000	225,000	425,000	15,275,000	769	540 ( 70.2% )	0 ( 0.0% )	0 ( 0.0% )	53 ( 6.9% )

Table 142. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaSpores	29.78%	6.89%	150000	9000000

Continuous Data Distribution

Figure 140. Distribution of values in the column.

Outliers

Figure 141. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 142. Visualization of completeness of the data in the column.

Uniqueness

Figure 143. Visualization of uniqueness of the data in the column.

NosemaSpores_Notes

Table 143. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaSpores_Notes
Description	Annotations referring to column NosemaSpores. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file); <25000 (less than 25000 spores per animal).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 144. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaSpores_Notes	0 - 13	n/a		n/a	n/a	n/a	not availabl…	769	229 ( 29.8% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.5% )

Table 145. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaSpores_Notes	70.22%	0.52%	not available	<25000
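As a plausibility check, the annotation counts reported for NosemaSpores_Notes under "Changes made to data" can be reconciled with the missing-value counts in Tables 141 and 144. The sketch below only restates figures from this report; the variable names are hypothetical.

```python
TOTAL_RECORDS = 769

# Record counts per annotation, as listed under "Changes made to data".
notes_counts = {
    "ND": 252,
    "not available": 262,
    "<25000": 26,
}

annotated = sum(notes_counts.values())
missing_in_data_column = 540   # "Missing" for NosemaSpores (Table 141)
missing_in_notes_column = 229  # "Missing" for NosemaSpores_Notes (Table 144)

# Every NULL in NosemaSpores should carry exactly one annotation.
assert annotated == missing_in_data_column
# Every record should hold either a value or an annotation.
assert annotated + missing_in_notes_column == TOTAL_RECORDS
# Three annotations plus NULL give the four distinct values of Table 144.
assert len(notes_counts) + 1 == 4

print("NosemaSpores and NosemaSpores_Notes counts are consistent.")
```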

Data Distribution Top 20

Figure 144. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 145. Visualization of completeness of the data in the column.

Uniqueness

Figure 146. Visualization of uniqueness of the data in the column.

Malpighamoeba

Table 146. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba
Description	The Cq-value (Ct value) for the infection load with the causative agent of amoeba disease of honey bees, Malpighamoeba mellificae.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 147. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba	4 - 11	34.461182790	20.94	31.24	35.2	38.16	40.27	769	738 ( 96.0% )	0 ( 0.0% )	0 ( 0.0% )	32 ( 4.2% )

Table 148. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba	4.03%	4.16%	n/a	37.41

Continuous Data Distribution

Figure 147. Distribution of values in the column.

Outliers

Figure 148. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 149. Visualization of completeness of the data in the column.

Uniqueness

Figure 150. Visualization of uniqueness of the data in the column.

Malpighamoeba_CT

Table 149. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba_CT
Description	Not specified by the data provider.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 150. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba_CT	0 - 1	n/a	N	n/a	n/a	n/a	Y	769	513 ( 66.7% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 151. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba_CT	33.29%	0.39%	n/a	Y

Data Distribution Top 20

Figure 151. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 152. Visualization of completeness of the data in the column.

Uniqueness

Figure 153. Visualization of uniqueness of the data in the column.

Malpighamoeba_Notes

Table 152. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba_Notes
Description	Annotations referring to column Malpighamoeba.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 153. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	769	31 ( 4.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.4% )

Table 154. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba_Notes	95.97%	0.39%	not available	n/a

Data Distribution Top 20

Figure 154. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 155. Visualization of completeness of the data in the column.

Uniqueness

Figure 156. Visualization of uniqueness of the data in the column.

RecordNotes

Table 155. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	RecordNotes
Description	Notes added by the data provider to specific records in the raw data.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 156. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
RecordNotes	0 - 0	n/a		n/a	n/a	n/a		769	769 ( 100.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )

Table 157. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
RecordNotes	0.00%	0.13%	n/a	n/a

Data Distribution Top 20

Figure 157. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 158. Visualization of completeness of the data in the column.

Uniqueness

Figure 159. Visualization of uniqueness of the data in the column.

Changes made to preparatory file

In column partner {UGent} was changed to the more commonly used acronym {UGENT} in 22 records in order to avoid potential problems with automated data analysis.
Column sample ID was renamed SampleID to avoid blank spaces in table headers, which might cause problems in some database systems.
Column DWV A was renamed DWVA to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVA was renamed DWVA_Cat to assure an assignment of unique names to column headers.
Column DWVA_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column DWV B was renamed DWVB to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVB was renamed DWVB_Cat to assure an assignment of unique names to column headers.
Column DWVB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column ABPV was renamed ABPV_Cat to assure an assignment of unique names to column headers.
Column ABPV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column CBPV was renamed CBPV_Cat to assure an assignment of unique names to column headers.
Column CBPV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column BQCV was renamed BQCV_Cat to assure an assignment of unique names to column headers.
Column BQCV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column SBV was renamed SBV_Cat to assure an assignment of unique names to column headers.
Column SBV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column EFB was renamed EFB_Cat to assure an assignment of unique names to column headers.
Column EFB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column AFB was renamed AFB_Cat to assure an assignment of unique names to column headers.
Column AFB_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. apis was renamed NosemaApis to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaApis_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. ceranae was renamed NosemaCeranae to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaCeranae_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column > 100 bees was renamed VarroaBees to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa/100 bees was renamed VarroaInfestation to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column AFB (cfu) was renamed AFBcfu to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column AFBcfu_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column N. spores was renamed NosemaSpores to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column NosemaSpores_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column CT < 36,00 was renamed Malpighamoeba_CT to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Malpighamoeba_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
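The header changes listed above can be sketched in Python as follows. This is an illustration only, not the procedure actually used: the function name clean_headers is hypothetical, and the sketch assumes that each repeated Cat. column directly follows the column it categorises in the raw file, which the report does not state.

```python
# Explicit renames taken from the list above.
RENAMES = {
    "sample ID": "SampleID",
    "DWV A": "DWVA",
    "DWV B": "DWVB",
    "N. apis": "NosemaApis",
    "N. ceranae": "NosemaCeranae",
    "> 100 bees": "VarroaBees",
    "Varroa/100 bees": "VarroaInfestation",
    "AFB (cfu)": "AFBcfu",
    "N. spores": "NosemaSpores",
    "CT < 36,00": "Malpighamoeba_CT",
}


def clean_headers(raw_headers):
    """Return unique headers without blanks or special characters."""
    cleaned = []
    previous = None
    for header in raw_headers:
        if header == "Cat.":
            # Assumption: "Cat." belongs to the column immediately before it.
            name = f"{previous}_Cat"
        else:
            name = RENAMES.get(header, header)
            previous = name
        cleaned.append(name)
    return cleaned


# Hypothetical excerpt of a raw header row.
raw = ["sample ID", "DWV A", "Cat.", "DWV B", "Cat.", "ABPV", "Cat."]
headers = clean_headers(raw)
print(headers)

# The cleaned headers should be unique and free of blank spaces.
assert len(set(headers)) == len(headers)
assert all(" " not in h for h in headers)
```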

Changes made to data

All occurrences of the character {-} in column DWVA_Cat were replaced by {N} (612 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVA, column DWVA_Notes was replaced by {negative} (634 records) and {negative} in column DWVA was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVB_Cat were replaced by {N} (21 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVB, column DWVB_Notes was replaced by {negative} (21 records) and the {negative} in column DWVB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column ABPV_Cat were replaced by {N} (169 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column ABPV, column ABPV_Notes was replaced by {negative} (168 records) and the {negative} in column ABPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column ABPV, column ABPV_Notes was replaced by {not available} (265 records) and the blanks in column ABPV and column ABPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
All occurrences of the character {-} and one occurrence of a blank value in column CBPV_Cat were replaced by {N} (227 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column CBPV, column CBPV_Notes was replaced by {not available} (265 records) and the blanks in column CBPV and column CBPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
In records, which contained the string {negative} in column CBPV, column CBPV_Notes was replaced by {negative} (227 records) and the {negative} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain the value {>40} in column CBPV (1 record), column CBPV_Notes was replaced by {>40} and the {>40} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column SBV_Cat were replaced by {N} (2 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column SBV, column SBV_Notes was replaced by {negative} (2 records) and the {negative} in column SBV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column EFB_Cat were replaced by {N} (239 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column EFB, column EFB_Notes was replaced by {not available} (485 records) and the blanks in column EFB and column EFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column EFB, column EFB_Notes was replaced by {negative} (261 records) and the {negative} in column EFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column AFB, column AFB_Notes was replaced by {negative} (253 records) and the {negative} in column AFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column AFB_Cat were replaced by {N} (232 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contain blank values in column AFB, column AFB_Notes was replaced by {not available} (485 records) and the blanks in column AFB and column AFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaApis, column NosemaApis_Notes was replaced by {not available} (262 records) and the blanks in column NosemaApis were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaApis, column NosemaApis_Notes was replaced by {negative} (506 records) and the {negative} in column NosemaApis was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {not available} (262 records) and the blanks in column NosemaCeranae were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {negative} (253 records) and the {negative} in column NosemaCeranae was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column VarroaInfestation, column Varroa_Notes was replaced by {ND} (7 records) and {ND} in column VarroaInfestation was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column AFBcfu, column AFBcfu_Notes was replaced by {ND} (270 records) and {ND} in column AFBcfu was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column AFBcfu, column AFBcfu_Notes was replaced by {not available} (485 records) and the blanks in column AFBcfu were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In 1 record, in which column AFBcfu contained the special character {-}, where dataset = {B-GOOD Pilot A results 2020 for BEEP V2 } and SampleID = {UTYBTKUM}, column AFBcfu was replaced by {NULL} and column AFBcfu_Notes was replaced by {unknown} to avoid having special characters in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column NosemaSpores, column NosemaSpores_Notes was replaced by {ND} (252 records) and {ND} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaSpores, column NosemaSpores_Notes was replaced by {not available} (262 records) and the blanks in column NosemaSpores were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {< 25000; <25000} in column NosemaSpores, column NosemaSpores_Notes was replaced by {<25000} (26 records) and {< 25000; <25000} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {negative} (225 records) and the {negative} in column Malpighamoeba was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained no values in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {not available} (513 records) and the blanks in column Malpighamoeba were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
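The pattern applied throughout the list above (text or blank in a numeric column is replaced by NULL and recorded in the matching _Notes column) can be sketched as follows. This is a minimal illustration, not the procedure actually used; the function name split_value_and_note and the sample cells are hypothetical, and decimal commas in the raw files are not handled.

```python
def split_value_and_note(raw, annotations):
    """Return (value, note) for one raw cell.

    Blanks and known text annotations are moved out of the numeric
    column: the value becomes None (NULL) and the note records why.
    """
    text = "" if raw is None else str(raw).strip()
    if text == "":
        return None, "not available"
    # Remove inner blanks so that "< 25000" and "<25000" share one note.
    compact = text.replace(" ", "")
    if compact in annotations:
        return None, compact
    return float(compact), None


# Annotations reported for column NosemaSpores.
NOSEMA_SPORES_ANNOTATIONS = {"ND", "<25000"}

# Hypothetical raw cells.
raw_cells = ["150000", "ND", "", "< 25000", "<25000"]
cleaned = [split_value_and_note(c, NOSEMA_SPORES_ANNOTATIONS) for c in raw_cells]

for raw, (value, note) in zip(raw_cells, cleaned):
    print(repr(raw), "->", value, note)
```

Keeping the annotation in a separate column lets the data column be typed as a number while the reason for each NULL remains traceable.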

Unresolved issues

Overall, the B-GOOD datasets ingested into this EUPH dataset lack sufficient metadata, and a range of other issues limits compliance with the FAIR principles.
In general, columns are not sufficiently well described (e.g. it is unclear which information is contained in columns Malpighamoeba_CT and Malpighamoeba_Notes related to column Malpighamoeba; it is unclear whether the definitions provided for the attributes L, M, H - number of genome copies per bee -, which are used to define the dilution of DNA plasmids, refer only to viral pathogens or to all pathogens). The provider should provide all information necessary to allow reuse of the data within the dataset.
For some columns no units are provided (e.g. AFB cfu), for other columns, the unit in which data is expressed is not explicitly stated and can only be assumed based on exclusion. The provider should explicitly state the units in columns containing data in order to avoid misunderstandings.
Some of the attributes used in the dataset are not explained (e.g. ND). The provider should define the meaning of all attributes used in the dataset.
Data comes in Microsoft Excel files, which occasionally contain nested comments or uncommented annotations (e.g. different background colour of cells) in single cells, which makes storage in relational databases difficult and automated processing and analysis impossible.
The table structure does not facilitate data standardisation, as standardisation would require all values measured with the same method to be stored in one single column and transformed to the same unit.
The significance of the string {ND} is unclear:

In columns DWV_Cat, EFB_Cat, AFB_Cat (22 records), where column dataset = {B-GOOD Pilot B results 2020 for WR V3} and SampleID = {BKCSYJMR; JKSXXKSC; LSPSHGZL; MDHLMFYT; NMYRPHNJ; YUXRGDZR; ANYXAYUZ; GNBMNLPM; HFXGDUAU; YPCTFUUU; LJSUAPFC; ZCUPCPFF; KNBULSMH; DUSFMRXB; JTYTGYDP; HXNGCDSH; KSNLKPRT; DRAZTCGC; LYZHDDGB; DXSPRLUC; CYPCTUGX; MJUYMLSH};
In column VarroaBees and Varroa_Notes (7 records), where column dataset = {B-GOOD Pilot A results 2020 for BEEP V2} and SampleID = {CKTSXSMR; CYXUCRUN; FBYMAGCT; HCHPHKFL; RRBYHJBU; STZPSJHR; UAKMCLMN};
In column NosemaSpores_Notes (252 records);

The significance of the special character {-} in column AFBcfu, where dataset = {B-GOOD Pilot A results 2020 for BEEP V2 } and SampleID = {UTYBTKUM}, is unclear. Column AFBcfu_Notes was replaced by {unknown} until the issue is resolved.
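The standardisation issue noted above (one column per pathogen instead of one value column) could in principle be addressed by reshaping the table from a wide to a long layout. The following Python sketch shows one possible approach and is not part of the dataset processing described in this report; the function name to_long, the output column names and the sample record are hypothetical, while the units are those given in the column metadata.

```python
def to_long(records, id_columns, measurements):
    """Reshape wide records into one row per sample and parameter."""
    long_rows = []
    for record in records:
        for parameter, unit in measurements.items():
            row = {key: record[key] for key in id_columns}
            row.update({
                "Parameter": parameter,
                "Value": record.get(parameter),
                "Unit": unit,
                "Notes": record.get(f"{parameter}_Notes"),
            })
            long_rows.append(row)
    return long_rows


# Units as stated in the standardised metadata of each column.
MEASUREMENTS = {
    "Malpighamoeba": "no.",
    "NosemaSpores": "spores animal-1",
}

# Hypothetical wide record.
wide = [{
    "SampleID": "EXAMPLE01",
    "Malpighamoeba": 35.2,
    "Malpighamoeba_Notes": None,
    "NosemaSpores": None,
    "NosemaSpores_Notes": "ND",
}]

for row in to_long(wide, ["SampleID"], MEASUREMENTS):
    print(row)
```

In such a layout, values obtained with the same method share a single Value column, and the unit is stated explicitly in every row.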

Field study

dataset

Table 158. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	dataset
Description	Name of the dataset on the Bee Health Data Portal from which the data was obtained.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 159. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
dataset	37 - 41	n/a	Tier2 Field…	n/a	n/a	n/a	Tier3 Field…	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	8 ( 0.8% )

Table 160. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
dataset	100.00%	0.78%	Tier3 Field study B results 2022 for BEEP	Tier2 Field study results 2021 for WR

Data Distribution Top 20

Figure 160. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 161. Visualization of completeness of the data in the column.

Uniqueness

Figure 162. Visualization of uniqueness of the data in the column.

StudyTierLevel

Table 161. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	StudyTierLevel
Description	Tier level of the study.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 162. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
StudyTierLevel	6 - 6	n/a	Tier 2	n/a	n/a	n/a	Tier 3	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 163. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
StudyTierLevel	100.00%	0.19%	Tier 2	Tier 3

Data Distribution Top 20

Figure 163. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 164. Visualization of completeness of the data in the column.

Uniqueness

Figure 165. Visualization of uniqueness of the data in the column.

StudyName

Table 164. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	StudyName
Description	Name of the study.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 165. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
StudyName	13 - 13	n/a	Field study…	n/a	n/a	n/a	Field study…	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 166. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
StudyName	100.00%	0.19%	Field study A	Field study B

Data Distribution Top 20

Figure 166. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 167. Visualization of completeness of the data in the column.

Uniqueness

Figure 168. Visualization of uniqueness of the data in the column.

year

Table 167. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	year
Description	Calendar year in which the data was acquired.
Data type	Integer number
Descriptor	dwc:year [UID:0.0.YEARA340]
Descriptor description	A term from the Darwin Core standard: The four-digit year in which the dwc:Event occurred, according to the Common Era Calendar.
IRI	http://rs.tdwg.org/dwc/terms/year
Unit	year

Table 168. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
year	4 - 4	2021.7	2021	2021	2022	2022	2022	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 169. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
year	100.00%	0.19%	2022	2021

Data Distribution Top 20

Figure 169. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 170. Distribution of values in the column.

Outliers

Figure 171. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 172. Visualization of completeness of the data in the column.

Uniqueness

Figure 173. Visualization of uniqueness of the data in the column.

organisation

Table 170. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	organisation
Description	Not specified by the data provider. Organisation appearing in the name of the dataset.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 171. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
organisation	4 - 4	n/a	BEEP	n/a	n/a	n/a	BEEP	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1 ( 0.1% )

Table 172. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
organisation	100.00%	0.10%	BEEP	BEEP

Data Distribution Top 20

Figure 174. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 175. Visualization of completeness of the data in the column.

Uniqueness

Figure 176. Visualization of uniqueness of the data in the column.

V

Table 173. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	V
Description	Not specified by the data provider. V number appearing in the name of the dataset.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 174. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
V	0 - 2	n/a	V2	n/a	n/a	n/a	V3	1,029	418 ( 40.6% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 175. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
V	59.38%	0.29%	n/a	V2
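
The completeness and uniqueness percentages in the quality tables appear to follow directly from the counts in the content analysis tables. The sketch below reproduces the figures for column V. The formulas are inferred from the reported numbers, not stated by the report, and the function names are hypothetical.

```python
def completeness(total: int, missing: int) -> float:
    """Share of non-missing records, in percent (assumed definition)."""
    return 100.0 * (total - missing) / total


def uniqueness(total: int, distinct: int) -> float:
    """Share of distinct values among all records, in percent (assumed definition)."""
    return 100.0 * distinct / total


# Counts taken from the content analysis of column V:
# 1,029 records, 418 missing, 3 distinct values (including NULL).
print(f"{completeness(1029, 418):.2f}%")  # 59.38%
print(f"{uniqueness(1029, 3):.2f}%")      # 0.29%
```

The same arithmetic matches the other columns in this annex, for example DWVA (728 missing, 270 distinct) gives 29.25% and 26.24%.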

Data Distribution Top 20

Figure 177. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 178. Visualization of completeness of the data in the column.

Uniqueness

Figure 179. Visualization of uniqueness of the data in the column.

SampleID

Table 176. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SampleID
Description	Unique identifier of the sample.
Data type	String
Descriptor	dwc:materialSampleID [UID:0.0.MTRLS489]
Descriptor description	A term from the Darwin Core standard: An identifier for the dwc:MaterialSample (as opposed to a particular digital record of the dwc:MaterialSample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the dwc:materialSampleID globally unique.
IRI	http://rs.tdwg.org/dwc/terms/materialSampleID
Unit	n/a

Table 177. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SampleID	1 - 14	n/a	11D2	n/a	n/a	n/a	yhgpjdya	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	1,025 ( 99.6% )

Table 178. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SampleID	100.00%	99.61%	GB_1	_467
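
Although SampleID is described as a unique identifier, 1,025 distinct values among 1,029 records imply that a few identifiers occur more than once. A minimal sketch for listing such repeats is given below. The function name and the sample list are hypothetical; only GB_1 is taken from the table above, where it is reported as the most common value.

```python
from collections import Counter


def duplicated_ids(sample_ids):
    """Return a dict of identifiers that occur more than once, with their counts."""
    counts = Counter(sample_ids)
    return {sid: n for sid, n in counts.items() if n > 1}


# Made-up example list; only "GB_1" is a value reported in the table above.
ids = ["GB_1", "A_12", "GB_1", "B_07"]
print(duplicated_ids(ids))  # {'GB_1': 2}
```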

Completeness

Figure 180. Visualization of completeness of the data in the column.

Uniqueness

Figure 181. Visualization of uniqueness of the data in the column.

partner

Table 179. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	partner
Description	Not specified by the data provider. Presumably the name of the consortium partner who provided the data.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 180. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
partner	2 - 13	n/a	BSOUR	n/a	n/a	n/a	WR	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	16 ( 1.6% )

Table 181. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
partner	100.00%	1.55%	T3Netherlands	T3 Portugal

Data Distribution Top 20

Figure 182. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 183. Visualization of completeness of the data in the column.

Uniqueness

Figure 184. Visualization of uniqueness of the data in the column.

season

Table 182. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	season
Description	Not specified by the data provider. Presumably the season in which the sample was collected.
Data type	String
Descriptor	pms:season [UID:0.0.SSONA466]
Descriptor description	[...] any of the four arbitrary divisions of the year, characterized chiefly by differences in temperature, precipitation, amount of daylight, and plant growth [...]
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.SSONA466
Unit	n/a

Table 183. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 184. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
season	100.00%	0.29%	spring	autumn

Data Distribution Top 20

Figure 185. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 186. Visualization of completeness of the data in the column.

Uniqueness

Figure 187. Visualization of uniqueness of the data in the column.

DWVA

Table 185. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA
Description	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus A (DWV-A).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.
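
Because Cq counts amplification cycles, a lower value indicates more target in the sample. As a rough illustration, the sketch below converts a Cq difference into an approximate fold difference. It assumes ideal amplification (the target doubles every cycle) and that both samples were run with the same assay and threshold. None of this is stated by the data provider, and the function name and `efficiency` parameter are hypothetical.

```python
def fold_difference(cq_a: float, cq_b: float, efficiency: float = 1.0) -> float:
    """Approximate ratio of target amount in sample A relative to sample B.

    efficiency=1.0 is the idealised case of perfect doubling per cycle
    (an assumption, not a property reported for this dataset).
    """
    return (1.0 + efficiency) ** (cq_b - cq_a)


# A sample that crosses the threshold 3 cycles earlier holds about 8 times more target.
print(fold_difference(25.0, 28.0))  # 8.0
```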

Table 186. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA	4 - 5	29.282	6.84	26.225	31.2	35.315	38.78	1,029	728 ( 70.7% )	0 ( 0.0% )	0 ( 0.0% )	270 ( 26.2% )

Table 187. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA	29.25%	26.24%	n/a	19.93

Continuous Data Distribution

Figure 188. Distribution of values in the column.

Outliers

Figure 189. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 190. Visualization of completeness of the data in the column.

Uniqueness

Figure 191. Visualization of uniqueness of the data in the column.

DWVA_Cat

Table 188. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA_Cat
Description	Attribute referring to column DWVA, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
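
The category thresholds given in the description can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, the description does not say which category the boundary values 10^4 and 10^7 belong to (here both are assigned to M), and treating a missing or non-positive copy number as N is an assumption.

```python
from typing import Optional

LOW_LIMIT = 1e4   # below this: L (Low)
HIGH_LIMIT = 1e7  # above this: H (High)


def load_category(copies_per_bee: Optional[float]) -> str:
    """Map genome copies per bee to L, M, H or N (boundary handling is assumed)."""
    if copies_per_bee is None or copies_per_bee <= 0:
        return "N"  # assumption: nothing detected
    if copies_per_bee < LOW_LIMIT:
        return "L"
    if copies_per_bee <= HIGH_LIMIT:
        return "M"  # assumption: both limits are included in M
    return "H"


print(load_category(5e3))   # L
print(load_category(2e5))   # M
print(load_category(3e8))   # H
print(load_category(None))  # N
```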

Table 189. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 190. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA_Cat	100.00%	0.39%	N	H

Data Distribution Top 20

Figure 192. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 193. Visualization of completeness of the data in the column.

Uniqueness

Figure 194. Visualization of uniqueness of the data in the column.

DWVA_Notes

Table 191. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVA_Notes
Description	Annotations referring to column DWVA. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 192. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVA_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	301 ( 29.3% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 193. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVA_Notes	70.75%	0.19%	negative	n/a
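
The counts suggest that DWVA and DWVA_Notes complement each other: DWVA has 728 missing values, and DWVA_Notes has 728 non-missing values (1,029 minus 301). A minimal sketch of a row-level check for this pattern follows. It assumes that every record should carry either a Cq value or a note, which is an inference from the counts and not a rule stated in the report. The function name and the example rows are hypothetical.

```python
def inconsistent_rows(rows, value_col="DWVA", notes_col="DWVA_Notes"):
    """Return indices of rows where value and note are both present or both absent."""
    bad = []
    for i, row in enumerate(rows):
        has_value = row.get(value_col) is not None
        has_note = bool(row.get(notes_col))
        if has_value == has_note:
            bad.append(i)
    return bad


# Made-up example rows for illustration.
rows = [
    {"DWVA": 31.2, "DWVA_Notes": None},        # measured, no note: consistent
    {"DWVA": None, "DWVA_Notes": "negative"},  # not measured, annotated: consistent
    {"DWVA": None, "DWVA_Notes": None},        # neither value nor note: flagged
]
print(inconsistent_rows(rows))  # [2]
```

The same check applies to the DWVB pair by passing `value_col="DWVB"` and `notes_col="DWVB_Notes"`.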

Data Distribution Top 20

Figure 195. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 196. Visualization of completeness of the data in the column.

Uniqueness

Figure 197. Visualization of uniqueness of the data in the column.

DWVB

Table 194. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB
Description	The Cq-value (Ct value) for the infection load with the Deformed Wing Virus B (DWV-B).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 195. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB	1 - 11	23.782359580	5.5	17.26	25.7	30.68	38.45	1,029	26 ( 2.5% )	0 ( 0.0% )	0 ( 0.0% )	835 ( 81.1% )

Table 196. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB	97.47%	81.15%	n/a	34.57

Continuous Data Distribution

Figure 198. Distribution of values in the column.

Outliers

Figure 199. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 200. Visualization of completeness of the data in the column.

Uniqueness

Figure 201. Visualization of uniqueness of the data in the column.

DWVB_Cat

Table 197. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB_Cat
Description	Attribute referring to column DWVB, which is given if different dilutions of a DNA plasmid, containing the target sequence, were added, to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). For Virus: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 198. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 199. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB_Cat	100.00%	0.39%	M	N

Data Distribution Top 20

Figure 202. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 203. Visualization of completeness of the data in the column.

Uniqueness

Figure 204. Visualization of uniqueness of the data in the column.

DWVB_Notes

Table 200. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	DWVB_Notes
Description	Annotations referring to column DWVB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 201. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
DWVB_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	1,003 ( 97.5% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 202. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
DWVB_Notes	2.53%	0.19%	n/a	negative

Data Distribution Top 20

Figure 205. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 206. Visualization of completeness of the data in the column.

Uniqueness

Figure 207. Visualization of uniqueness of the data in the column.

ABPV

Table 203. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV
Description	The Cq-value (Ct value) for the infection load with the Acute Bee Paralysis Virus (ABPV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 204. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV	2 - 5	35.598	12	35.0475	37.025	38.4175	40	1,029	745 ( 72.4% )	0 ( 0.0% )	0 ( 0.0% )	213 ( 20.7% )

Table 205. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV	27.60%	20.70%	n/a	35.73

Continuous Data Distribution

Figure 208. Distribution of values in the column.

Outliers

Figure 209. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 210. Visualization of completeness of the data in the column.

Uniqueness

Figure 211. Visualization of uniqueness of the data in the column.

ABPV_Cat

Table 206. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV_Cat
Description	Category attribute referring to column ABPV. It is assigned where different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or where no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a
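The category thresholds given in the description can be sketched as a small mapping function. This is an illustration only: the function name is hypothetical, the report does not say how values exactly equal to 10^4 or 10^7 are classified (they are assigned to M here), and treating a load of zero or less as N (None) is an assumption.

```python
def virus_load_category(genome_copies_per_bee: float) -> str:
    """Map a virus load in genome copies per bee to L, M, H or N.

    Thresholds follow the column description: L < 10^4, M between
    10^4 and 10^7, H > 10^7. Boundary handling and the meaning of N
    are assumptions, not taken from the report.
    """
    if genome_copies_per_bee <= 0:
        return "N"  # assumed: no target detected
    if genome_copies_per_bee < 1e4:
        return "L"
    if genome_copies_per_bee <= 1e7:
        return "M"  # assumed: both boundaries fall into M
    return "H"


for copies in (0, 5e3, 1e5, 1e8):
    print(copies, virus_load_category(copies))
```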

Table 207. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	1,029	336 ( 32.7% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.5% )

Table 208. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV_Cat	67.35%	0.49%	N	H

Data Distribution Top 20

Figure 212. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 213. Visualization of completeness of the data in the column.

Uniqueness

Figure 214. Visualization of uniqueness of the data in the column.

ABPV_Notes

Table 209. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	ABPV_Notes
Description	Annotations referring to column ABPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 210. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
ABPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	284 ( 27.6% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 211. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
ABPV_Notes	72.40%	0.29%	negative	n/a
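Column ABPV has 745 missing values (Table 204) and column ABPV_Notes has 284 (Table 210), which together equal the 1,029 records. This is consistent with a note being recorded exactly where no Cq value is present, although the report does not state this and the counts alone do not prove it. A row-level check could be sketched as follows; the function name and the sample rows are hypothetical.

```python
from typing import List, Optional, Sequence


def rows_violating_complementarity(
    cq: Sequence[Optional[float]],
    notes: Sequence[Optional[str]],
) -> List[int]:
    """Return indices of rows where Cq and note are both present or both NULL."""
    return [
        i
        for i, (value, note) in enumerate(zip(cq, notes))
        if (value is None) == (note is None)
    ]


# Aggregate check with the counts reported in Tables 204 and 210.
print(745 + 284 == 1029)  # True

# Hypothetical rows: index 3 has both a value and a note, index 4 has neither.
cq = [35.73, None, None, 37.0, None]
notes = [None, "negative", "not available", "negative", None]
print(rows_violating_complementarity(cq, notes))  # [3, 4]
```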

Data Distribution Top 20

Figure 215. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 216. Visualization of completeness of the data in the column.

Uniqueness

Figure 217. Visualization of uniqueness of the data in the column.

CBPV

Table 212. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV
Description	The Cq-value (Ct value) for the infection load with the Chronic Bee Paralysis Virus (CBPV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 213. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV	2 - 5	36.181	13.31	35.04	36.985	38.4225	40	1,029	737 ( 71.6% )	0 ( 0.0% )	0 ( 0.0% )	221 ( 21.5% )

Table 214. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV	28.38%	21.48%	n/a	37.61

Continuous Data Distribution

Figure 218. Distribution of values in the column.

Outliers

Figure 219. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 220. Visualization of completeness of the data in the column.

Uniqueness

Figure 221. Visualization of uniqueness of the data in the column.

CBPV_Cat

Table 215. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV_Cat
Description	Category attribute referring to column CBPV. It is assigned where different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or where no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 216. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV_Cat	0 - 1	n/a	H	n/a	n/a	n/a	N	1,029	337 ( 32.8% )	0 ( 0.0% )	0 ( 0.0% )	5 ( 0.5% )

Table 217. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV_Cat	67.25%	0.49%	N	H

Data Distribution Top 20

Figure 222. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 223. Visualization of completeness of the data in the column.

Uniqueness

Figure 224. Visualization of uniqueness of the data in the column.

CBPV_Notes

Table 218. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	CBPV_Notes
Description	Annotations referring to column CBPV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 219. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
CBPV_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	292 ( 28.4% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 220. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
CBPV_Notes	71.62%	0.29%	negative	n/a

Data Distribution Top 20

Figure 225. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 226. Visualization of completeness of the data in the column.

Uniqueness

Figure 227. Visualization of uniqueness of the data in the column.

BQCV

Table 221. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV
Description	The Cq-value (Ct value) for the infection load with the Black Queen Cell Virus (BQCV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 222. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV	2 - 6	24.8266	9.7	22.41	24.84	27.46	37.8	1,029	2 ( 0.2% )	0 ( 0.0% )	0 ( 0.0% )	743 ( 72.2% )

Table 223. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV	99.81%	72.21%	26.49	24.04

Continuous Data Distribution

Figure 228. Distribution of values in the column.

Outliers

Figure 229. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 230. Visualization of completeness of the data in the column.

Uniqueness

Figure 231. Visualization of uniqueness of the data in the column.

BQCV_Cat

Table 224. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV_Cat
Description	Category attribute referring to column BQCV. It is assigned where different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or where no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 225. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 226. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV_Cat	100.00%	0.39%	M	N

Data Distribution Top 20

Figure 232. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 233. Visualization of completeness of the data in the column.

Uniqueness

Figure 234. Visualization of uniqueness of the data in the column.

BQCV_Notes

Table 227. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	BQCV_Notes
Description	Annotations referring to column BQCV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 228. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
BQCV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	1,027 ( 99.8% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 229. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
BQCV_Notes	0.19%	0.19%	n/a	negative

Data Distribution Top 20

Figure 235. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 236. Visualization of completeness of the data in the column.

Uniqueness

Figure 237. Visualization of uniqueness of the data in the column.

SBV

Table 230. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV
Description	The Cq-value (Ct value) for the infection load with the Sacbrood Virus (SBV).
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 231. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV	1 - 11	26.404416140	0	24.2175	28.12	30.81	37.51	1,029	77 ( 7.5% )	4 ( 0.4% )	0 ( 0.0% )	736 ( 71.5% )

Table 232. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV	92.52%	71.53%	n/a	33.58
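Table 231 reports a minimum of 0 and four zero values for column SBV. Because Cq is the fractional PCR cycle at which the target is quantified, a value of 0 may not represent a measured cycle; the report does not say what these zeros encode. A minimal sketch for locating such entries before analysis is shown below, using a hypothetical function name and hypothetical sample values.

```python
from typing import List, Optional, Sequence


def flag_zero_cq(values: Sequence[Optional[float]]) -> List[int]:
    """Return indices of entries that are exactly 0; NULL entries are skipped."""
    return [i for i, v in enumerate(values) if v is not None and v == 0]


# Hypothetical sample, not taken from the dataset.
sample = [28.12, 0, None, 30.81, 0.0]
print(flag_zero_cq(sample))  # [1, 4]
```

Whether flagged rows should be treated as negative, missing, or erroneous is a decision that depends on information from the data provider.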

Continuous Data Distribution

Figure 238. Distribution of values in the column.

Outliers

Figure 239. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 240. Visualization of completeness of the data in the column.

Uniqueness

Figure 241. Visualization of uniqueness of the data in the column.

SBV_Cat

Table 233. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV_Cat
Description	Category attribute referring to column SBV. It is assigned where different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or where no dilutions were added: L (Low); M (Medium); H (High); N (None). For viruses: L means < 10^4 genome copies, M means between 10^4 and 10^7 genome copies, and H means > 10^7 genome copies per bee.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 234. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV_Cat	1 - 1	n/a	H	n/a	n/a	n/a	N	1,029	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 235. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV_Cat	100.00%	0.39%	M	L

Data Distribution Top 20

Figure 242. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 243. Visualization of completeness of the data in the column.

Uniqueness

Figure 244. Visualization of uniqueness of the data in the column.

SBV_Notes

Table 236. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	SBV_Notes
Description	Annotations referring to column SBV. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 237. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
SBV_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	952 ( 92.5% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 238. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SBV_Notes	7.48%	0.19%	n/a	negative

Data Distribution Top 20

Figure 245. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 246. Visualization of completeness of the data in the column.

Uniqueness

Figure 247. Visualization of uniqueness of the data in the column.

EFB

Table 239. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB
Description	The Cq-value (Ct value) for the infection load with the causative agent of European Foulbrood of honey bees (EFB), Melissococcus plutonius.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language)
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 240. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB	2 - 7	34.43597	28.42	32.4025	35.3	36.3525	37.38	1,029	1,011 ( 98.3% )	0 ( 0.0% )	0 ( 0.0% )	19 ( 1.8% )

Table 241. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB	1.75%	1.85%	n/a	37.38

Data Distribution Top 20

Figure 248. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 249. Distribution of values in the column.

Outliers

Figure 250. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 251. Visualization of completeness of the data in the column.

Uniqueness

Figure 252. Visualization of uniqueness of the data in the column.

EFB_Cat

Table 242. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB_Cat
Description	Category attribute referring to column EFB. It is assigned where different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or where no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the numbers of genome copies given for the different categories also apply to this column.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 243. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB_Cat	0 - 1	n/a	L	n/a	n/a	n/a	N	1,029	654 ( 63.6% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 244. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB_Cat	36.44%	0.39%	n/a	M

Data Distribution Top 20

Figure 253. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 254. Visualization of completeness of the data in the column.

Uniqueness

Figure 255. Visualization of uniqueness of the data in the column.

EFB_Notes

Table 245. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EFB_Notes
Description	Annotations referring to column EFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 246. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
EFB_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	not availabl…	1,029	18 ( 1.7% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 247. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
EFB_Notes	98.25%	0.39%	not available	ND

Data Distribution Top 20

Figure 256. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 257. Visualization of completeness of the data in the column.

Uniqueness

Figure 258. Visualization of uniqueness of the data in the column.

AFB

Table 248. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB
Description	The Cq-value (Ct value) for the infection load with the causative agent of American Foulbrood (AFB), Paenibacillus larvae.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 249. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB	2 - 11	36.856439390	29.61	36.17	37.13	37.48375	40	1,029	996 ( 96.8% )	0 ( 0.0% )	0 ( 0.0% )	33 ( 3.2% )
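
The counts in the content analysis can be reproduced from the raw column values. The sketch below is one possible reading of the table caption, not the EU Pollinator Hub implementation. It assumes values arrive as strings with None for NULL, treats any value that parses to the number 0 as Zero, and treats non-empty whitespace-only strings as Blank. The function name and the sample values are hypothetical.

```python
from typing import Dict, Optional, Sequence


def _is_zero(value: str) -> bool:
    """True if the string parses to the number 0."""
    try:
        return float(value) == 0.0
    except ValueError:
        return False


def content_counts(values: Sequence[Optional[str]]) -> Dict[str, int]:
    """Count Total, Missing, Zero, Blank and Distinct for one column."""
    present = [v for v in values if v is not None]
    return {
        "Total": len(values),
        "Missing": len(values) - len(present),
        "Zero": sum(_is_zero(v) for v in present),
        # Assumption: Blank means at least one character, all of them whitespace.
        "Blank": sum(len(v) > 0 and v.strip() == "" for v in present),
        # NULL, zero and blank each count as a distinct value, as in the caption.
        "Distinct": len(set(values)),
    }


# Hypothetical sample, not taken from the dataset.
sample = ["36.17", None, "0", " ", "36.17", None]
print(content_counts(sample))
```

Counting NULL as one distinct value is what makes a column with 33 populated records report 33 distinct values only if two of those records share a value.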

Table 250. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB	3.21%	3.21%	n/a	36.945

Continuous Data Distribution

Figure 259. Distribution of values in the column.

Outliers

Figure 260. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 261. Visualization of completeness of the data in the column.

Uniqueness

Figure 262. Visualization of uniqueness of the data in the column.

AFB_Cat

Table 251. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB_Cat
Description	Attribute referring to column AFB, which is given if different dilutions of a DNA plasmid containing the target sequence were added to provide an estimate of the amount of DNA or RNA present, or if no dilutions were added: L (Low); M (Medium); H (High); N (None). It is not specified whether the number of genome copies for the different categories also refers to this column.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 252. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB_Cat	0 - 1	n/a	L	n/a	n/a	n/a	N	1,029	696 ( 67.6% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 253. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB_Cat	32.36%	0.39%	n/a	M

Data Distribution Top 20

Figure 263. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 264. Visualization of completeness of the data in the column.

Uniqueness

Figure 265. Visualization of uniqueness of the data in the column.

AFB_Notes

Table 254. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFB_Notes
Description	Annotations referring to column AFB. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 255. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFB_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	33 ( 3.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 256. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFB_Notes	96.79%	0.29%	not available	n/a

Data Distribution Top 20

Figure 266. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 267. Visualization of completeness of the data in the column.

Uniqueness

Figure 268. Visualization of uniqueness of the data in the column.

NosemaApis

Table 257. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaApis
Description	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema apis.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.

Table 258. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaApis	5 - 5	22.846	17.75	19.35	21.58	26.69	32.03	1,029	1,020 ( 99.1% )	0 ( 0.0% )	0 ( 0.0% )	10 ( 1.0% )

Table 259. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaApis	0.87%	0.97%	n/a	21.58

Data Distribution Top 20

Figure 269. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 270. Distribution of values in the column.

Outliers

Figure 271. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 272. Visualization of completeness of the data in the column.

Uniqueness

Figure 273. Visualization of uniqueness of the data in the column.

NosemaApis_Notes

Table 260. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaApis_Notes
Description	Annotations referring to column NosemaApis. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 261. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaApis_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	9 ( 0.9% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 262. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaApis_Notes	99.13%	0.29%	negative	n/a

Data Distribution Top 20

Figure 274. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 275. Visualization of completeness of the data in the column.

Uniqueness

Figure 276. Visualization of uniqueness of the data in the column.

NosemaCeranae

Table 263. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaCeranae
Description	The Cq-value (Ct value) for the infection load with one of the causative agents of Nosemosis of honey bees, Nosema ceranae.
Data type	Decimal number
Descriptor	pms:quantificationCycle [UID:0.0.QNTFC467]
Descriptor description	Depending on the real-time instrument, either threshold cycle (Ct), crossing point (Cp) or a take-off point (Top) are used to refer to the same quantification cycle value (Cq): the fractional PCR cycle at which the target is quantified in a given sample. It was proposed to use the term quantification cycle (Cq) in accordance with the data standard RDML (Real-Time PCR Data Markup Language).
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.QNTFC467
Unit	no.
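
Because Cq is a cycle number, a lower value indicates more target in the sample. The sketch below illustrates the general relationship between a Cq difference and a fold difference in target quantity. It assumes both samples were run with the same assay and threshold and that the target doubles in every cycle (amplification factor 2); the report states neither, and the function name is hypothetical.

```python
def fold_difference(cq_a: float, cq_b: float, amplification_factor: float = 2.0) -> float:
    """Estimate how many times more target sample A contains than sample B.

    Assumes identical assay conditions for both samples and a constant
    amplification factor per cycle (2.0 corresponds to ideal doubling).
    """
    return amplification_factor ** (cq_b - cq_a)


# Hypothetical Cq values: sample A crosses the threshold 3 cycles earlier than B.
print(fold_difference(20.0, 23.0))
```

Records annotated as negative in the corresponding notes column have no Cq value and cannot be compared in this way.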

Table 264. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaCeranae	2 - 6	25.0779	14.42	20.52	23.26	30.445	37.23	1,029	768 ( 74.6% )	0 ( 0.0% )	0 ( 0.0% )	245 ( 23.8% )

Table 265. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaCeranae	25.36%	23.81%	n/a	21.88

Continuous Data Distribution

Figure 277. Distribution of values in the column.

Outliers

Figure 278. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 279. Visualization of completeness of the data in the column.

Uniqueness

Figure 280. Visualization of uniqueness of the data in the column.

NosemaCeranae_Notes

Table 266. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaCeranae_Notes
Description	Annotations referring to column NosemaCeranae. Negative (signal threshold not exceeded); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 267. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaCeranae_Notes	0 - 13	n/a	negative	n/a	n/a	n/a	not availabl…	1,029	261 ( 25.4% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 268. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaCeranae_Notes	74.64%	0.29%	negative	n/a

Data Distribution Top 20

Figure 281. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 282. Visualization of completeness of the data in the column.

Uniqueness

Figure 283. Visualization of uniqueness of the data in the column.

VarroaBees

Table 269. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	VarroaBees
Description	Attribute referring to column VarroaInfestation: Y (Yes) if 100 bees or more were sampled; N (No) if fewer than 100 bees were sampled.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 270. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
VarroaBees	0 - 1	n/a	N	n/a	n/a	n/a	Y	1,029	2 ( 0.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 271. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VarroaBees	99.81%	0.29%	Y	n/a

Data Distribution Top 20

Figure 284. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 285. Visualization of completeness of the data in the column.

Uniqueness

Figure 286. Visualization of uniqueness of the data in the column.

VarroaInfestation

Table 272. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	VarroaInfestation
Description	Varroa infestation, measured as Varroa infestation rate of adult bees.
Data type	Decimal number
Descriptor	pms:varroaInfestationOfAdultBees [UID:0.0.VRRNF468]
Descriptor description	The infestation rate of adult honey bees with Varroa mites (Varroa destructor), measured by dislodging the mites from adult honey bees and expressed as the number of Varroa mites per unit of honey bees.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.VRRNF468
Unit	mites (100 bees)⁻¹

Table 273. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
VarroaInfestation	1 - 11	1.9824686100	0	0	0	1.61290322	46.90265487	1,029	2 ( 0.2% )	540 ( 52.5% )	0 ( 0.0% )	300 ( 29.2% )

Table 274. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VarroaInfestation	99.81%	29.15%	0	30.86419753
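
The unit of this column, mites per 100 bees, and the sample-size flag in column VarroaBees can be illustrated with a short calculation. This is a minimal sketch: the raw mite and bee counts are not part of the report, so the function name, its inputs and the example counts are hypothetical.

```python
from typing import Tuple


def varroa_infestation(mites: int, bees: int) -> Tuple[float, str]:
    """Return (mites per 100 bees, sample-size flag) for one colony sample."""
    if bees <= 0:
        raise ValueError("at least one bee must be sampled")
    rate = 100.0 * mites / bees
    # Flag as described for column VarroaBees: Y for 100 bees or more, N otherwise.
    flag = "Y" if bees >= 100 else "N"
    return rate, flag


# Hypothetical counts: 3 mites dislodged from a sample of 120 bees.
rate, flag = varroa_infestation(3, 120)
print(f"{rate:.2f} mites per 100 bees, VarroaBees = {flag}")
```

Rates from samples flagged N rest on fewer bees and may warrant more cautious interpretation.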

Continuous Data Distribution

Figure 287. Distribution of values in the column.

Completeness

Figure 288. Visualization of completeness of the data in the column.

Uniqueness

Figure 289. Visualization of uniqueness of the data in the column.

Varroa_Notes

Table 275. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Varroa_Notes
Description	Annotations referring to column VarroaInfestation.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 276. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Varroa_Notes	0 - 2	n/a	ND	n/a	n/a	n/a	ND	1,029	1,027 ( 99.8% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 277. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Varroa_Notes	0.19%	0.19%	n/a	ND

Data Distribution Top 20

Figure 290. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 291. Visualization of completeness of the data in the column.

Uniqueness

Figure 292. Visualization of uniqueness of the data in the column.

AFBcfu

Table 278. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFBcfu
Description	Not specified by the data provider. Presumably the number of colony-forming units (CFU) counted in microbiological assays used to detect the causative agent of American Foulbrood (AFB), Paenibacillus larvae.
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	n/a

Table 279. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFBcfu	1 - 1	0.1	0	0	0	0	1	1,029	1,008 ( 98.0% )	19 ( 1.8% )	0 ( 0.0% )	3 ( 0.3% )

Table 280. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFBcfu	2.04%	0.29%	0	1

Data Distribution Top 20

Figure 293. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 294. Distribution of values in the column.

Completeness

Figure 295. Visualization of completeness of the data in the column.

Uniqueness

Figure 296. Visualization of uniqueness of the data in the column.

AFBcfu_Notes

Table 281. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	AFBcfu_Notes
Description	Annotations referring to column AFBcfu. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 282. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
AFBcfu_Notes	0 - 13	n/a	ND	n/a	n/a	n/a	not availabl…	1,029	21 ( 2.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 283. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
AFBcfu_Notes	97.96%	0.29%	not available	n/a

Data Distribution Top 20

Figure 297. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 298. Visualization of completeness of the data in the column.

Uniqueness

Figure 299. Visualization of uniqueness of the data in the column.

NosemaSpores

Table 284. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaSpores
Description	Number of spores of the causative agents of Nosemosis of honey bees (Nosema apis, Nosema ceranae), expressed in spores per animal.
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	spores animal⁻¹

Table 285. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaSpores	5 - 7	339,712.4	25,000	68,750	150,000	375,000	4,150,000	1,029	803 ( 78.0% )	0 ( 0.0% )	0 ( 0.0% )	47 ( 4.6% )

Table 286. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaSpores	21.96%	4.57%	50000	825000

Continuous Data Distribution

Figure 300. Distribution of values in the column.

Outliers

Figure 301. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 302. Visualization of completeness of the data in the column.

Uniqueness

Figure 303. Visualization of uniqueness of the data in the column.

NosemaSpores_Notes

Table 287. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	NosemaSpores_Notes
Description	Annotations referring to column NosemaSpores. ND (meaning not specified by the data provider); not available (data has not been provided in the raw data file); <25000 (less than 25000 spores per animal).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 288. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
NosemaSpores_Notes	0 - 13	n/a		n/a	n/a	n/a	not availabl…	1,029	226 ( 22.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 0.4% )

Table 289. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
NosemaSpores_Notes	78.04%	0.39%	ND	<25000

Data Distribution Top 20

Figure 304. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 305. Visualization of completeness of the data in the column.

Uniqueness

Figure 306. Visualization of uniqueness of the data in the column.

Malpighamoeba

Table 290. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba
Description	The Cq-value (Ct value) for the infection load with the causative agent of amoeba disease of honey bees, Malpighamoeba mellificae.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

Table 291. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba	4 - 7	29.64933	9.91	23.9275	31.63375	35.2275	42.5	1,029	947 ( 92.0% )	0 ( 0.0% )	0 ( 0.0% )	79 ( 7.7% )

Table 292. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba	7.97%	7.68%	n/a	39.93

Continuous Data Distribution

Figure 307. Distribution of values in the column.

Outliers

Figure 308. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 309. Visualization of completeness of the data in the column.

Uniqueness

Figure 310. Visualization of uniqueness of the data in the column.

Malpighamoeba_CT

Table 293. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba_CT
Description	Not specified by the data provider.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 294. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba_CT	0 - 1	n/a	N	n/a	n/a	n/a	Y	1,029	311 ( 30.2% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.3% )

Table 295. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba_CT	69.78%	0.29%	N	Y

Data Distribution Top 20

Figure 311. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 312. Visualization of completeness of the data in the column.

Uniqueness

Figure 313. Visualization of uniqueness of the data in the column.

Malpighamoeba_Notes

Table 296. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	Malpighamoeba_Notes
Description	Annotations referring to column Malpighamoeba.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 297. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
Malpighamoeba_Notes	0 - 8	n/a	negative	n/a	n/a	n/a	negative	1,029	393 ( 38.2% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 298. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Malpighamoeba_Notes	61.81%	0.19%	negative	n/a

Data Distribution Top 20

Figure 314. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 315. Visualization of completeness of the data in the column.

Uniqueness

Figure 316. Visualization of uniqueness of the data in the column.

RecordNotes

Table 299. Standardised metadata of the column. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	RecordNotes
Description	Notes added by the data provider to specific records in the raw data.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

Table 300. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Name	Length	Mean	Min	Q1	Median	Q3	Max	Total	Missing	Zero	Blank	Distinct
RecordNotes	0 - 122	n/a	these codes…	n/a	n/a	n/a	these codes…	1,029	966 ( 93.9% )	0 ( 0.0% )	0 ( 0.0% )	2 ( 0.2% )

Table 301. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Name	Completeness	Uniqueness	Most Common Value	Least Common Value
RecordNotes	6.12%	0.19%	n/a	these codes [column SampleID] might be wrong, as they could not be uploaded onto the BEEP app, or there was no code at all

Data Distribution Top 20

Figure 317. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 318. Visualization of completeness of the data in the column.

Uniqueness

Figure 319. Visualization of uniqueness of the data in the column.

Changes made to preparatory file

Column sample ID was renamed SampleID to avoid blank spaces in table headers, which might cause problems in some database systems.
Column DWV A was renamed DWVA to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVA was renamed DWVA_Cat to assure an assignment of unique names to column headers.
Column DWVA_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column DWV B was renamed DWVB to avoid blank spaces in table headers, which might cause problems in some database systems.
Column Cat. relating to column DWVB was renamed DWVB_Cat to assure an assignment of unique names to column headers.
Column DWVB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column ABPV was renamed ABPV_Cat to assure an assignment of unique names to column headers.
Column ABPV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column CBPV was renamed CBPV_Cat to assure an assignment of unique names to column headers.
Column CBPV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column BQCV was renamed BQCV_Cat to assure an assignment of unique names to column headers.
Column BQCV_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column SBV was renamed SBV_Cat to assure an assignment of unique names to column headers.
Column SBV_Notes was created and replaced with {NULL} (769 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column EFB was renamed EFB_Cat to assure an assignment of unique names to column headers.
Column EFB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column Cat. relating to column AFB was renamed AFB_Cat to assure an assignment of unique names to column headers.
Column AFB_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. apis was renamed NosemaApis to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaApis_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. ceranae was renamed NosemaCeranae to avoid blank spaces in table headers, which might cause problems in some database systems.
Column NosemaCeranae_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column > 100 bees was renamed VarroaBees to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa/100 bees was renamed VarroaInfestation to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Varroa_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column AFB (cfu) was renamed AFBcfu to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column AFBcfu_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column N. spores was renamed NosemaSpores to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column NosemaSpores_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
Column CT < 36,00 was renamed Malpighamoeba_CT to avoid blank spaces and special characters in table headers, which might cause problems in some database systems.
Column Malpighamoeba_Notes was created and replaced with {NULL} (1029 records) to add a column in which notes on the data in the related column can be added.
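
The header changes listed above can be sketched as a simple mapping plus a rule for the repeated Cat. header. This is an illustrative sketch only, not the procedure actually used by the EUPH. It assumes that every Cat. column immediately follows the column it refers to, and the function name and the example header row are hypothetical.

```python
# Direct renames taken from the list above (old header -> new header).
RENAMES = {
    "sample ID": "SampleID",
    "DWV A": "DWVA",
    "DWV B": "DWVB",
    "N. apis": "NosemaApis",
    "N. ceranae": "NosemaCeranae",
    "> 100 bees": "VarroaBees",
    "Varroa/100 bees": "VarroaInfestation",
    "AFB (cfu)": "AFBcfu",
    "N. spores": "NosemaSpores",
    "CT < 36,00": "Malpighamoeba_CT",
}


def rename_headers(headers):
    """Apply the direct renames and give each 'Cat.' header a unique name.

    Assumption: a 'Cat.' column always directly follows the column it
    categorises, so its new name is '<previous column>_Cat'.
    """
    renamed = []
    for header in headers:
        if header == "Cat." and renamed:
            renamed.append(f"{renamed[-1]}_Cat")
        else:
            renamed.append(RENAMES.get(header, header))
    return renamed


# Hypothetical header row, for illustration only.
print(rename_headers(["sample ID", "DWV A", "Cat.", "ABPV", "Cat.", "N. spores"]))
# ['SampleID', 'DWVA', 'DWVA_Cat', 'ABPV', 'ABPV_Cat', 'NosemaSpores']
```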

Changes made to data

In 27 records obtained from the file of dataset Tier2 Field study A results 2022 for BEEP (b-good-tier-2-results-2022-for-beep-v2.xlsx) and 36 records obtained from the file of dataset Tier3 Field study B results 2022 for BEEP (b-good-tier-3-results-2022-for-beep.xlsx), the leading asterisk {*} was removed from the data in column SampleID. The comment referring to those SampleID values ("these codes might be wrong, as they could not be uploaded onto the BEEP app, or there was no code at all"), supplemented by the explanatory text in square brackets ("[column SampleID]"), was added to column RecordNotes of the same record. Removal of annotations to a datum is necessary to enable the records to be linked with records in other tables in relational databases.
In records, which contained the string {negative} in column DWVA, column DWVA_Notes was replaced by {negative} (728 records) and {negative} in column DWVA was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVA_Cat were replaced by {N} (727 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column DWVB, column DWVB_Notes was replaced by {negative} (26 records) and the {negative} in column DWVB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column DWVB_Cat were replaced by {N} (26 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
All occurrences of the character {-} in column ABPV_Cat were replaced by {NULL} (409 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column ABPV, column ABPV_Notes was replaced by {negative} (410 records) and the {negative} in column ABPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column ABPV, column ABPV_Notes was replaced by {not available} (335 records) and the blanks in column ABPV and column ABPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} and one occurrence of a blank value in column CBPV_Cat were replaced by {N} (400 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column CBPV, column CBPV_Notes was replaced by {negative} (402 records) and the {negative} in column CBPV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column CBPV, column CBPV_Notes was replaced by {not available} (335 records) and the blanks in column CBPV and column CBPV_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} and one occurrence of a blank value in column BQCV_Cat were replaced by {N} (2 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column BQCV, column BQCV_Notes was replaced by {negative} (2 records) and the {negative} in column BQCV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column SBV, column SBV_Notes was replaced by {negative} (77 records) and the {negative} in column SBV was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
All occurrences of the character {-} in column SBV_Cat were replaced by {N} (77 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
All occurrences of the character {-} in column EFB_Cat were replaced by {N} (357 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column EFB, column EFB_Notes was replaced by {negative} (357 records) and the {negative} in column EFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column EFB, column EFB_Notes was replaced by {not available} (651 records) and the blanks in column EFB and column EFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column EFB, column EFB_Notes was replaced by {ND} (3 records) and {ND} in column EFB as well as blank values in column EFB_Cat were replaced by {NULL} to avoid having string or blank values in a data column that is supposed to contain real numbers or categories.
All occurrences of the character {-} in column AFB_Cat were replaced by {NULL} (300 records), as the use of a mathematical operator as a datum could potentially cause problems with database queries under particular circumstances.
In records, which contained the string {negative} in column AFB, column AFB_Notes was replaced by {negative} (300 records) and the {negative} in column AFB was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column AFB, column AFB_Notes was replaced by {not available} (696 records) and the blanks in column AFB and column EFB_Cat were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaApis, column NosemaApis_Notes was replaced by {negative} (687 records) and the {negative} in column NosemaApis was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaApis, column NosemaApis_Notes was replaced by {not available} (333 records) and the blanks in column NosemaApis were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {negative} (435 records) and the {negative} in column NosemaCeranae was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaCeranae, column NosemaCeranae_Notes was replaced by {not available} (333 records) and the blanks in column NosemaCeranae were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in columns VarroaBees and VarroaInfestation, column Varroa_Notes was replaced by {ND} (2 records) and {ND} in columns VarroaBees and VarroaInfestation was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers or categories.
In records, which contain {ND} in column AFBcfu (312 records), column AFBcfu_Notes was replaced by {ND} and {ND} in column AFBcfu was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column AFBcfu, column AFBcfu_Notes was replaced by {not available} (696 records) and the blanks in column AFBcfu were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {ND} in column NosemaSpores, column NosemaSpores_Notes was replaced by {ND} (426 records) and {ND} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contain blank values in column NosemaSpores, column NosemaSpores_Notes was replaced by {not available} (333 records) and the blanks in column NosemaSpores were replaced by {NULL} to avoid having blank values in a data column that is supposed to contain real numbers.
In records, which contain {< 25000; <25000} in column NosemaSpores, column NosemaSpores_Notes was replaced by {<25000} (44 records) and {< 25000; <25000} in column NosemaSpores was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
In records, which contained the string {negative} in column Malpighamoeba, column Malpighamoeba_Notes was replaced by {negative} (636 records) and the {negative} in column Malpighamoeba was replaced by {NULL} to avoid having text in a data column that is supposed to contain real numbers.
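
The recurring pattern in the changes above (moving a text marker from a data column into the related notes column and storing NULL in its place) can be sketched as follows. This is a minimal illustration under stated assumptions, not the EUPH ingestion code. The function names are hypothetical, numeric cells are assumed to parse as plain decimal numbers, and the rules are merged into one function, whereas the report applies them column by column (for example, not every column received {not available} for blank cells).

```python
# Text markers found in data columns and the note they are moved to.
MARKERS = {"negative": "negative", "ND": "ND", "<25000": "<25000"}


def split_annotation(raw):
    """Split a raw cell into (value, note); value is None where NULL is stored."""
    text = "" if raw is None else str(raw).strip()
    if text == "":
        return None, "not available"
    compact = text.replace(" ", "")  # "< 25000" and "<25000" are treated alike
    if compact in MARKERS:
        return None, MARKERS[compact]
    return float(text), None  # assumption: remaining cells are numeric


def strip_sample_id_flag(sample_id):
    """Remove a leading asterisk from a SampleID and report whether it was flagged."""
    flagged = sample_id.startswith("*")
    return sample_id.lstrip("*"), flagged


print(split_annotation("negative"))  # (None, 'negative')
print(split_annotation("< 25000"))   # (None, '<25000')
print(split_annotation(""))          # (None, 'not available')
print(split_annotation("150000"))    # (150000.0, None)
print(strip_sample_id_flag("*ABCDEFGH"))  # hypothetical ID -> ('ABCDEFGH', True)
```

Keeping the flag returned by strip_sample_id_flag allows the comment on possibly wrong codes to be written to RecordNotes for exactly those records whose SampleID carried an asterisk.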

Unresolved issues

Overall, the B-GOOD datasets ingested into this EUPH dataset lack sufficient metadata, and there is a range of other issues that limit compliance with the FAIR principles.
In general, columns are not sufficiently well described. For example, it is unclear which information is contained in columns Malpighamoeba_CT and Malpighamoeba_Notes related to column Malpighamoeba, and it is unclear whether the definitions provided for the attributes L, M, H (number of genome copies per bee), which are used to define the dilution of DNA plasmids, refer only to viral pathogens or to all pathogens. The provider should provide all information necessary to allow reuse of the data within the dataset.
For some columns no units are provided (e.g. AFB cfu), for other columns, the unit in which data is expressed is not explicitly stated and can only be assumed based on exclusion. The provider should explicitly state the units in columns containing data in order to avoid misunderstandings.
Some of the attributes used in the dataset are not explained (e.g. ND). The provider should define the meaning of all attributes used in the dataset.
Data comes in Microsoft Excel files, which occasionally contain nested comments or uncommented annotations (e.g. different background colour of cells) in single cells, which makes storage in relational databases difficult and automated processing and analysis impossible.
The table structure does not facilitate data standardisation, as standardisation would require all values measured with the same method to be stored in a single column and transformed to the same unit.
In column SampleID the values {GB_1; GB2; GB_3} are not unique. Each of them exists twice.
The significance of the string {ND} is unclear:

In column EFB_Cat (3 records), where column dataset = {Tier2 Field study A results 2022 for BEEP} and SampleID = {CDYTBDHK; DTUDJNAG; RYAYUTUG};
In column Varroa_Notes, where column dataset = {Varroa_Tier2 Field study results 2021 for WR} and SampleID = {ZUXHUFZP, BXCTGZBN CFO};
In columns AFBcfu_Notes and NosemaSpores_Notes;