Dataset Report

Unique identifier:	BGDGN200.0.0
Title:	B-GOOD Genotyping SMR
Long title:	Dataset from the B-GOOD project, containing results for genotyping of eight SNPs associated with suppressed mite reproduction (SMR).
Status:	Quality Validated
Current Version:	v. 1.0
Published:	2025-03-17
Reviewed by:	Rubinigg Michael as Data scientist
Citation proposal:	B-GOOD Bee Health Data Portal 2025 Report of dataset B-GOOD Genotyping SMR, v. 1.0 [BGDGN200.0.0]. EU Pollinator Hub. [2026-05-16] app.pollinatorhub.eu

Compliance with FAIR* principles
Findable	Accessible	Interoperable	Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles

Data Quality

Good

This document is intended for use by collaborators of the EU Pollinator Hub and may be passed on with the express permission of the leader of the consortium and for the purpose determined by the leader of the consortium.

Table of contents

Document history
1. Release
2. Revision
Abbreviations
Executive summary
Introduction
Material and methods
Data description
1. Dataset
2. Tables
  1. Data
References
Annex 1: Table column reports

Document history

Release

Version v. 1.0 released on 2025-03-17. Reviewed by Rubinigg Michael.

Revision

Table 1. List of revisions made to the document. Identifier of revision (No); date of revision (Date); description of revision (Description); reason for revision (Reason).

No	Date	Description	Reason
1	2025-03-17 00:03:00	Initial release.	n/a

Abbreviations

CSV

Comma-Separated Values

EU

European Union

EUPH

EU Pollinator Hub

SMR

Suppressed Mite Reproduction

SNP

Single Nucleotide Polymorphism

UGENT

Universiteit Gent (Ghent University)

Executive summary

Data overview:

The data was published by De Smet L on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme. It contains data on SMR genotyping for eight SNPs as part of tier 1, tier 2 and tier 3 studies performed in 2022.

Data value:

The objectives of the B-GOOD project were: (1) Facilitate decision making for beekeepers and other stakeholders by establishing ready-to-use tools for operationalising the HSI; (2) Test, standardise and validate methods for measuring and reporting selected indicators affecting bee health; (3) Explore the various socio-economic and ecological factors beyond bee health; (4) Foster an EU community to collect and share knowledge related to honey bees and their environment; (5) Engender a lasting learning and innovation system (LIS); (6) Minimise the impact of biotic and abiotic stressors.

Data description:

n/a

Data application:

Currently, the data integrated from the B-GOOD Bee Health Data Portal contains major issues and does not comply with the FAIR Guiding Principles for scientific data management and stewardship applied on the EU Pollinator Hub. More descriptive information about the context, quality and condition, or characteristics of the data (e.g. protocols, measurement devices used, units of the captured data, or any other details about the study) must be provided. More metadata in the form of accurate and relevant attributes must also be provided (e.g. metadata describing the scope of the data, any particularities or limitations that other users should be aware of, the date of generation or collection of the data, the lab conditions, who prepared the data, the parameter settings, the name and version of the software used, whether the data is raw or processed, and an explanation of all variable names that are not self-explanatory). The dataset requires major revision by the data provider.

Unresolved issues:

n/a

Introduction

n/a

Material and methods

Data acquisition

All raw data files were downloaded from the B-GOOD Bee Health Data Portal on 2024-09-26 18:16:43.

List of raw data obtained from the data provider.

File data.xlsx accessed on 2024-09-26 18:16:43, provided by B-GOOD Bee Health Data Portal

Metadata was obtained from the dataset's web page.

Table 2. List of raw data and metadata files included in the dataset. Identifier of table row (No); name of the file (File); the type of the file (Type); file contains data (D); file contains metadata (M); date of upload of the file to the EU Pollinator Hub (Arrival); number of data points contained within the file, if applicable (Data points); uploaded file size (File size).

No	File	Type	D	M	Arrival	Data points	File size
1	Data_PREP_MR_241104.csv	CSV - Comma-separated values	Yes	No	2024-11-04 14:11:52	7,686	61.39 KiB
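The figure of 7,686 data points in Table 2 is consistent with the 366 records and 21 columns reported for the table Data later in this report, assuming that one data point corresponds to one cell. The report does not define the term, so this reading is an assumption. A minimal sketch of the check, with the column names copied from Table 7:

```python
# Assumption: "data points" = number of records x number of columns.
records = 366  # value of "Total" reported for every column of the table Data

columns = [
    "OVERALL", "Nummer", "PLATE (LAYOUT LORE)", "WELL", "SAMPLE",
    "VT_SNP1_COR", "VT_SNP2_COR", "VT_SNP3_COR", "VT_SNP4_COR",
    "VT_SNP5_COR", "VT_SNP6_COR", "VT_SNP7_COR", "VT_SNP8_COR",
    "sample ID", "season", "sent from", "partner",
    "general condition", "comments", "Varroa/100 bees", "Country",
]

data_points = records * len(columns)
print(len(columns), data_points)  # 21 7686
```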

Data preparation

All files in the zip-archives were extracted using File Explorer (Microsoft Corporation, version 22H2).

File results-genotyping-pooled-workers-b-good.xlsx was processed with MS Excel (Microsoft Corporation, version 2409). The worksheet was exported in CSV format (UTF-8 encoding) and imported into Notepad++ (version 8.7), where missing values were substituted by {NULL} using regular expressions and decimal commas were converted to baseline dots using a regular expression ((?<=\d),(?=\d) replaced with .).

Data was then exported to the respective preparatory files and uploaded to the EU Pollinator Hub according to SOP-017 (Dataset integration).
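The two substitutions described above were carried out manually in Notepad++. For readers who want to reproduce them in a script, the following is a minimal sketch in Python. It reuses the regular expression quoted in the report; the function names and the per-cell processing are assumptions made for illustration, as is the rule that an empty cell counts as a missing value.

```python
import re

# Pattern quoted in the report: a comma with a digit on both sides.
DECIMAL_COMMA = re.compile(r"(?<=\d),(?=\d)")

NULL_MARKER = "{NULL}"


def normalise_field(value: str) -> str:
    """Apply both substitutions to a single cell (hypothetical helper)."""
    value = value.strip()
    if value == "":
        # Assumption: an empty cell represents a missing value.
        return NULL_MARKER
    return DECIMAL_COMMA.sub(".", value)


def normalise_row(fields):
    """Apply normalise_field to every cell of one record."""
    return [normalise_field(f) for f in fields]


example = ["238", "BG 823-22", "48,41520406", "", "autumn"]
print(normalise_row(example))
# ['238', 'BG 823-22', '48.41520406', '{NULL}', 'autumn']
```

Working cell by cell is a deliberate choice in this sketch: applied to a whole line of a comma-delimited file, the same pattern would also match the delimiter between two adjacent numeric fields.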

Data validation

No data validation was performed.

Data analysis

No data analysis was performed.

Data description

Dataset

Table 3. Summary of tables belonging to the dataset. Table row identifier (No); name of the table (Table); description of the table (Description).

No	Table	Description
1	Data	Table containing the data.

Table 4. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
interactions.single.uid	BGDGN200.0.0
Title	B-GOOD Genotyping SMR
Long title	Dataset from the B-GOOD project, containing results for genotyping of eight SNPs associated with suppressed mite reproduction (SMR).
Target IRI	https://app.pollinatorhub.eu/dataset-discovery/BGDGN200.0.0
interactions.single.section-details.licence	CC BY-NC-ND 4.0
DOI	n/a
Created	2024-11-04
Published	2025-03-17
Contact	n/a
Keywords	Apis mellifera, SMR, SNP, Suppressed Mite Reproduction, genotyping, honey bee
Data collection years	2022
Regions, the data was collected in	n/a
Abstract	Dataset containing data on SMR genotyping for eight SNPs, as part of tier 1, tier 2 and tier 3 studies performed in 2022. It was published by De Smet L (UGENT) on the B-GOOD Bee Health Data Portal as part of the B-GOOD project (grant agreement 817622), funded under the EU Horizon 2020 Research and Innovation Programme.

Table 5. Standardised metadata of the data provider B-GOOD Bee Health Data Portal. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	B-GOOD Bee Health Data Portal
Url
Acronym	B-GOOD
IRI	https://app.pollinatorhub.eu/data-providers/b-good-bee-health-data-portal
Address	https://b-good-project.eu
Country	Belgium
Contact	b-good-project.eu
Description	Project funded by the EU Horizon 2020 Research and Innovation Programme under grant agreement No 817622. Project website: https://b-good-project.eu

Tables

Data

Table 6. Standardised metadata of the table Data. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Unique identifier	BGDGN200.DATAA496.0
Name	Data
Target IRI	https://app.pollinatorhub.eu/dataset-discovery/parts/BGDGN200.DATAA496.0
Table Type	File
Licence	CC BY-NC-ND 4.0
Description	Table containing the data.

Table containing the data.

Metadata

n/a

Table 7. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Column Name	Column Description	Datatype	Descriptor	Unit
OVERALL	Not specified by the data provider. Presumably a record identifier.	String	pms:recordID [0.0.RCRDD344]	n/a
Nummer	Not specified by the data provider. Presumably a record identifier.	Integer number	Integer [0.0.NTGER313]	n/a
PLATE (LAYOUT LORE)	Identifier of the plate in which the analysis was performed.	String	plate ID [BGDGN200.0.PLTDA518]	n/a
WELL	Identifier of the well on the plate in which the analysis was performed.	String	Text [0.0.TEXTA315]	n/a
SAMPLE	Not specified by the data provider.	String	Text [0.0.TEXTA315]	n/a
VT_SNP1_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP2_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP3_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP4_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP5_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP6_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP7_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
VT_SNP8_COR	Not specified by the data provider.	Decimal number	DecimalNumber [0.0.DCMLN314]	n/a
sample ID	Identifier of the sample.	String	dwc:materialSampleID [0.0.MTRLS489]	n/a
season	Not specified by the data provider. Presumably the season in which the sample was taken.	String	pms:season [0.0.SSONA466]	n/a
sent from	Not specified by the data provider. Presumably the country in which the sample was taken.	String	Text [0.0.TEXTA315]	n/a
partner	Not specified by the data provider. Presumably the consortium partner who took the sample.	String	Text [0.0.TEXTA315]	n/a
general condition	Not specified by the data provider.	String	Text [0.0.TEXTA315]	n/a
comments	Annotations made by the data provider.	String	Text [0.0.TEXTA315]	n/a
Varroa/100 bees	Not specified by the data provider. Presumably the Varroa infestation rate in the honey bee sample.	Decimal number	pms:varroaInfestationOfAdultBees [0.0.VRRNF468]	mites (100 bees)⁻¹
Country	Not specified by the data provider. Presumably the ISO 3166-1 alpha-2 country code of the country in which the sample was taken.	String	iso-3166:alpha-2CountryCode [0.0.LPHCN4]	n/a

Metadata of individual tables can be found in Annex 1.

Descriptive measures

Table 8. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
OVERALL	1 - 3	209.0	2	104.75	212.5	315.25	418	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	366 ( 100.0% )
Nummer	1 - 2	35.6	1	17	33	53	80	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	79 ( 21.6% )
PLATE (LAYOUT LORE)	1 - 3	n/a	1	n/a	n/a	n/a	Sc2	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 1.6% )
WELL	2 - 3	n/a	A1	n/a	n/a	n/a	H9	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	94 ( 25.7% )
SAMPLE	4 - 9	n/a	4441	n/a	n/a	n/a	BG 823-22	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	366 ( 100.0% )
VT_SNP1_COR	1 - 11	48.5984319900	0	20.84268261	48.41520406	75.7150885	100	366	9 ( 2.5% )	3 ( 0.8% )	0 ( 0.0% )	342 ( 93.4% )
VT_SNP2_COR	1 - 11	54.6212167200	0	35.34125393	57.6181694	77.35657228	100	366	10 ( 2.7% )	18 ( 4.9% )	0 ( 0.0% )	332 ( 90.7% )
VT_SNP3_COR	1 - 11	41.6077282200	0	28.01503806	45.97294899	57.47347696	89.34654632	366	10 ( 2.7% )	29 ( 7.9% )	0 ( 0.0% )	329 ( 89.9% )
VT_SNP4_COR	1 - 11	54.7026960600	0	39.82283725	54.60962316	70.71881315	100	366	9 ( 2.5% )	1 ( 0.3% )	0 ( 0.0% )	347 ( 94.8% )
VT_SNP5_COR	1 - 11	45.6969905400	0	37.76512023	46.67206997	55.08743249	100	366	13 ( 3.6% )	4 ( 1.1% )	0 ( 0.0% )	346 ( 94.5% )
VT_SNP6_COR	1 - 11	42.7942635200	0	21.83436365	39.8599031	61.37781539	100	366	12 ( 3.3% )	19 ( 5.2% )	0 ( 0.0% )	329 ( 89.9% )
VT_SNP7_COR	1 - 11	6.8668240200	0	0	0	8.42908684	58.8089182	366	9 ( 2.5% )	215 ( 58.7% )	0 ( 0.0% )	144 ( 39.3% )
VT_SNP8_COR	1 - 11	45.8946702300	0	32.20754914	49.59756941	61.91043117	100	366	12 ( 3.3% )	31 ( 8.5% )	0 ( 0.0% )	310 ( 84.7% )
sample ID	1 - 17	n/a	11D2	n/a	n/a	n/a	ZXMYKHSX	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	361 ( 98.6% )
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.8% )
sent from	5 - 15	n/a	Belgium	n/a	n/a	n/a	United Kingd…	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	14 ( 3.8% )
partner	0 - 11	n/a	BSOUR	n/a	n/a	n/a	WR	366	41 ( 11.2% )	0 ( 0.0% )	0 ( 0.0% )	20 ( 5.5% )
general condition	0 - 65	n/a	0% bees comp…	n/a	n/a	n/a	very good -…	366	181 ( 49.5% )	0 ( 0.0% )	0 ( 0.0% )	13 ( 3.6% )
comments	6 - 7	n/a	Tier 1	n/a	n/a	n/a	pilot B	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 1.1% )
Varroa/100 bees	1 - 11	4.2480672700	0	0	0.84025322	3.5	176.9230769	366	6 ( 1.6% )	144 ( 39.3% )	0 ( 0.0% )	159 ( 43.4% )
Country	2 - 2	n/a	BE	n/a	n/a	n/a	UK	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	14 ( 3.8% )
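The counting columns of Table 8 (Missing, Zero, Blank, Distinct) can be illustrated with a short Python sketch. It is not the code used by the EU Pollinator Hub. The function names are hypothetical, and the exact rules are assumptions based on the caption: missing values are cells equal to {NULL}, zeros are cells that parse as the number 0, blanks are non-empty cells made up only of whitespace, and distinct values are counted over all cells including {NULL}.

```python
NULL_MARKER = "{NULL}"  # placeholder for missing values in the preparatory file


def is_zero(value: str) -> bool:
    """True if the cell parses as a number equal to 0."""
    try:
        return float(value) == 0.0
    except ValueError:
        return False


def content_counts(values):
    """Return {measure: (count, percentage of Total)} for one column."""
    total = len(values)
    if total == 0:
        raise ValueError("column has no records")
    counts = {
        "Missing": sum(v == NULL_MARKER for v in values),
        "Zero": sum(is_zero(v) for v in values),
        "Blank": sum(v != "" and v.strip() == "" for v in values),
        "Distinct": len(set(values)),  # NULL, zero and blank are included
    }
    return {name: (n, round(100 * n / total, 1)) for name, n in counts.items()}


# Invented example column, not taken from the dataset.
column = ["0", "12.5", "{NULL}", "0", "  ", "12.5"]
print(content_counts(column))
# {'Missing': (1, 16.7), 'Zero': (2, 33.3), 'Blank': (1, 16.7), 'Distinct': (4, 66.7)}
```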

Quality measures

Table 9. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
OVERALL	100.00%	100.00%	238	238
Nummer	100.00%	21.58%	6	67
PLATE (LAYOUT LORE)	100.00%	1.64%	2	Sc2
WELL	100.00%	25.68%	A6	F10
SAMPLE	100.00%	100.00%	4511	4511
VT_SNP1_COR	97.54%	93.44%	100	57.27545426
VT_SNP2_COR	97.27%	90.71%	0	48.11172943
VT_SNP3_COR	97.27%	89.89%	0	62.4212059
VT_SNP4_COR	97.54%	94.81%	n/a	78.57805578
VT_SNP5_COR	96.45%	94.54%	n/a	60.13570854
VT_SNP6_COR	96.72%	89.89%	0	14.5315944
VT_SNP7_COR	97.54%	39.34%	0	26.20519373
VT_SNP8_COR	96.72%	84.70%	0	20.50869886
sample ID	100.00%	98.63%	?	TPMANUPA
season	100.00%	0.82%	autumn	spring
sent from	100.00%	3.83%	The Netherlands	Greece
partner	88.80%	5.46%	MLU	T3 Greece
general condition	50.55%	3.55%	n/a	great - 75% bees complete + "wt" + a lot of drones
comments	100.00%	1.09%	Tier 3	pilot B
Varroa/100 bees	98.36%	43.44%	0	1.752464403
Country	100.00%	3.83%	NL	GR
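The report does not state how Completeness and Uniqueness are calculated. The values in Table 9 are, however, reproduced by two simple ratios applied to the counts in Table 8: the share of non-missing records, and the share of distinct values among all records. The sketch below treats these formulas as an assumption and checks them against three columns.

```python
def completeness(total: int, missing: int) -> float:
    """Percentage of records that are not missing (assumed formula)."""
    return 100.0 * (total - missing) / total


def uniqueness(total: int, distinct: int) -> float:
    """Percentage of distinct values among all records (assumed formula)."""
    return 100.0 * distinct / total


# (Total, Missing, Distinct) as reported in Table 8.
table8 = {
    "VT_SNP1_COR": (366, 9, 342),
    "partner": (366, 41, 20),
    "general condition": (366, 181, 13),
}

for name, (total, missing, distinct) in table8.items():
    print(f"{name}: {completeness(total, missing):.2f}% "
          f"{uniqueness(total, distinct):.2f}%")
# VT_SNP1_COR: 97.54% 93.44%
# partner: 88.80% 5.46%
# general condition: 50.55% 3.55%
```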

Changes made to preparatory file

n/a

Changes made to data

Missing values (312 occurrences) were replaced by {NULL}.
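The number of 312 replaced values agrees with the sum of the Missing counts reported per column in Table 8. A minimal Python check, listing only the columns with at least one missing value:

```python
# Missing counts per column, copied from Table 8.
missing_per_column = {
    "VT_SNP1_COR": 9,
    "VT_SNP2_COR": 10,
    "VT_SNP3_COR": 10,
    "VT_SNP4_COR": 9,
    "VT_SNP5_COR": 13,
    "VT_SNP6_COR": 12,
    "VT_SNP7_COR": 9,
    "VT_SNP8_COR": 12,
    "partner": 41,
    "general condition": 181,
    "Varroa/100 bees": 6,
}

total_missing = sum(missing_per_column.values())
print(total_missing)  # 312
```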

Unresolved issues

For column OVERALL it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column Nummer it is unclear what it contains. The data provider is requested to make this information available.
For column SAMPLE it is unclear what it contains. The data provider is requested to make this information available.
For column VT_SNP1_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP2_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP3_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP4_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP5_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP6_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP7_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column VT_SNP8_COR it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column season it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column sent from it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column partner it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column general condition it is unclear what it contains. The data provider is requested to make this information available.
For column Varroa/100 bees it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.
For column Country it may be guessed but it is not explicitly stated by the data provider what it contains. The data provider is requested to make this information available.

References

De Smet L. 2024 SMR genotyping Tier1 Tier2 Tier3. B-GOOD Bee Health Data Portal. [2024-11-04] beehealthdata.org

Annex 1: Table column reports

Table: Data

Column: OVERALL

Table 10. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	OVERALL
Description	Not specified by the data provider. Presumably a record identifier.
Data type	String
Descriptor	pms:recordID [UID:0.0.RCRDD344]
Descriptor description	Unique sequence of integers associated with a record within a certain table.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.RCRDD344
Unit	n/a

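Table 11. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).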

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
OVERALL	1 - 3	209.0	2	104.75	212.5	315.25	418	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	366 ( 100.0% )

Table 12. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
OVERALL	100.00%	100.00%	238	238

Continuous Data Distribution

Figure 1. Distribution of values in the column.

Outliers

Figure 2. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 3. Visualization of completeness of the data in the column.

Uniqueness

Figure 4. Visualization of uniqueness of the data in the column.

Column: Nummer

Table 13. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	Nummer
Description	Not specified by the data provider. Presumably a record identifier.
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	n/a

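Table 14. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).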

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
Nummer	1 - 2	35.6	1	17	33	53	80	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	79 ( 21.6% )

Table 15. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Nummer	100.00%	21.58%	6	67

Continuous Data Distribution

Figure 5. Distribution of values in the column.

Outliers

Figure 6. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 7. Visualization of completeness of the data in the column.

Uniqueness

Figure 8. Visualization of uniqueness of the data in the column.

Column: PLATE (LAYOUT LORE)

Table 16. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	PLATE (LAYOUT LORE)
Description	Identifier of the plate in which the analysis was performed.
Data type	String
Descriptor	plate ID [UID:BGDGN200.0.PLTDA518]
Descriptor description	Identifier of the plate in which the analysis was performed.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/BGDGN200.0.PLTDA518
Unit	n/a

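Table 17. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).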

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
PLATE (LAYOUT LORE)	1 - 3	n/a	1	n/a	n/a	n/a	Sc2	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	6 ( 1.6% )

Table 18. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
PLATE (LAYOUT LORE)	100.00%	1.64%	2	Sc2

Data Distribution Top 20

Figure 9. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 10. Visualization of completeness of the data in the column.

Uniqueness

Figure 11. Visualization of uniqueness of the data in the column.

Column: WELL

Table 19. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	WELL
Description	Identifier of the well on the plate in which the analysis was performed.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

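Table 20. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).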

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
WELL	2 - 3	n/a	A1	n/a	n/a	n/a	H9	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	94 ( 25.7% )

Table 21. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
WELL	100.00%	25.68%	A6	F10

Completeness

Figure 12. Visualization of completeness of the data in the column.

Uniqueness

Figure 13. Visualization of uniqueness of the data in the column.

Column: SAMPLE

Table 22. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	SAMPLE
Description	Not specified by the data provider.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

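Table 23. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).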

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
SAMPLE	4 - 9	n/a	4441	n/a	n/a	n/a	BG 823-22	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	366 ( 100.0% )

Table 24. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
SAMPLE	100.00%	100.00%	4511	4511

Completeness

Figure 14. Visualization of completeness of the data in the column.

Uniqueness

Figure 15. Visualization of uniqueness of the data in the column.

Column: VT_SNP1_COR

Table 25. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP1_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

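Table 26. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).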

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP1_COR	1 - 11	48.5984319900	0	20.84268261	48.41520406	75.7150885	100	366	9 ( 2.5% )	3 ( 0.8% )	0 ( 0.0% )	342 ( 93.4% )

Table 27. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP1_COR	97.54%	93.44%	100	57.27545426

Continuous Data Distribution

Figure 16. Distribution of values in the column.

Outliers

Figure 17. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 18. Visualization of completeness of the data in the column.

Uniqueness

Figure 19. Visualization of uniqueness of the data in the column.

Column: VT_SNP2_COR

Table 28. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP2_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

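Table 29. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).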

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP2_COR	1 - 11	54.6212167200	0	35.34125393	57.6181694	77.35657228	100	366	10 ( 2.7% )	18 ( 4.9% )	0 ( 0.0% )	332 ( 90.7% )

Table 30. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP2_COR	97.27%	90.71%	0	48.11172943

Continuous Data Distribution

Figure 20. Distribution of values in the column.

Outliers

Figure 21. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 22. Visualization of completeness of the data in the column.

Uniqueness

Figure 23. Visualization of uniqueness of the data in the column.

Column: VT_SNP3_COR

Table 31. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP3_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

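Table 32. Content analysis of the table. Column name (Column Name); range of length of characters (Range); arithmetic mean of values in column (Mean); lowest value in column (Minimum); first quartile of values in column (Q₁); median of values in column (Median); third quartile of values in column (Q₃); highest value in column (Maximum); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).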

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP3_COR	1 - 11	41.6077282200	0	28.01503806	45.97294899	57.47347696	89.34654632	366	10 ( 2.7% )	29 ( 7.9% )	0 ( 0.0% )	329 ( 89.9% )

Table 33. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP3_COR	97.27%	89.89%	0	62.4212059

Continuous Data Distribution

Figure 24. Distribution of values in the column.

Outliers

Figure 25. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 26. Visualization of completeness of the data in the column.

Uniqueness

Figure 27. Visualization of uniqueness of the data in the column.

Column: VT_SNP4_COR

Table 34. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP4_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

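Table 35. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).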

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP4_COR	1 - 11	54.7026960600	0	39.82283725	54.60962316	70.71881315	100	366	9 ( 2.5% )	1 ( 0.3% )	0 ( 0.0% )	347 ( 94.8% )

Table 36. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP4_COR	97.54%	94.81%	n/a	78.57805578

Continuous Data Distribution

Figure 28. Distribution of values in the column.

Outliers

Figure 29. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 30. Visualization of completeness of the data in the column.

Uniqueness

Figure 31. Visualization of uniqueness of the data in the column.

Column: VT_SNP5_COR

Table 37. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP5_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

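Table 38. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).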

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP5_COR	1 - 11	45.6969905400	0	37.76512023	46.67206997	55.08743249	100	366	13 ( 3.6% )	4 ( 1.1% )	0 ( 0.0% )	346 ( 94.5% )

Table 39. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP5_COR	96.45%	94.54%	n/a	60.13570854

Continuous Data Distribution

Figure 32. Distribution of values in the column.

Outliers

Figure 33. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 34. Visualization of completeness of the data in the column.

Uniqueness

Figure 35. Visualization of uniqueness of the data in the column.

Column: VT_SNP6_COR

Table 40. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP6_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

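Table 41. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).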

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP6_COR	1 - 11	42.7942635200	0	21.83436365	39.8599031	61.37781539	100	366	12 ( 3.3% )	19 ( 5.2% )	0 ( 0.0% )	329 ( 89.9% )

Table 42. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP6_COR	96.72%	89.89%	0	14.5315944

Continuous Data Distribution

Figure 36. Distribution of values in the column.

Outliers

Figure 37. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 38. Visualization of completeness of the data in the column.

Uniqueness

Figure 39. Visualization of uniqueness of the data in the column.

Column: VT_SNP7_COR

Table 43. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP7_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

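Table 44. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).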

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP7_COR	1 - 11	6.8668240200	0	0	0	8.42908684	58.8089182	366	9 ( 2.5% )	215 ( 58.7% )	0 ( 0.0% )	144 ( 39.3% )

Table 45. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP7_COR	97.54%	39.34%	0	26.20519373

Continuous Data Distribution

Figure 40. Distribution of values in the column.

Completeness

Figure 41. Visualization of completeness of the data in the column.

Uniqueness

Figure 42. Visualization of uniqueness of the data in the column.

Column: VT_SNP8_COR

Table 46. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	VT_SNP8_COR
Description	Not specified by the data provider.
Data type	Decimal number
Descriptor	DecimalNumber [UID:0.0.DCMLN314]
Descriptor description	Any of the rational or irrational numbers.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit	n/a

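Table 47. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).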

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
VT_SNP8_COR	1 - 11	45.8946702300	0	32.20754914	49.59756941	61.91043117	100	366	12 ( 3.3% )	31 ( 8.5% )	0 ( 0.0% )	310 ( 84.7% )

Table 48. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
VT_SNP8_COR	96.72%	84.70%	0	20.50869886

Continuous Data Distribution

Figure 43. Distribution of values in the column.

Outliers

Figure 44. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 45. Visualization of completeness of the data in the column.

Uniqueness

Figure 46. Visualization of uniqueness of the data in the column.

Column: sample ID

Table 49. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	sample ID
Description	Identifier of the sample.
Data type	String
Descriptor	dwc:materialSampleID [UID:0.0.MTRLS489]
Descriptor description	An identifier for a material sample.
Descriptor target IRI	http://rs.tdwg.org/dwc/terms/materialSampleID
Unit	n/a

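Table 50. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).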

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
sample ID	1 - 17	n/a	11D2	n/a	n/a	n/a	ZXMYKHSX	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	361 ( 98.6% )

Table 51. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
sample ID	100.00%	98.63%	?	TPMANUPA

Completeness

Figure 47. Visualization of completeness of the data in the column.

Uniqueness

Figure 48. Visualization of uniqueness of the data in the column.

Column: season

Table 52. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	season
Description	Not specified by the data provider. Presumably the season in which the sample was taken.
Data type	String
Descriptor	pms:season [UID:0.0.SSONA466]
Descriptor description	[...] any of the four arbitrary divisions of the year, characterized chiefly by differences in temperature, precipitation, amount of daylight, and plant growth [...]
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.SSONA466
Unit	n/a

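Table 53. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).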

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
season	6 - 6	n/a	autumn	n/a	n/a	n/a	summer	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	3 ( 0.8% )

Table 54. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
season	100.00%	0.82%	autumn	spring

Data Distribution Top 20

Figure 49. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 50. Visualization of completeness of the data in the column.

Uniqueness

Figure 51. Visualization of uniqueness of the data in the column.

Column: sent from

Table 55. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	sent from
Description	Not specified by the data provider. Presumably the country in which the sample was taken.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

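Table 56. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).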

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
sent from	5 - 15	n/a	Belgium	n/a	n/a	n/a	United Kingd…	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	14 ( 3.8% )

Table 57. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
sent from	100.00%	3.83%	The Netherlands	Greece

Data Distribution Top 20

Figure 52. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 53. Visualization of completeness of the data in the column.

Uniqueness

Figure 54. Visualization of uniqueness of the data in the column.

Column: partner

Table 58. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	partner
Description	Not specified by the data provider. Presumably the consortium partner who took the sample.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

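Table 59. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).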

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
partner	0 - 11	n/a	BSOUR	n/a	n/a	n/a	WR	366	41 ( 11.2% )	0 ( 0.0% )	0 ( 0.0% )	20 ( 5.5% )

Table 60. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
partner	88.80%	5.46%	MLU	T3 Greece

Completeness

Figure 55. Visualization of completeness of the data in the column.

Uniqueness

Figure 56. Visualization of uniqueness of the data in the column.

Column: general condition

Table 61. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	general condition
Description	Not specified by the data provider.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

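Table 62. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).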

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
general condition	0 - 65	n/a	0% bees comp…	n/a	n/a	n/a	very good -…	366	181 ( 49.5% )	0 ( 0.0% )	0 ( 0.0% )	13 ( 3.6% )

Table 63. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
general condition	50.55%	3.55%	n/a	great - 75% bees complete + "wt" + a lot of drones

Data Distribution Top 20

Figure 57. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 58. Visualization of completeness of the data in the column.

Uniqueness

Figure 59. Visualization of uniqueness of the data in the column.

Column: comments

Table 64. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	comments
Description	Annotations made by the data provider.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

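Table 65. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).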

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
comments	6 - 7	n/a	Tier 1	n/a	n/a	n/a	pilot B	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	4 ( 1.1% )

Table 66. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
comments	100.00%	1.09%	Tier 3	pilot B

Data Distribution Top 20

Figure 60. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 61. Visualization of completeness of the data in the column.

Uniqueness

Figure 62. Visualization of uniqueness of the data in the column.

Column: Varroa/100 bees

Table 67. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	Varroa/100 bees
Description	Not specified by the data provider. Presumably the Varroa infestation rate in the honey bee sample.
Data type	Decimal number
Descriptor	pms:varroaInfestationOfAdultBees [UID:0.0.VRRNF468]
Descriptor description	The quantity infestation rate of adult honey bee colonies with Varroa mites (Varroa destructor), measured by dislodging Varroa mites from adult honey bees, expressed in number of Varroa mites per unit of honey bees.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.VRRNF468
Unit	mites (100 bees)⁻¹

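Table 68. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).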

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
Varroa/100 bees	1 - 11	4.2480672700	0	0	0.84025322	3.5	176.9230769	366	6 ( 1.6% )	144 ( 39.3% )	0 ( 0.0% )	159 ( 43.4% )

Table 69. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Varroa/100 bees	98.36%	43.44%	0	1.752464403
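The unit given for this column, mites per 100 bees, implies a simple normalisation of a mite count by the number of bees in the sample. The sketch below shows that arithmetic only; the function name and the example counts are hypothetical and are not taken from the dataset or from the data provider's protocol.

```python
def mites_per_100_bees(mites: int, bees: int) -> float:
    """Infestation rate expressed as mites per 100 adult bees."""
    if bees <= 0:
        raise ValueError("bee count must be positive")
    return 100.0 * mites / bees


# Hypothetical counts, not from the dataset: 7 mites dislodged from 300 bees.
rate = mites_per_100_bees(7, 300)
print(round(rate, 2))  # 2.33
```

With this definition a rate above 100, such as the reported maximum of 176.9230769, is arithmetically possible whenever a sample contains more mites than bees.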

Continuous Data Distribution

Figure 63. Distribution of values in the column.

Outliers

Figure 64. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 65. Visualization of completeness of the data in the column.

Uniqueness

Figure 66. Visualization of uniqueness of the data in the column.

Column: Country

Table 70. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	Country
Description	Not specified by the data provider. Presumably the ISO 3166-1 alpha-2 country code of the country in which the sample was taken.
Data type	String
Descriptor	iso-3166:alpha-2CountryCode [UID:0.0.LPHCN4]
Descriptor description	A two-letter code that represents a country name, recommended as the general purpose code.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.LPHCN4
Unit	n/a

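Table 71. Descriptive analysis of the table. Column name (Column Name); range of value lengths in characters (Range); arithmetic mean (Mean); smallest value (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); largest value (Maximum); number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).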

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
Country	2 - 2	n/a	BE	n/a	n/a	n/a	UK	366	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	14 ( 3.8% )

Table 72. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
Country	100.00%	3.83%	NL	GR
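Because the column is only presumed to hold ISO 3166-1 alpha-2 codes, a reuser may want to check the values before joining them to other sources. The sketch below is one possible check, not the validation applied by EUPH: it tests the two-letter format and maps "UK", which is exceptionally reserved in ISO 3166-1, to the officially assigned code "GB". The mapping table and function name are assumptions introduced for illustration, and the check does not verify that a code is actually assigned in the standard.

```python
import re

# "UK" is exceptionally reserved in ISO 3166-1; the officially assigned
# alpha-2 code for the United Kingdom is "GB". Illustrative mapping only.
NON_STANDARD = {"UK": "GB"}


def check_country_code(code: str) -> str:
    """Return the code to use, replacing known non-standard variants."""
    if not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError(f"not a two-letter uppercase code: {code!r}")
    return NON_STANDARD.get(code, code)


# Codes that appear in Tables 71 and 72 of this report.
for code in ("BE", "NL", "GR", "UK"):
    print(code, "->", check_country_code(code))
```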

Data Distribution Top 20

Figure 67. Distribution of 20 most common values, from highest to lowest.

Completeness

Figure 68. Visualization of completeness of the data in the column.

Uniqueness

Figure 69. Visualization of uniqueness of the data in the column.
