EU Pollinator Hub

Dataset Report
Unique identifier: VRRMN16.0.0
Title: Varroa monitoring Austria
Long title: Results from a Varroa monitoring program in various apiaries in Austria
Status: Quality Validated
Current Version: v. 1.0
Published: 2023-12-13
Reviewed by:
Citation proposal:
Biene Österreich – Imkereidachverband (2023). Report of dataset Varroa monitoring Austria, v. 1.0 [VRRMN16.0.0]. EU Pollinator Hub. [2025-11-18] app.pollinatorhub.eu
Compliance with FAIR* principles
Findable
Accessible
Interoperable
Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles
Data Quality
Under evaluation

Document history

Release

Version v. 1.0 released on 2023-12-13.

Revision

Table 1. List of revisions made to the document. Identifier of revision (No); date of revision (Date); description of revision (Description); reason for revision (Reason).
No Date Description Reason
1 2023-12-13 00:12:00 Initial release. n/a

Abbreviations

No abbreviations.

Executive summary

Data overview:

n/a

Data value:

n/a

Data description:

n/a

Data application:

n/a

Unresolved issues:

n/a

Introduction

n/a

Material and methods

Data acquisition

n/a
Table 2. List of raw data and metadata files included in the dataset. Identifier of table row (No); name of the file (File); the type of the file (Type); file contains data (D); file contains metadata (M); date of upload of the file to the EU Pollinator Hub (Arrival); number of data points contained within the file, if applicable (Data points); uploaded file size (File size).
No File Type D M Arrival Data points File size
1 hive.csv CSV - Comma separated values Yes No 2023-12-09 09:12:40 4,232 21.50 KiB
2 station.csv CSV - Comma separated values Yes No 2023-12-09 09:12:07 365 2.78 KiB
3 user.csv CSV - Comma separated values Yes No 2023-12-09 09:12:43 198 906.00 B
4 varroa_sampling.csv CSV - Comma separated values Yes No 2023-12-09 09:12:09 77,868 498.51 KiB
5 weather.csv CSV - Comma separated values Yes No 2025-08-21 08:08:20 14,583,965 209.89 MiB
6 yard.csv CSV - Comma separated values Yes No 2023-12-09 10:12:52 968 5.22 KiB
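
The data point counts in Table 2 appear to equal the number of records multiplied by the number of columns reported for each table later in this report. The following sketch checks that reading; the definition "one data point per cell" is an assumption inferred from the figures, not something the report states.

```python
# Records, columns and data points per file, copied from the tables of this
# report (not read from the files themselves).
reported = {
    "hive.csv": (2_116, 2, 4_232),
    "station.csv": (73, 5, 365),
    "user.csv": (99, 2, 198),
    "varroa_sampling.csv": (11_124, 7, 77_868),
    "weather.csv": (1_325_815, 11, 14_583_965),
    "yard.csv": (242, 4, 968),
}


def data_points(rows: int, columns: int) -> int:
    """Assumed definition: one data point per cell (rows x columns)."""
    return rows * columns


# Collect every file whose reported count differs from rows x columns.
mismatches = {
    name: (rows, cols, points)
    for name, (rows, cols, points) in reported.items()
    if data_points(rows, cols) != points
}
print(mismatches)  # an empty dict means all files agree with the assumption
```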

Data preparation

n/a

Data validation

n/a

Data analysis

n/a

Data description

Dataset

Table 3. Summary of tables belonging to the dataset. Table row identifier (No); name of the table (Table); description of the table (Description).
No Table Description
1 hive The table maps the relationship between beekeepers (anonymised users of the web application which is used by beekeepers to provide…
2 station The table contains geographic information of the NOAA weather stations contained in the dataset. There are 73 unique weather stations…
3 user The table contains the number of samples (Varroa infestation data of beehives) that were provided by each single user. The…
4 Varroa sampling This table contains data on Varroa infestation levels (the number of varroa mites found in the sampling event) measured in…
5 weather The combined hourly weather data collected from 73 weather stations around Austria for the 8 years, total ~1.3 million rows.…
6 yard The table contains data on the apiaries (yards) at which the beehives for which the Varroa samples were obtained, were…
Table 4. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.0.0
Title Varroa monitoring Austria
Long title Results from a Varroa monitoring program in various apiaries in Austria
Target IRI https://app.pollinatorhub.eu/dataset-discovery/VRRMN16.0.0
Licence CC BY-SA 4.0
DOI n/a
Created 2022-03-14
Published 2023-12-13
Contact n/a
Keywords Austria, Varroa destructor, monitoring
Data collection years 2012-2020
Regions the data was collected in Österreich (Austria)
Abstract

An eight-year survey of Varroa destructor infestation rates of western honey bee (Apis mellifera) colonies across Austria and the spatial dimension, temporal dimension and weather factors that impact these infestation rates.

Table 5. Standardised metadata of the data provider Biene Österreich – Imkereidachverband. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Name Biene Österreich – Imkereidachverband
Url
Acronym
IRI https://app.pollinatorhub.eu/data-providers/boe
Address Georg-Coch Platz 3/11a, 1010 Wien, Austria
Country Austria
Contact Georg-Coch Platz 3/11a, 1010 Wien www.biene-oesterreich.at office@biene-oesterreich.at
Description

The Austrian Beekeepers Federation (Biene Österreich-Imkereidachverband) is the umbrella organisation of the two largest beekeeping associations in Austria, the Austrian Beekeepers Association (ÖIB, Österreichischer Imkerbund) and the Austrian Professional Beekeepers Association (ÖEIB, Österreichischer Erwerbsimkerbund).

Tables

hive

Table 6. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.HIVEA141.0
Name hive
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.HIVEA141.0
Table Type File
Licence CC BY-SA 4.0
Description

The table maps the relationship between beekeepers (anonymised users of the web application through which beekeepers report Varroa infestation data for their bee yards) and the hives for which they reported Varroa infestation levels. There are 99 unique user_id values and 2,116 unique hive_id values.
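
A minimal sketch of how the user-to-hive mapping could be explored once the file is loaded. The rows below are hypothetical and only mimic the two columns of hive.csv (hive_id, user_id).

```python
from collections import defaultdict

# Hypothetical rows in the shape of hive.csv: (hive_id, user_id).
rows = [(1, 501), (2, 501), (3, 502), (4, 503)]

# Group hive identifiers by the user who reported them.
hives_by_user = defaultdict(set)
for hive_id, user_id in rows:
    hives_by_user[user_id].add(hive_id)

n_users = len(hives_by_user)
n_hives = len({hive_id for hive_id, _ in rows})
print(n_users, n_hives)
```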

Metadata

n/a
Table 7. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
hive_id The hive identifier Integer number pms:beehiveID [0.0.HVEID216] n/a
user_id The user identifier Integer number pms:userID [0.0.SERID483] n/a

Metadata of individual tables can be found in Annex 1.

Descriptive measures

Table 8. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
hive_id 1 - 4 1,187.7 1 562.25 1,187.5 1,788.75 2,501 2,116 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,116 ( 100.0% )
user_id 2 - 4 8,001.2 10 8,383 8,418 8,509 9,128 2,116 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 103 ( 4.9% )

Quality measures

Table 9. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
hive_id 100.00% 100.00% 1 1
user_id 100.00% 4.87% 8418 8310
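
The completeness and uniqueness percentages in the quality tables are consistent with the counts in the content analysis tables. The sketch below reproduces them under the assumption that completeness is the share of non-missing values and uniqueness is the share of distinct values; the report does not state these formulas explicitly.

```python
def completeness(total: int, missing: int) -> float:
    """Assumed formula: percentage of values that are not missing."""
    return round(100 * (total - missing) / total, 2)


def uniqueness(total: int, distinct: int) -> float:
    """Assumed formula: percentage of distinct values among all records."""
    return round(100 * distinct / total, 2)


# Counts for the hive table, taken from Table 8.
print(completeness(2116, 0))    # hive_id and user_id have no missing values
print(uniqueness(2116, 2116))   # hive_id
print(uniqueness(2116, 103))    # user_id
```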

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

station

Table 10. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.STTNA142.0
Name station
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.STTNA142.0
Table Type File
Licence CC BY-SA 4.0
Description

The table contains geographic information on the NOAA weather stations included in the dataset. There are 73 unique weather stations, located between latitudes 46.617 and 48.683 and longitudes 9.617 and 16.600. Because this data is publicly available, the coordinates are not blurred. The stations lie at elevations between 153 meters and 1210 meters above sea level. 90% of the yard elevations are within 300 meters of the weather station elevation. This corresponds to a difference in air temperature of between 0 and 6 degrees Celsius, which can be calculated from the data provided to improve the accuracy of an analysis.
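
The report does not say how the temperature difference between a yard and its weather station should be calculated. One possible approach is a constant lapse rate applied to the elevation difference, sketched below. The function name and the default lapse rate of 6.5 degrees Celsius per 1000 m are assumptions for illustration only, and the result depends strongly on the rate chosen.

```python
def adjust_temperature(
    station_temp_c: float,
    station_elevation_m: float,
    yard_elevation_m: float,
    lapse_rate_c_per_m: float = 0.0065,  # assumed value, not from the report
) -> float:
    """Estimate air temperature at the yard from the station reading.

    A yard above the station is assumed to be cooler, a yard below warmer.
    """
    elevation_difference = yard_elevation_m - station_elevation_m
    return station_temp_c - lapse_rate_c_per_m * elevation_difference


# Hypothetical example: the yard lies 300 m above its station.
print(adjust_temperature(15.0, 500.0, 800.0))
```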

Metadata

n/a
Table 11. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
station_id The NOAA weather station identifier Integer number Integer [0.0.NTGER313] n/a
station_title The NOAA weather Station Name String Text [0.0.TEXTA315] n/a
latitude Latitude coordinates of the station in decimal degrees in WGS84 standard. Decimal number dwc:decimalLatitude [0.0.LTTDE333] °
longitude Longitude coordinates of the station in decimal degrees in WGS84 standard. Decimal number dwc:decimalLongitude [0.0.LNGTD332] °
station_elevation Meters above Sea Level Decimal number pms:heightAboveMeanSeaLevel [0.0.HGHTB393] m

Metadata of individual tables can be found in Annex 1.
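
Since the station coordinates are given in WGS84 decimal degrees, a location can be matched to its closest station with a great-circle distance. The sketch below uses the haversine formula. The report does not state which distance measure was used for the original matching, and the station identifiers and query points here are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


# Hypothetical stations: (station_id, latitude, longitude).
stations = [(1, 48.683, 16.600), (2, 46.617, 9.617)]


def closest_station(lat: float, lon: float, candidates=stations) -> int:
    """Return the identifier of the station nearest to the given point."""
    return min(candidates, key=lambda s: haversine_km(lat, lon, s[1], s[2]))[0]


print(closest_station(48.2, 16.4))
```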

Descriptive measures

Table 12. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_id 6 - 6 112,055.1 110,010 110,825 112,200 113,030 113,900 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )
station_title 4 - 26 n/a ALBERSCHWEND… n/a n/a n/a ZELTWEG/AUTO… 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )
latitude 2 - 6 47.5721 46.617 47.075 47.45 48.175 48.683 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 56 ( 76.7% )
longitude 2 - 6 14.4541 9.617 13.558 14.744 15.7665 16.6 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 66 ( 90.4% )
station_elevation 3 - 6 536.72 153 306.05 486 715.7 1,209.7 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )

Quality measures

Table 13. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_id 100.00% 100.00% 110010 110010
station_title 100.00% 100.00% WOLFSEGG WOLFSEGG
latitude 100.00% 76.71% 48.567 48.1
longitude 100.00% 90.41% 16.367 13.667
station_elevation 100.00% 100.00% 615.6 615.6

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

user

Table 14. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.USERA143.0
Name user
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.USERA143.0
Table Type File
Licence CC BY-SA 4.0
Description

The table contains the number of samples (Varroa infestation data of beehives) provided by each user. The total number of samples collected is 11,124. It is important to note that there is a strong bias in the origin of the samples. A single user provided 27% of the samples in this dataset. About 53% of the samples are derived from 22 users who each provided 100 to 999 samples. 18% of the samples are from a group of 56 users who each provided 10 to 99 samples. 1% of the samples were given by 24 users who entered fewer than 10 samples each.
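
The contributor bias described above can be quantified by grouping users into the same sample-count classes and summing their samples. This is a minimal sketch; the user identifiers and counts are hypothetical and stand in for the user_id and samples columns of user.csv.

```python
from collections import Counter


def bucket(samples: int) -> str:
    """Assign a user to a contribution class by number of samples."""
    if samples < 10:
        return "1-9"
    if samples < 100:
        return "10-99"
    if samples < 1000:
        return "100-999"
    return "1000+"


# Hypothetical data: user_id -> number of samples provided.
samples_per_user = {1: 1200, 2: 250, 3: 42, 4: 5}

total = sum(samples_per_user.values())
share = Counter()
for user_id, n in samples_per_user.items():
    share[bucket(n)] += n

# Percentage of all samples contributed by each class.
percent = {name: round(100 * n / total, 1) for name, n in share.items()}
print(percent)
```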

Metadata

n/a
Table 15. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
user_id The user identifier Integer number pms:userID [0.0.SERID483] n/a
samples Total numbers of samples provided by a user Integer number Integer [0.0.NTGER313] no.

Metadata of individual tables can be found in Annex 1.

Descriptive measures

Table 16. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
user_id 2 - 4 8,403.9 10 8,354 8,411 8,617 9,128 99 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 99 ( 100.0% )
samples 1 - 4 111.6 1 10 25 96 3,058 99 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 65 ( 65.7% )

Quality measures

Table 17. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
user_id 100.00% 100.00% 10 10
samples 100.00% 65.66% 10 3058

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

Varroa sampling

Table 18. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.VRRSM144.0
Name Varroa sampling
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.VRRSM144.0
Table Type File
Licence CC BY-SA 4.0
Description

This table contains data on Varroa infestation levels (the number of Varroa mites found in the sampling event) measured in individual hives at a given apiary at a given time with a given quality standard. Varroa samples were collected from citizen-provided and mined data from 3 sources using a standard method between 2012 and 2020. The data contains 11,124 Varroa sampling events. Roughly 21% of these events record zero mites. The highest number of mites recorded in a single sampling event is 5,016. Sampling events were collected from 04/02/12 to 11/11/20 and last 7.2 days each on average (range: 1.0-23.8 days). Roughly 75% of Varroa sampling events last between 3 and 9 days.

Data on mite infestation levels were collected from 3 sources by a standard method (natural mite fall) from 2012 to 2020, mainly in spring, early summer, and late summer. The 3 sources differ in quality. Data of the highest quality, marked quality_control=2, were examined with the BeeVS diagnostic system (Apisfero, Turin, Italy), which consists of a high-resolution scanner that takes a picture of the samples (sticky boards placed under the brood nest of colonies) and cloud-based software that counts the number of mites on the sticky boards. Data from the intermediate source, marked quality_control=1, were examined manually by a trained group. Data from the poorest-quality source, marked quality_control=0, were examined manually by untrained individuals according to a classification scheme. Data were entered via a web terminal by whoever analyzed the sample. The software vetted the data for plausibility (rejecting values that exceed 100 mites/day) and completeness (rejecting values whose measuring interval did not fall between 3 and 21 days). Data exceeding these limits, which can be found in the dataset, were imported from external resources and approved by the supervisor. The data collected by untrained individuals were checked by the supervisor for plausibility.
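
The plausibility and completeness rules described above can be expressed as a small check. This is a sketch, not the vetting code of the web application: the function name is hypothetical, the interval bounds are assumed to be inclusive, and varroa_count is assumed to be the total count over the interval so that it must be divided by the interval length to obtain mites per day.

```python
def vet_sample(varroa_count: float, interval_days: float) -> list[str]:
    """Return the list of rules a sample violates (empty if it passes)."""
    problems = []
    # Completeness: the measuring interval must lie between 3 and 21 days.
    if not 3 <= interval_days <= 21:
        problems.append("interval outside 3-21 days")
    # Plausibility: the daily mite fall must not exceed 100 mites/day.
    if interval_days > 0 and varroa_count / interval_days > 100:
        problems.append("more than 100 mites/day")
    return problems


print(vet_sample(70, 7))     # passes both rules
print(vet_sample(5016, 7))   # fails the plausibility rule
print(vet_sample(10, 1))     # fails the completeness rule
```

Samples flagged this way are not necessarily errors, since the dataset also holds externally imported records that exceed these limits and were approved by the supervisor.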

From 2012 to 2016 the project was implemented only in the Austrian province of Styria, where approximately 3,500 beekeepers supervised 53,000 to 56,000 honeybee colonies. In 2017 the crowdsourcing initiative was extended to all nine Austrian provinces, comprising 28,032 to 30,237 beekeepers with 329,402 to 390,607 honeybee colonies in their care.

The total number of samples collected is 11,124. Of these, 3,824 (34%) were low-quality samples (QC=0), 4,033 (36%) were medium-quality samples (QC=1) and 3,267 (29%) were high-quality samples (QC=2).

The Varroa survey dataset includes 99 users (beekeepers), 242 bee yards (apiaries), and 2,116 hives from the nine Austrian provinces, for a total of 11,124 records pertaining to Varroa infestation.

Metadata

n/a
Table 19. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
sampling_id The sampling event identifier String dwc:materialSampleID [0.0.MTRLS489] n/a
date_from The first date (year, month, day) and time (hours, minutes) of the sampling event String Text [0.0.TEXTA315] n/a
date_to The final date (year, month, day) and time (hours, minutes) of the sampling event String Text [0.0.TEXTA315] n/a
varroa_count The number of varroa mites found in the sampling event Decimal number pms:naturalVarroaMiteFall [0.0.NMBRF371] mites d-1
quality_control The quality level of the sample collected (2 = examined with the BeeVS diagnostic system; 1 = examined manually by a trained group; 0 = examined manually by untrained individuals) Integer number Integer [0.0.NTGER313] n/a
hive_id The hive identifier String pms:beehiveID [0.0.HVEID216] n/a
yard_id The yard identifier String pms:apiaryID [0.0.PRYID342] n/a

Metadata of individual tables can be found in Annex 1.
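
Because date_from and date_to are stored as text, the duration of a sampling event has to be derived by parsing both values. The sketch below assumes a month/day/two-digit-year format with a 24-hour time, which is inferred from values such as "11/17/17 15:20" in the quality analysis; the function name and the example date pairs are hypothetical.

```python
from datetime import datetime

# Assumed text format of date_from and date_to, e.g. "4/4/20 8:00".
DATE_FORMAT = "%m/%d/%y %H:%M"


def duration_days(date_from: str, date_to: str) -> float:
    """Length of a sampling event in days, from its start and end strings."""
    start = datetime.strptime(date_from, DATE_FORMAT)
    end = datetime.strptime(date_to, DATE_FORMAT)
    return (end - start).total_seconds() / 86400


# Hypothetical sampling event lasting one week.
print(duration_days("4/4/20 8:00", "4/11/20 8:00"))
```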

Descriptive measures

Table 20. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
sampling_id 1 - 5 5,788.6 1 2,953.25 5,828.5 8,625.75 11,427 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 11,124 ( 100.0% )
date_from 11 - 14 n/a 1/1/19 13:00 n/a n/a n/a 9/9/18 15:30 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 1,903 ( 17.1% )
date_to 11 - 14 n/a 1/12/18 16:0… n/a n/a n/a 9/9/20 18:00 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,227 ( 20.0% )
varroa_count 1 - 4 28.3 0 1 4 16 5,016 11,124 0 ( 0.0% ) 2,301 ( 20.7% ) 0 ( 0.0% ) 387 ( 3.5% )
quality_control 1 - 1 0.9 0 0 1 2 2 11,124 0 ( 0.0% ) 3,824 ( 34.4% ) 0 ( 0.0% ) 3 ( 0.0% )
hive_id 1 - 4 820.6 1 289 715 1,199 2,501 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,116 ( 19.0% )
yard_id 2 - 3 370.5 73 194 391 522 664 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 242 ( 2.2% )

Quality measures

Table 21. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
sampling_id 100.00% 100.00% 1 1
date_from 100.00% 17.11% 4/4/20 8:00 11/17/17 15:20
date_to 100.00% 20.02% 4/4/20 11:00 4/20/17 17:20
varroa_count 100.00% 3.48% 0 700
quality_control 100.00% 0.03% 1 2
hive_id 100.00% 19.02% 943 431
yard_id 100.00% 2.18% 87 404

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

weather

Table 22. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.WTHER145.0
Name weather
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.WTHER145.0
Table Type File
Licence CC BY-SA 4.0
Description

The table contains the combined hourly weather data collected from 73 weather stations around Austria over the 8 years, approximately 1.3 million rows in total.

Weather data is derived from NOAA, using Integrated Surface Data Lite (ISD-Lite). The ISD-Lite data contains a formatted subset of the complete Integrated Surface Data (ISD) for a number of elements. The data are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program according to WMO Resolution 40 (Cg-XII). The data for the Austrian weather stations were filtered from ftp://ftp.ncei.noaa.gov/pub/data/noaa/ by unique USAF, WBAN, and year. The hourly values of temperature, dew point, wind speed, pressure, and precipitation have been retained in the dataset in their original metric units. Each bee yard has been matched to the closest weather station. The dataset includes 73 weather stations, hourly values for 2012-2020, and 1.3 million records.
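
For readers who want to trace the weather table back to its source, the sketch below parses one ISD-Lite record into the column names used in this dataset. The field order, the scaling factor of 10, the missing-value code -9999 and the trace-precipitation code -1 are assumptions based on the ISD-Lite format description rather than on this report, and the sample line is hypothetical. Keeping the trace code unscaled is also an assumption, chosen because the precipitation columns in this dataset report a minimum of -1.

```python
MISSING = -9999  # assumed ISD-Lite missing-value code

# (column name, scaling factor) in the assumed ISD-Lite field order that
# follows year, month, day and hour.
FIELDS = [
    ("air_temp", 10), ("dew_point", 10), ("pressure", 10), ("wind_dir", 1),
    ("wind_spd", 10), ("sky_cond", 1), ("precip_1hr", 10), ("precip_6hr", 10),
]


def parse_isd_lite_line(line: str) -> dict:
    """Convert one whitespace-separated ISD-Lite record into a dict."""
    parts = [int(p) for p in line.split()]
    year, month, day, hour = parts[:4]
    record = {"date": f"{year:04d}-{month:02d}-{day:02d}", "hour": hour}
    for (name, scale), raw in zip(FIELDS, parts[4:]):
        if raw == MISSING:
            record[name] = None            # no observation
        elif name.startswith("precip") and raw == -1:
            record[name] = -1              # assumed trace marker, left unscaled
        else:
            record[name] = raw / scale if scale != 1 else raw
    return record


# Hypothetical record for illustration.
sample = "2017 12 01 06    14    -5 10175   220    20     8    -1 -9999"
print(parse_isd_lite_line(sample))
```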

Metadata

n/a
Table 23. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
station_id Integer number pms:recordID [0.0.RCRDD344] n/a
date Date iso-8601:calendarDate [0.0.DATEA317] n/a
hour String iso-8601:clock hour [0.0.HRFDY386] n/a
air_temp Decimal number DecimalNumber [0.0.DCMLN314] n/a
dew_point Decimal number DecimalNumber [0.0.DCMLN314] n/a
pressure Decimal number pms:atmosphericPressure [0.0.TMSPH396]
wind_dir Integer number pms:windDirection [0.0.WNDDR475] °
wind_spd Decimal number pms:windSpeed [0.0.WNDSP474] m s-1
sky_cond Integer number Integer [0.0.NTGER313] n/a
precip_1hr Decimal number DecimalNumber [0.0.DCMLN314] n/a
precip_6hr Decimal number DecimalNumber [0.0.DCMLN314] n/a

Metadata of individual tables can be found in Annex 1.

Descriptive measures

Table 24. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_id 6 - 6 111,896.2 110,010 110,600 112,130 112,960 113,900 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 72 ( 0.0% )
date 10 - 10 2,018.0 2012-01-01 2,017 2,018 2,019 2020-12-07 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 3,244 ( 0.2% )
hour 14 - 14 n/a 00:00:00+00:… n/a n/a n/a 23:00:00+00:… 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 24 ( 0.0% )
air_temp 1 - 5 10.15 -23 3.4 10 16.6 38.6 1,325,815 3,907 ( 0.3% ) 8,733 ( 0.7% ) 0 ( 0.0% ) 593 ( 0.0% )
dew_point 1 - 5 5.28 -31 0.2 5.4 11.2 27.5 1,325,815 4,974 ( 0.4% ) 8,678 ( 0.7% ) 0 ( 0.0% ) 503 ( 0.0% )
pressure 3 - 6 1,017.71 943.9 1,012.6 1,017.5 1,022.8 1,050.9 1,325,815 228,763 ( 17.3% ) 0 ( 0.0% ) 0 ( 0.0% ) 714 ( 0.1% )
wind_dir 1 - 3 194.9 0 90 220 290 360 1,325,815 53,660 ( 4.0% ) 104,854 ( 7.9% ) 0 ( 0.0% ) 38 ( 0.0% )
wind_spd 1 - 4 2.24 0 1 2 3 28 1,325,815 102,751 ( 7.8% ) 61,122 ( 4.6% ) 0 ( 0.0% ) 60 ( 0.0% )
sky_cond 1 - 1 5.0 0 2 6 8 9 1,325,815 1,175,710 ( 88.7% ) 18,374 ( 1.4% ) 0 ( 0.0% ) 11 ( 0.0% )
precip_1hr 1 - 4 0.23 -1 -1 0.1 0.5 69 1,325,815 1,148,579 ( 86.6% ) 34,978 ( 2.6% ) 0 ( 0.0% ) 65 ( 0.0% )
precip_6hr 1 - 4 0.94 -1 0 0 0.3 99 1,325,815 1,250,100 ( 94.3% ) 47,825 ( 3.6% ) 0 ( 0.0% ) 91 ( 0.0% )

Quality measures

Table 25. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_id 100.00% 0.01% 112400 112280
date 100.00% 0.24% 2017-12-01 2015-04-08
hour 100.00% 0.00% 06:00:00+00:00 02:00:00+00:00
air_temp 99.71% 0.04% 14 37.5
dew_point 99.62% 0.04% 0 25.9
pressure 82.75% 0.05% null 976.8
wind_dir 95.95% 0.00% 360 10
wind_spd 92.25% 0.00% 1 24
sky_cond 11.32% 0.00% 8 9
precip_1hr 13.37% 0.00% null 69
precip_6hr 5.71% 0.01% null 49

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

yard

Table 26. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).
Parameter Content
Unique identifier VRRMN16.YARDA146.0
Name yard
Target IRI https://app.pollinatorhub.eu/dataset-discovery/parts/VRRMN16.YARDA146.0
Table Type File
Licence CC BY-SA 4.0
Description

The table contains data on the apiaries (yards) at which the beehives from which the Varroa samples were obtained were kept at the time of sampling. There are 242 unique yard_id values. Each bee yard has been matched to the closest weather station, and yards connect to the weather table through that station.
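
The link between yards and weather records runs through station_id, which appears in both the yard and weather tables. A minimal sketch of that lookup follows; all rows are hypothetical and only mirror the column names of the two tables.

```python
# Hypothetical rows in the shape of yard.csv and weather.csv.
yards = [
    {"yard_id": 1, "station_id": 100},
    {"yard_id": 2, "station_id": 200},
]
weather = [
    {"station_id": 100, "date": "2018-06-01", "hour": 12, "air_temp": 21.5},
    {"station_id": 100, "date": "2018-06-01", "hour": 13, "air_temp": 22.0},
    {"station_id": 200, "date": "2018-06-01", "hour": 12, "air_temp": 18.0},
]

# Each yard is matched to exactly one station.
station_of_yard = {y["yard_id"]: y["station_id"] for y in yards}


def weather_for_yard(yard_id: int) -> list[dict]:
    """All weather records of the station matched to the given yard."""
    station_id = station_of_yard[yard_id]
    return [w for w in weather if w["station_id"] == station_id]


print(len(weather_for_yard(1)))
```

For the full weather table, indexing the records by station_id once would avoid scanning all rows on every lookup.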

Metadata

n/a
Table 27. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Column Name Column Description Datatype Descriptor Unit
yard_id

The yard identifier

Integer number Integer [0.0.NTGER313]

n/a

elevation

Meters above Sea Level rounded to the nearest meter

Integer number pms:heightAboveMeanSeaLevel [0.0.HGHTB393]

m

nuts

NUTS is a geocode standard for referencing the administrative divisions of countries for statistical purposes.

  • AT1 - East Austria; Burgenland (AT11), Lower Austria (AT12), Vienna (AT13)
  • AT2 - South Austria; Carinthia (AT21), Styria (AT22)
  • AT3 - West Austria; Upper Austria (AT31), Salzburg (AT32), Tyrol (AT33), Vorarlberg (AT34)

The current Nomenclature of Territorial Units for Statistics (NUTS) adopted by the European Union (Commission Delegated Regulation 2019/1755) is applied.

String eurostat:nuts2021Code [0.0.NTSCD55]

n/a

station_id

The NOAA weather station identifier

Integer number Integer [0.0.NTGER313]

n/a

Metadata of individual tables can be found in Annex 1.
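Because NUTS codes are hierarchical, the five-character codes stored in the nuts column (for example AT221) contain their NUTS 1 and NUTS 2 parents as prefixes. The following sketch shows one way to derive them. The function name split_nuts3 and the lookup dictionary are hypothetical helpers, and the check assumes that all codes are Austrian NUTS 3 codes, as the reported value range (AT111 to AT342) suggests.

```python
# NUTS 1 regions of Austria as listed in the column description.
NUTS1_NAMES = {
    "AT1": "East Austria",
    "AT2": "South Austria",
    "AT3": "West Austria",
}


def split_nuts3(code):
    """Split a 5-character NUTS 3 code into its NUTS 1, 2 and 3 levels."""
    if len(code) != 5 or not code.startswith("AT"):
        raise ValueError(f"not an Austrian NUTS 3 code: {code!r}")
    return {"nuts1": code[:3], "nuts2": code[:4], "nuts3": code}


levels = split_nuts3("AT221")
region = NUTS1_NAMES[levels["nuts1"]]
```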

Descriptive measures

Table 28. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
yard_id 2 - 3 446.0 73 370.75 453.5 586.25 664 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 242 ( 100.0% )
elevation 3 - 4 510.0 150 324 450 637.75 1,413 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 164 ( 67.8% )
nuts 5 - 5 n/a AT111 n/a n/a n/a AT342 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 32 ( 13.2% )
station_id 6 - 6 111,866.0 110,010 110,600 111,750 112,960 113,900 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 30.2% )
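The descriptive measures in the table above can be recomputed from a column of values. The sketch below is a minimal illustration. The report does not state which quartile method was used; linear interpolation (method="inclusive") is an assumption that is consistent with fractional quartiles such as 370.75 but has not been verified against the data. The function name describe and the sample values are illustrative only.

```python
import statistics


def describe(values):
    """Return mean, quartiles, extremes and counts for a numeric column."""
    ordered = sorted(values)
    # Assumption: quartiles by linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "total": len(ordered),
        "zero": sum(1 for value in ordered if value == 0),
        "distinct": len(set(ordered)),
    }


# Illustrative values, not the dataset itself.
summary = describe([150, 324, 450, 450, 1413])
```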

Quality measures

Table 29. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
yard_id 100.00% 100.00% 73 73
elevation 100.00% 67.77% 450 194
nuts 100.00% 13.22% AT221 AT314
station_id 100.00% 30.17% 111750 112440
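The reported uniqueness values agree with the ratio of distinct values to total records in Table 28 (for elevation, 164 of 242). The sketch below works under that reading. It is inferred from the numbers and is not a documented definition of the tool that generated the report; the function names are hypothetical.

```python
def completeness(values):
    """Share of non-missing (non-None) values, in percent."""
    present = sum(1 for value in values if value is not None)
    return 100.0 * present / len(values)


def uniqueness(values):
    """Share of distinct values (None counts as a value), in percent."""
    return 100.0 * len(set(values)) / len(values)


# Elevation column: 164 distinct values among 242 records (Table 28).
elevation_uniqueness = round(100.0 * 164 / 242, 2)
```

Under this reading, identifier columns such as yard_id reach 100.00% uniqueness, while low values indicate heavily repeated values.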

Changes made to preparatory file

n/a

Changes made to data

n/a

Unresolved issues

n/a

References

  1. Rubinigg M., MacDonald M., Davenport V., Hassler E., Hassan A., Shala-Mayrhofer V. et al. 2023 Predicting Varroa: Longitudinal Data, Micro Climate, and Proximity Closeness Useful for Predicting Varroa Infestations (I1.A1). Data & Analytics for Good. [2023-11-4] data-for-good.pubpub.org

Annex 1: Table column reports

Table: hive

Column: hive_id

Table 30. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name hive_id
Description

The hive identifier

Data type Integer number
Descriptor pms:beehiveID [UID:0.0.HVEID216]
Descriptor description

Unique sequence of characters associated with a beehive, which is specific to a dataset, to an apiary or to a beekeeper.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.HVEID216
Unit

n/a

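Table 31. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).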
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
hive_id 1 - 4 1,187.7 1 562.25 1,187.5 1,788.75 2,501 2,116 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,116 ( 100.0% )
Table 32. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
hive_id 100.00% 100.00% 1 1

Continuous Data Distribution

Figure 1. Distribution of values in the column.

Outliers

Figure 2. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 3. Visualization of completeness of the data in the column.

Uniqueness

Figure 4. Visualization of uniqueness of the data in the column.

Column: user_id

Table 33. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name user_id
Description

The user identifier

Data type Integer number
Descriptor pms:userID [UID:0.0.SERID483]
Descriptor description

A user is a person who utilizes a computer or network service. A user often has a user account and is identified to the system by a username (or user name).

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.SERID483
Unit

n/a

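Table 34. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).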
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
user_id 2 - 4 8,001.2 10 8,383 8,418 8,509 9,128 2,116 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 103 ( 4.9% )
Table 35. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
user_id 100.00% 4.87% 8418 8310

Continuous Data Distribution

Figure 5. Distribution of values in the column.

Outliers

Figure 6. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 7. Visualization of completeness of the data in the column.

Uniqueness

Figure 8. Visualization of uniqueness of the data in the column.

Table: station

Column: station_id

Table 36. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name station_id
Description

The NOAA weather station identifier

Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

n/a

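Table 37. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).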
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_id 6 - 6 112,055.1 110,010 110,825 112,200 113,030 113,900 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )
Table 38. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_id 100.00% 100.00% 110010 110010

Continuous Data Distribution

Figure 9. Distribution of values in the column.

Outliers

Figure 10. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 11. Visualization of completeness of the data in the column.

Uniqueness

Figure 12. Visualization of uniqueness of the data in the column.

Column: station_title

Table 39. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name station_title
Description

The NOAA weather station name

Data type String
Descriptor Text [UID:0.0.TEXTA315]
Descriptor description

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit

n/a

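Table 40. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).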
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_title 4 - 26 n/a ALBERSCHWEND… n/a n/a n/a ZELTWEG/AUTO… 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )
Table 41. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_title 100.00% 100.00% WOLFSEGG WOLFSEGG

Completeness

Figure 13. Visualization of completeness of the data in the column.

Uniqueness

Figure 14. Visualization of uniqueness of the data in the column.

Column: latitude

Table 42. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name latitude
Description

Latitude coordinates of the station in decimal degrees in WGS84 standard.

Data type Decimal number
Descriptor dwc:decimalLatitude [UID:0.0.LTTDE333]
Descriptor description

A term from the Darwin Core standard:

The geographic latitude (in decimal degrees, using the spatial reference system given in dwc:geodeticDatum) of the geographic center of a dcterms:Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.

Descriptor target IRI http://rs.tdwg.org/dwc/terms/decimalLatitude
Unit

°

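Table 43. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).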
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
latitude 2 - 6 47.5721 46.617 47.075 47.45 48.175 48.683 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 56 ( 76.7% )
Table 44. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
latitude 100.00% 76.71% 48.567 48.1

Continuous Data Distribution

Figure 15. Distribution of values in the column.

Outliers

Figure 16. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 17. Visualization of completeness of the data in the column.

Uniqueness

Figure 18. Visualization of uniqueness of the data in the column.

Column: longitude

Table 45. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name longitude
Description

Longitude coordinates of the station in decimal degrees in WGS84 standard.

Data type Decimal number
Descriptor dwc:decimalLongitude [UID:0.0.LNGTD332]
Descriptor description

A term from the Darwin Core standard:

The geographic longitude (in decimal degrees, using the spatial reference system given in dwc:geodeticDatum) of the geographic center of a dcterms:Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.

Descriptor target IRI http://rs.tdwg.org/dwc/terms/decimalLongitude
Unit

°

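Table 46. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).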
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
longitude 2 - 6 14.4541 9.617 13.558 14.744 15.7665 16.6 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 66 ( 90.4% )
Table 47. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
longitude 100.00% 90.41% 16.367 13.667
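With station coordinates given in decimal degrees (WGS84), the great-circle distance between two points can be estimated with the haversine formula. This is one possible way to decide which station is closest to a location. The report does not state how closeness was actually computed, so the sketch below is an assumption, as are the function names and the spherical Earth radius. The range check follows the legal values given in the descriptor descriptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def check_coordinate(lat, lon):
    """Reject coordinates outside the legal decimal-degree ranges."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    check_coordinate(lat1, lon1)
    check_coordinate(lat2, lon2)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude along a meridian.
one_degree_km = haversine_km(47.0, 14.0, 48.0, 14.0)
```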

Continuous Data Distribution

Figure 19. Distribution of values in the column.

Outliers

Figure 20. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 21. Visualization of completeness of the data in the column.

Uniqueness

Figure 22. Visualization of uniqueness of the data in the column.

Column: station_elevation

Table 48. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name station_elevation
Description

Meters above Sea Level

Data type Decimal number
Descriptor pms:heightAboveMeanSeaLevel [UID:0.0.HGHTB393]
Descriptor description

Height above mean sea level is a measure of the vertical distance (height, elevation or altitude) of a location in reference to a historic mean sea level taken as a vertical datum. In geodesy, it is formalized as orthometric heights.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.HGHTB393
Unit

m

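Table 49. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).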
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_elevation 3 - 6 536.72 153 306.05 486 715.7 1,209.7 73 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 100.0% )
Table 50. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_elevation 100.00% 100.00% 615.6 615.6

Continuous Data Distribution

Figure 23. Distribution of values in the column.

Outliers

Figure 24. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 25. Visualization of completeness of the data in the column.

Uniqueness

Figure 26. Visualization of uniqueness of the data in the column.

Table: user

Column: user_id

Table 51. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name user_id
Description

The user identifier

Data type Integer number
Descriptor pms:userID [UID:0.0.SERID483]
Descriptor description

A user is a person who utilizes a computer or network service. A user often has a user account and is identified to the system by a username (or user name).

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.SERID483
Unit

n/a

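Table 52. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).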
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
user_id 2 - 4 8,403.9 10 8,354 8,411 8,617 9,128 99 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 99 ( 100.0% )
Table 53. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
user_id 100.00% 100.00% 10 10
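The user table contains 99 distinct user_id values, whereas the user_id column of the hive table reports 103 distinct values. This may mean that some hives refer to users that are absent from the user table, but the report does not confirm it. A check of this kind can be sketched as follows; the function name missing_keys is hypothetical and the identifiers in the example are illustrative (9999 is invented).

```python
def missing_keys(child_keys, parent_keys):
    """Return the child keys that have no matching parent key, sorted."""
    return sorted(set(child_keys) - set(parent_keys))


# Illustrative identifiers only; 9999 is an invented orphan.
hive_user_ids = [10, 8310, 8418, 8418, 9999]
user_ids = [10, 8310, 8418]
orphans = missing_keys(hive_user_ids, user_ids)
```

An empty result would indicate that every user_id in the hive table is present in the user table.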

Continuous Data Distribution

Figure 27. Distribution of values in the column.

Outliers

Figure 28. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 29. Visualization of completeness of the data in the column.

Uniqueness

Figure 30. Visualization of uniqueness of the data in the column.

Column: samples

Table 54. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name samples
Description

Total number of samples provided by a user

Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

no.

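Table 55. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).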
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
samples 1 - 4 111.6 1 10 25 96 3,058 99 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 65 ( 65.7% )
Table 56. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
samples 100.00% 65.66% 10 3058

Continuous Data Distribution

Figure 31. Distribution of values in the column.

Outliers

Figure 32. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 33. Visualization of completeness of the data in the column.

Uniqueness

Figure 34. Visualization of uniqueness of the data in the column.

Table: Varroa sampling

Column: sampling_id

Table 57. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name sampling_id
Description

The sampling event identifier

Data type String
Descriptor dwc:materialSampleID [UID:0.0.MTRLS489]
Descriptor description

A term from the Darwin Core standard:

An identifier for the dwc:MaterialSample (as opposed to a particular digital record of the dwc:MaterialSample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the dwc:materialSampleID globally unique.

Descriptor target IRI http://rs.tdwg.org/dwc/terms/materialSampleID
Unit

n/a

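Table 58. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).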
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
sampling_id 1 - 5 5,788.6 1 2,953.25 5,828.5 8,625.75 11,427 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 11,124 ( 100.0% )
Table 59. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
sampling_id 100.00% 100.00% 1 1

Continuous Data Distribution

Figure 35. Distribution of values in the column.

Outliers

Figure 36. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 37. Visualization of completeness of the data in the column.

Uniqueness

Figure 38. Visualization of uniqueness of the data in the column.

Column: date_from

Table 60. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name date_from
Description

The first date (year, month, day) and time (hours, minutes) of the sampling event

Data type String
Descriptor Text [UID:0.0.TEXTA315]
Descriptor description

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit

n/a

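Table 61. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).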
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
date_from 11 - 14 n/a 1/1/19 13:00 n/a n/a n/a 9/9/18 15:30 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 1,903 ( 17.1% )
Table 62. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
date_from 100.00% 17.11% 4/4/20 8:00 11/17/17 15:20

Completeness

Figure 39. Visualization of completeness of the data in the column.

Uniqueness

Figure 40. Visualization of uniqueness of the data in the column.

Column: date_to

Table 63. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name date_to
Description

The final date (year, month, day) and time (hours, minutes) of the sampling event

Data type String
Descriptor Text [UID:0.0.TEXTA315]
Descriptor description

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit

n/a

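Table 64. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).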
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
date_to 11 - 14 n/a 1/12/18 16:0… n/a n/a n/a 9/9/20 18:00 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,227 ( 20.0% )
Table 65. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
date_to 100.00% 20.02% 4/4/20 11:00 4/20/17 17:20
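The date_from and date_to columns are stored as strings, so the minimum and maximum reported for them reflect text ordering rather than chronological ordering. Values such as 11/17/17 15:20 suggest a month/day/two-digit-year layout. The sketch below parses the strings under that assumption and computes the length of a sampling interval. The format string and function names are assumptions, and pairing the two example values in one record is purely illustrative.

```python
from datetime import datetime

# Assumed layout, inferred from values such as "11/17/17 15:20".
SAMPLING_FORMAT = "%m/%d/%y %H:%M"


def parse_sampling_time(text):
    """Parse a date_from or date_to string into a datetime object."""
    return datetime.strptime(text, SAMPLING_FORMAT)


def sampling_hours(date_from, date_to):
    """Length of the sampling interval in hours."""
    delta = parse_sampling_time(date_to) - parse_sampling_time(date_from)
    return delta.total_seconds() / 3600


start = parse_sampling_time("4/4/20 8:00")
# Illustrative pairing of two values; not taken from a single record.
hours = sampling_hours("4/4/20 8:00", "4/4/20 11:00")
```

After parsing, 9/9/18 15:30 sorts before 1/1/19 13:00, although it sorts after it as text.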

Completeness

Figure 41. Visualization of completeness of the data in the column.

Uniqueness

Figure 42. Visualization of uniqueness of the data in the column.

Column: varroa_count

Table 66. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name varroa_count
Description

The number of varroa mites found in the sampling event

Data type Decimal number
Descriptor pms:naturalVarroaMiteFall [UID:0.0.NMBRF371]
Descriptor description

The quantity infestation rate of adult honey bee colonies with Varroa mites (Varroa destructor), measured as natural mite fall on a sticky board placed under the brood nest of a honey bee colony, expressed in number of Varroa mites per day.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NMBRF371
Unit

mites d⁻¹

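Table 67. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).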
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
varroa_count 1 - 4 28.3 0 1 4 16 5,016 11,124 0 ( 0.0% ) 2,301 ( 20.7% ) 0 ( 0.0% ) 387 ( 3.5% )
Table 68. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
varroa_count 100.00% 3.48% 0 700

Continuous Data Distribution

Figure 43. Distribution of values in the column.

Outliers

Figure 44. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 45. Visualization of completeness of the data in the column.

Uniqueness

Figure 46. Visualization of uniqueness of the data in the column.

Column: quality_control

Table 69. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name quality_control
Description

The quality level of the sample collected

  • 2 = examined with the BeeVS diagnostic system
  • 1 = examined manually by a trained group
  • 0 = examined manually by untrained individuals
Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

n/a

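Table 70. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).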
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
quality_control 1 - 1 0.9 0 0 1 2 2 11,124 0 ( 0.0% ) 3,824 ( 34.4% ) 0 ( 0.0% ) 3 ( 0.0% )
Table 71. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
quality_control 100.00% 0.03% 1 2
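The three quality_control codes can be given readable names and used to select samples. The sketch below is illustrative only. It assumes that a higher code means a more reliable examination, which the report implies ("quality level") but does not state explicitly. The enumeration member names, the function at_least and the sample rows are invented.

```python
from enum import IntEnum


class QualityControl(IntEnum):
    """Codes of the quality_control column as described in this report."""
    UNTRAINED_MANUAL = 0  # examined manually by untrained individuals
    TRAINED_MANUAL = 1    # examined manually by a trained group
    BEEVS = 2             # examined with the BeeVS diagnostic system


def at_least(rows, minimum):
    """Keep the rows whose quality_control code is at least `minimum`."""
    return [
        row for row in rows
        if QualityControl(int(row["quality_control"])) >= minimum
    ]


# Invented sample rows using the column names of the Varroa sampling table.
rows = [
    {"sampling_id": "1", "quality_control": "0"},
    {"sampling_id": "2", "quality_control": "1"},
    {"sampling_id": "3", "quality_control": "2"},
]
trusted = at_least(rows, QualityControl.TRAINED_MANUAL)
```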

Data Distribution Top 20

Figure 47. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 48. Distribution of values in the column.

Outliers

Figure 49. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 50. Visualization of completeness of the data in the column.

Uniqueness

Figure 51. Visualization of uniqueness of the data in the column.

Column: hive_id

Table 72. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name hive_id
Description

The hive identifier

Data type String
Descriptor pms:beehiveID [UID:0.0.HVEID216]
Descriptor description

Unique sequence of characters associated with a beehive, which is specific to a dataset, to an apiary or to a beekeeper.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.HVEID216
Unit

n/a

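Table 73. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).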
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
hive_id 1 - 4 820.6 1 289 715 1,199 2,501 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 2,116 ( 19.0% )
Table 74. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
hive_id 100.00% 19.02% 943 431

Continuous Data Distribution

Figure 52. Distribution of values in the column.

Outliers

Figure 53. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 54. Visualization of completeness of the data in the column.

Uniqueness

Figure 55. Visualization of uniqueness of the data in the column.

Column: yard_id

Table 75. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name yard_id
Description

The yard identifier

Data type String
Descriptor pms:apiaryID [UID:0.0.PRYID342]
Descriptor description

Unique sequence of characters associated with an apiary, which is specific to a dataset or to a beekeeper.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.PRYID342
Unit

n/a

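Table 76. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).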
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
yard_id 2 - 3 370.5 73 194 391 522 664 11,124 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 242 ( 2.2% )
Table 77. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
yard_id 100.00% 2.18% 87 404

Continuous Data Distribution

Figure 56. Distribution of values in the column.

Outliers

Figure 57. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 58. Visualization of completeness of the data in the column.

Uniqueness

Figure 59. Visualization of uniqueness of the data in the column.

Table: weather

Column: station_id

Table 78. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name station_id
Description
Data type Integer number
Descriptor pms:recordID [UID:0.0.RCRDD344]
Descriptor description

Unique sequence of integers associated with a record within a certain table.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.RCRDD344
Unit

n/a

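Table 79. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.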
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_id 6 - 6 111,896.2 110,010 110,600 112,130 112,960 113,900 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 72 ( 0.0% )
Table 80. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_id 100.00% 0.01% 112400 112280

Data Distribution Top 20

Figure 60. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 61. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 62. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 63. Visualization of completeness of the data in the column.

Uniqueness

Figure 64. Visualization of uniqueness of the data in the column.

Column: date

Table 81. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name date
Description
Data type Date
Descriptor iso-8601:calendarDate [UID:0.0.DATEA317]
Descriptor description

particular calendar day [...] represented by its calendar year [...], its calendar month [...] and its calendar day of month [...]

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DATEA317
Unit

n/a

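Table 82. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.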
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
date 10 - 10 2018.0 2012-01-01 2017 2018 2019 2020-12-07 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 3,244 ( 0.2% )
Table 83. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
date 100.00% 0.24% 2017-12-01 2015-04-08

Outliers

Figure 65. Visualization of median, min, max, and outliers in the column.
No data available.

Completeness

Figure 66. Visualization of completeness of the data in the column.

Uniqueness

Figure 67. Visualization of uniqueness of the data in the column.

Column: hour

Table 84. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name hour
Description
Data type String
Descriptor iso-8601:clock hour [UID:0.0.HRFDY386]
Descriptor description

time scale unit [...] whose duration [...] is one hour [...] Clock hour is in common parlance often referred to as hour, however in this document clock hour and hour have different definitions.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.HRFDY386
Unit

n/a

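Table 85. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.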
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
hour 14 - 14 n/a 00:00:00+00:… n/a n/a n/a 23:00:00+00:… 1,325,815 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 24 ( 0.0% )
Table 86. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
hour 100.00% 0.00% 06:00:00+00:00 02:00:00+00:00
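
The weather table stores the calendar day in the date column and the clock hour, including its UTC offset, as a string in the separate hour column. A minimal sketch of joining the two into one timezone-aware timestamp follows. It assumes the values are formatted as shown in the tables above (YYYY-MM-DD and HH:MM:SS+00:00); the helper name is illustrative.

```python
from datetime import datetime


def to_timestamp(date_str: str, hour_str: str) -> datetime:
    """Join an ISO 8601 calendar date and a clock-hour string with UTC offset."""
    return datetime.fromisoformat(f"{date_str}T{hour_str}")


ts = to_timestamp("2017-12-01", "06:00:00+00:00")
print(ts.isoformat())  # 2017-12-01T06:00:00+00:00
```

A single timestamp column makes it easier to sort the hourly records or to align them with sampling dates in the other tables.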

Data Distribution Top 20

Figure 68. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 69. Distribution of 20 least common values, from lowest to highest.

Completeness

Figure 70. Visualization of completeness of the data in the column.

Uniqueness

Figure 71. Visualization of uniqueness of the data in the column.

Column: air_temp

Table 87. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name air_temp
Description
Data type Decimal number
Descriptor DecimalNumber [UID:0.0.DCMLN314]
Descriptor description

Any of the rational or irrational numbers.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit

n/a

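Table 88. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.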
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
air_temp 1 - 5 10.15 -23 3.4 10 16.6 38.6 1,325,815 3,907 ( 0.3% ) 8,733 ( 0.7% ) 0 ( 0.0% ) 593 ( 0.0% )
Table 89. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
air_temp 99.71% 0.04% 14 37.5

Data Distribution Top 20

Figure 72. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 73. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 74. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 75. Visualization of completeness of the data in the column.

Uniqueness

Figure 76. Visualization of uniqueness of the data in the column.

Column: dew_point

Table 90. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name dew_point
Description
Data type Decimal number
Descriptor DecimalNumber [UID:0.0.DCMLN314]
Descriptor description

Any of the rational or irrational numbers.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit

n/a

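Table 91. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.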
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
dew_point 1 - 5 5.28 -31 0.2 5.4 11.2 27.5 1,325,815 4,974 ( 0.4% ) 8,678 ( 0.7% ) 0 ( 0.0% ) 503 ( 0.0% )
Table 92. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
dew_point 99.62% 0.04% 0 25.9

Data Distribution Top 20

Figure 77. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 78. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 79. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 80. Visualization of completeness of the data in the column.

Uniqueness

Figure 81. Visualization of uniqueness of the data in the column.

Column: pressure

Table 93. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name pressure
Description
Data type Decimal number
Descriptor pms:atmosphericPressure [UID:0.0.TMSPH396]
Descriptor description

Atmospheric pressure, also known as air pressure or barometric pressure (after the barometer), is the pressure within the atmosphere of Earth.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TMSPH396
Unit
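Table 94. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.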
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
pressure 3 - 6 1,017.71 943.9 1,012.6 1,017.5 1,022.8 1,050.9 1,325,815 228,763 ( 17.3% ) 0 ( 0.0% ) 0 ( 0.0% ) 714 ( 0.1% )
Table 95. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
pressure 82.75% 0.05% null 976.8

Data Distribution Top 20

Figure 82. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 83. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 84. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 85. Visualization of completeness of the data in the column.

Uniqueness

Figure 86. Visualization of uniqueness of the data in the column.

Column: wind_dir

Table 96. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name wind_dir
Description
Data type Integer number
Descriptor pms:windDirection [UID:0.0.WNDDR475]
Descriptor description

The true direction from which the wind is blowing at a given location (i.e., wind blowing from the north to the south is a north wind). It is normally measured in tens of degrees from 10 degrees clockwise through 360 degrees. North is 360 degrees. A wind direction of 0 degrees is only used when wind is calm.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.WNDDR475
Unit

°
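
The descriptor above fixes two conventions that matter when processing wind_dir: 0 is reserved for calm conditions, and north is recorded as 360. A minimal sketch of mapping a reading to a compass sector while respecting both conventions is shown here. The function name and the choice of eight sectors are illustrative assumptions, not part of the dataset.

```python
from typing import Optional

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def wind_sector(wind_dir: Optional[int]) -> Optional[str]:
    """Map a wind direction in degrees to an 8-point compass sector.

    Follows the descriptor convention: 0 means calm, 360 means north.
    """
    if wind_dir is None:
        return None        # missing observation
    if wind_dir == 0:
        return "calm"      # 0 is reserved for calm conditions
    if not 1 <= wind_dir <= 360:
        raise ValueError(f"wind direction out of range: {wind_dir}")
    # Each sector spans 45 degrees, centred on its compass bearing.
    index = int((wind_dir % 360 + 22.5) // 45) % 8
    return SECTORS[index]


for value in (360, 220, 90, 0, None):
    print(value, wind_sector(value))
```

Treating 0 as a separate category keeps calm records from being counted as north winds, which matters here because zero values make up a noticeable share of the column.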

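Table 97. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.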
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
wind_dir 1 - 3 194.9 0 90 220 290 360 1,325,815 53,660 ( 4.0% ) 104,854 ( 7.9% ) 0 ( 0.0% ) 38 ( 0.0% )
Table 98. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
wind_dir 95.95% 0.00% 360 10

Data Distribution Top 20

Figure 87. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 88. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 89. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 90. Visualization of completeness of the data in the column.

Uniqueness

Figure 91. Visualization of uniqueness of the data in the column.

Column: wind_spd

Table 99. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name wind_spd
Description
Data type Decimal number
Descriptor pms:windSpeed [UID:0.0.WNDSP474]
Descriptor description

The rate at which air is moving horizontally past a given point. It may be a 2-minute average speed (reported as wind speed) or an instantaneous speed (reported as a peak wind speed, wind gust, or squall).

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.WNDSP474
Unit

m s-1

Table 100. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
wind_spd 1 - 4 2.24 0 1 2 3 28 1,325,815 102,751 ( 7.8% ) 61,122 ( 4.6% ) 0 ( 0.0% ) 60 ( 0.0% )
Table 101. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
wind_spd 92.25% 0.00% 1 24

Data Distribution Top 20

Figure 92. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 93. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 94. Visualization of median, min, max, and outliers in the column.
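
The report does not state which rule is used to mark outliers in these figures. A common box-plot convention flags values that lie more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that assumed rule to the quartiles reported for wind_spd (Q1 = 1, Q3 = 3); the function name is illustrative.

```python
from typing import Tuple


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) fences of the k * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = tukey_fences(1, 3)
print(low, high)  # -2.0 6.0
print(28 > high)  # True: the reported maximum lies above the upper fence
```

Under this assumption, any wind speed above 6 m s-1 would be drawn as an outlier, including the reported maximum of 28.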

Completeness

Figure 95. Visualization of completeness of the data in the column.

Uniqueness

Figure 96. Visualization of uniqueness of the data in the column.

Column: sky_cond

Table 102. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name sky_cond
Description
Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

n/a

Table 103. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
sky_cond 1 - 1 5.0 0 2 6 8 9 1,325,815 1,175,710 ( 88.7% ) 18,374 ( 1.4% ) 0 ( 0.0% ) 11 ( 0.0% )
Table 104. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
sky_cond 11.32% 0.00% 8 9

Data Distribution Top 20

Figure 97. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 98. Distribution of values in the column.

Outliers

Figure 99. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 100. Visualization of completeness of the data in the column.

Uniqueness

Figure 101. Visualization of uniqueness of the data in the column.

Column: precip_1hr

Table 105. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name precip_1hr
Description
Data type Decimal number
Descriptor DecimalNumber [UID:0.0.DCMLN314]
Descriptor description

Any of the rational or irrational numbers.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit

n/a

Table 106. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
precip_1hr 1 - 4 0.23 -1 -1 0.1 0.5 69 1,325,815 1,148,579 ( 86.6% ) 34,978 ( 2.6% ) 0 ( 0.0% ) 65 ( 0.0% )
Table 107. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
precip_1hr 13.37% 0.00% null 69
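
The statistics for precip_1hr report a minimum and a first quartile of -1. A precipitation amount cannot be negative, so this value is presumably a code rather than a measurement. The report does not document it; some hourly weather products use a negative value to flag trace precipitation, but whether that applies here is an assumption. The sketch below only shows how such a flag could be separated from measured amounts before computing statistics; TRACE_CODE and split_precip are hypothetical names.

```python
from typing import Optional, Tuple

TRACE_CODE = -1.0  # assumption: -1 is a flag, not a measured amount


def split_precip(value: Optional[float]) -> Tuple[Optional[float], bool]:
    """Return (amount, is_flagged) for one precipitation reading.

    Missing readings stay None; the assumed flag value is reported
    separately instead of being treated as a negative amount.
    """
    if value is None:
        return None, False
    if value == TRACE_CODE:
        return None, True
    if value < 0:
        raise ValueError(f"unexpected negative precipitation: {value}")
    return value, False


readings = [None, -1.0, 0.0, 0.5, 69.0]
print([split_precip(r) for r in readings])
```

Keeping the flag apart from the amounts avoids pulling the mean and quartiles downward, which is what happens when -1 is averaged together with real readings.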

Data Distribution Top 20

Figure 102. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 103. Distribution of 20 least common values, from lowest to highest.

Outliers

Figure 104. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 105. Visualization of completeness of the data in the column.

Uniqueness

Figure 106. Visualization of uniqueness of the data in the column.

Column: precip_6hr

Table 108. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name precip_6hr
Description
Data type Decimal number
Descriptor DecimalNumber [UID:0.0.DCMLN314]
Descriptor description

Any of the rational or irrational numbers.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.DCMLN314
Unit

n/a

Table 109. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
precip_6hr 1 - 4 0.94 -1 0 0 0.3 99 1,325,815 1,250,100 ( 94.3% ) 47,825 ( 3.6% ) 0 ( 0.0% ) 91 ( 0.0% )
Table 110. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
precip_6hr 5.71% 0.01% null 49

Data Distribution Top 20

Figure 107. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 108. Distribution of 20 least common values, from lowest to highest.

Completeness

Figure 109. Visualization of completeness of the data in the column.

Uniqueness

Figure 110. Visualization of uniqueness of the data in the column.

Table: yard

Column: yard_id

Table 111. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name yard_id
Description

The yard identifier

Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

n/a

Table 112. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
yard_id 2 - 3 446.0 73 370.75 453.5 586.25 664 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 242 ( 100.0% )
Table 113. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
yard_id 100.00% 100.00% 73 73

Continuous Data Distribution

Figure 111. Distribution of values in the column.

Outliers

Figure 112. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 113. Visualization of completeness of the data in the column.

Uniqueness

Figure 114. Visualization of uniqueness of the data in the column.

Column: elevation

Table 114. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name elevation
Description

Meters above sea level, rounded to the nearest meter

Data type Integer number
Descriptor pms:heightAboveMeanSeaLevel [UID:0.0.HGHTB393]
Descriptor description

Height above mean sea level is a measure of the vertical distance (height, elevation or altitude) of a location in reference to a historic mean sea level taken as a vertical datum. In geodesy, it is formalized as orthometric heights.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.HGHTB393
Unit

m

Table 115. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
elevation 3 - 4 510.0 150 324 450 637.75 1,413 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 164 ( 67.8% )
Table 116. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
elevation 100.00% 67.77% 450 194

Continuous Data Distribution

Figure 115. Distribution of values in the column.

Outliers

Figure 116. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 117. Visualization of completeness of the data in the column.

Uniqueness

Figure 118. Visualization of uniqueness of the data in the column.

Column: nuts

Table 117. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name nuts
Description

NUTS is a geocode standard for referencing the administrative divisions of countries for statistical purposes.

  • AT1 - East Austria; Burgenland (AT11), Lower Austria (AT12), Vienna (AT13)
  • AT2 - South Austria; Carinthia (AT21), Styria (AT22)
  • AT3 - West Austria; Upper Austria (AT31), Salzburg (AT32), Tyrol (AT33), Vorarlberg (AT34)

The current Nomenclature of Territorial Units for Statistics (NUTS) adopted by the European Union (Commission Delegated Regulation 2019/1755) is applied.

Data type String
Descriptor eurostat:nuts2021Code [UID:0.0.NTSCD55]
Descriptor description

A NUTS code defined in the NUTS classification 2021, valid from 2021-01-01 to 2023-12-31, containing 92 regions at NUTS level 1, 244 regions at NUTS level 2 and 1165 regions at NUTS level 3.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTSCD55
Unit

n/a
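
NUTS codes are hierarchical: each level appends one character to the code of the level above it, so the higher-level regions listed in the description can be derived from a five-character code by truncation. A minimal sketch, assuming every value in the column is a five-character NUTS level 3 code (as the length range of 5 - 5 suggests); the function and dictionary names are illustrative.

```python
# NUTS level 1 names as given in the column description.
NUTS1_NAMES = {
    "AT1": "East Austria",
    "AT2": "South Austria",
    "AT3": "West Austria",
}


def nuts_levels(code: str) -> dict:
    """Split a five-character NUTS 3 code into its parent levels."""
    if len(code) != 5 or not code[:2].isalpha():
        raise ValueError(f"not a five-character NUTS code: {code!r}")
    return {
        "country": code[:2],
        "nuts1": code[:3],
        "nuts2": code[:4],
        "nuts3": code,
    }


levels = nuts_levels("AT221")
print(levels)
print(NUTS1_NAMES[levels["nuts1"]])  # South Austria
```

Grouping yards by the derived nuts1 or nuts2 value gives regional summaries without needing a separate lookup table.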

Table 118. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
nuts 5 - 5 n/a AT111 n/a n/a n/a AT342 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 32 ( 13.2% )
Table 119. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
nuts 100.00% 13.22% AT221 AT314

Completeness

Figure 119. Visualization of completeness of the data in the column.

Uniqueness

Figure 120. Visualization of uniqueness of the data in the column.

Column: station_id

Table 120. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).
Parameter Content
Column name station_id
Description

The NOAA weather station identifier

Data type Integer number
Descriptor Integer [UID:0.0.NTGER313]
Descriptor description

A number with no fractional part, including the negative and positive numbers as well as zero.

Descriptor target IRI https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit

n/a

Table 121. Statistical analysis of the table. Column name (Column Name); range of value lengths in characters (Range); mean (Mean); minimum (Minimum); first quartile (Q1); median (Median); third quartile (Q3); maximum (Maximum); total number of values (Total); count and share of missing (Missing), zero (Zero), blank (Blank), and distinct (Distinct) values.
Column Name Range Mean Minimum Q1 Median Q3 Maximum Total Missing Zero Blank Distinct
station_id 6 - 6 111,866.0 110,010 110,600 111,750 112,960 113,900 242 0 ( 0.0% ) 0 ( 0.0% ) 0 ( 0.0% ) 73 ( 30.2% )
Table 122. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).
Column Name Completeness Uniqueness Most Common Value Least Common Value
station_id 100.00% 30.17% 111750 112440
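
The yard table references 73 distinct station_id values, while the weather table contains 72, so at least one referenced station has no weather records. A minimal sketch of how such stations could be listed before joining the two tables follows. The sample rows are made up for illustration and do not indicate which station actually lacks data; the function name is hypothetical.

```python
from typing import Dict, Iterable, Set


def stations_without_weather(
    yard_rows: Iterable[Dict[str, int]],
    weather_rows: Iterable[Dict[str, int]],
) -> Set[int]:
    """Return station_id values referenced by yards but absent from weather."""
    referenced = {row["station_id"] for row in yard_rows}
    observed = {row["station_id"] for row in weather_rows}
    return referenced - observed


# Illustrative sample rows only, not taken from the dataset files.
yards = [
    {"yard_id": 73, "station_id": 110010},
    {"yard_id": 74, "station_id": 111750},
    {"yard_id": 75, "station_id": 113900},
]
weather = [
    {"station_id": 110010},
    {"station_id": 111750},
]

print(stations_without_weather(yards, weather))  # {113900}
```

Running the same check on the full tables shows which yards would lose their weather context in an inner join on station_id.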

Continuous Data Distribution

Figure 121. Distribution of values in the column.

Outliers

Figure 122. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 123. Visualization of completeness of the data in the column.

Uniqueness

Figure 124. Visualization of uniqueness of the data in the column.