
Dataset Report

Unique identifier:	PSTCD4.0.0
Title:	Postcode
Long title:	Postcodes from various countries
Status:	Quality Validated
Current Version:	v. 1.0
Published:	2023-09-13
Reviewed by:
Citation proposal:	EU Pollinator Hub 2023 Report of dataset Postcode, v. 1.0 [PSTCD4.0.0]. EU Pollinator Hub. [2026-05-16] app.pollinatorhub.eu

Compliance with FAIR* principles
Findable	Accessible	Interoperable	Reusable
See https://www.go-fair.org/fair-principles for more information about FAIR principles

Data Quality

Under evaluation

Table of contents

Document history
1. Release
2. Revision
Abbreviations
Executive summary
Introduction
Material and methods
Data description
1. Dataset
2. Tables
  1. Worldwide postcodes
References
Annex 1: Table column reports

Document history

Release

Version v. 1.0 released on 2023-09-13.

Revision

Table 1. List of revisions made to the document. Identifier of revision (No); date of revision (Date); description of revision (Description); reason for revision (Reason).

No	Date	Description	Reason
1	2023-09-13 00:09:00	Initial release.	n/a

Abbreviations

No abbreviations.

Executive summary

Data overview:

Data value:

Data was collected to be used internally on the EU Pollinator Hub (EUPH).

Data description:

The dataset contains 1 table with a total of 1.534.012 records (171.640.227 bytes).

Data application:

Data will be used exclusively for backend administration of the EU Pollinator Hub (EUPH), in particular for standardisation of data and for interactions with users.

Unresolved issues:

n/a

Introduction

Data was obtained from GeoNames (http://www.geonames.org/). It contains an incomplete collection of postcodes from 97 countries worldwide. It is licensed under a Creative Commons Attribution 4.0 license. It will be used exclusively for backend administration of the EU Pollinator Hub (EUPH), in particular for standardisation of data and for interactions with users.

Material and methods

Data acquisition

Data was obtained from the GeoNames geographical database (www.geonames.org), a project founded by Marc Wick and maintained by Unxos GmbH, Switzerland. The data is licensed under a Creative Commons Attribution 4.0 License. It can be used if credit is given to GeoNames (at least by a link to www.geonames.org). The data is provided without warranty or any representation of accuracy, timeliness or completeness.

Postcodes were obtained from the file allCountries.zip. Postcodes from Canada (CA), Great Britain (GB) and the Netherlands (NL) were substituted with the postcodes obtained from the files CA_full.csv.zip, GB_full.csv.zip and NL_full.csv.zip, respectively. At the time of data acquisition (2023-09-13), 97 countries were supported. For many countries, latitude and longitude are determined with an algorithm that searches the place names in the main GeoNames database, using administrative divisions and numerical vicinity of the postal codes as factors in the disambiguation of place names. For postal codes and place names for which no corresponding toponym could be found in the main GeoNames database, an average latitude and longitude of 'neighbouring' postal codes is calculated. For copyright reasons, only the first digits of the full postal codes are provided for Chile, and only the first letters for Ireland and Malta. For Argentina, only the first 5 positions of the postal code are available, and for Brazil only major postal codes (the codes ending with -000 and the major code per municipality).
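The substitution step described above can be illustrated with a minimal sketch. It is not the procedure actually used for this dataset: the function name `substitute_countries`, the in-memory row format and the sample postcodes are assumptions made for illustration only. The sketch drops every row of the base file whose country has a dedicated "full" file and then appends the rows of that full file.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def substitute_countries(base: Iterable[Row], full: Dict[str, List[Row]]) -> List[Row]:
    """Replace the base rows of selected countries with rows from 'full' files.

    `full` maps a country code (e.g. "CA", "GB", "NL") to the rows read from
    the corresponding full file. This is a hypothetical helper, not part of
    any GeoNames or EUPH tooling.
    """
    # Keep only base rows for countries without a full file.
    merged = [row for row in base if row["countrycode"] not in full]
    # Append the full rows, country by country, in a stable order.
    for code in sorted(full):
        merged.extend(full[code])
    return merged


# Made-up sample rows; the postcode values are placeholders.
base_rows = [
    {"countrycode": "PT", "postalcode": "0000-001"},
    {"countrycode": "NL", "postalcode": "0001"},
]
full_rows = {
    "NL": [
        {"countrycode": "NL", "postalcode": "0001 AA"},
        {"countrycode": "NL", "postalcode": "0001 AB"},
    ]
}
merged_rows = substitute_countries(base_rows, full_rows)
print([row["postalcode"] for row in merged_rows])
```

Filtering by country code before appending avoids mixing the shortened and the full postcodes of the same country in one table.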

Table 2. List of raw data and metadata files included in the dataset. Identifier of table row (No); name of the file (File); the type of the file (Type); file contains data (D); file contains metadata (M); date of upload of the file to the EU Pollinator Hub (Arrival); number of data points contained within the file, if applicable (Data points); uploaded file size (File size).

No	File	Type	D	M	Arrival	Data points	File size
1	allcountries.csv	CSV - Comma separated values	Yes	No	2025-09-30 12:09:03	18,609,576	163.69 MiB

Data preparation

Zipped files containing the raw data were unpacked. The unpacked raw data files contained tab-separated values and were converted to csv format according to WI-002 (Raw data preparation) of SOP-006 (Dataset preparation) using the script ConvertTsv2Csv.py, executed in the IDE PyCharm (Version 2023.1.2 Community Edition, JetBrains s.r.o.). Data contained in the raw data files allCountries.zip, CA_full.csv.zip and NL_full.csv.zip were imported for profiling into an SQL database (MariaDB Foundation, server version 10.4.24) running in an XAMPP environment (BitRock, version 3.3.0). Data types of the tables were configured using the information contained in the metadata file readme.txt.
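The script ConvertTsv2Csv.py is not reproduced in this report. The following is only a minimal sketch of what a tab-separated to comma-separated conversion could look like, using the Python standard library. The function name `tsv_to_csv` and the `null_token` parameter are assumptions. The replacement of empty fields with NULL follows the description given under "Changes made to preparatory file".

```python
import csv
import io


def tsv_to_csv(tsv_text: str, null_token: str = "NULL") -> str:
    """Convert tab-separated text to CSV text, writing empty fields as NULL.

    Hypothetical helper for illustration; not the ConvertTsv2Csv.py script.
    """
    # QUOTE_NONE: treat quote characters in the raw data as ordinary text.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    # The default writer quotes a field only when it contains a comma,
    # a quote character or a line break.
    writer = csv.writer(out, lineterminator="\n")
    for fields in reader:
        writer.writerow([field if field.strip() else null_token for field in fields])
    return out.getvalue()


# One sample line with two empty trailing fields.
print(tsv_to_csv("AD\tAD100\tCanillo\t\t\n"), end="")
# prints: AD,AD100,Canillo,NULL,NULL
```

Reading with quoting disabled and writing with minimal quoting keeps place names that contain commas intact in the csv output.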

Data validation

n/a

Data analysis

n/a

Data description

Dataset

Table 3. Summary of tables belonging to the dataset. Table row identifier (No); name of the table (Table); description of the table (Description).

No	Table	Description
1	Worldwide postcodes	The table contains 1.534.012 records (171.640.227 bytes) with postcodes from 97 distinct countries (PT, IN, JP, MX, SG, PE, PL,…

Table 4. Standardised metadata of the dataset. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
interactions.single.uid	PSTCD4.0.0
Title	Postcode
Long title	Postcodes from various countries
Target IRI	https://app.pollinatorhub.eu/dataset-discovery/PSTCD4.0.0
interactions.single.section-details.licence	CC BY 4.0
DOI	n/a
Created	2023-01-26
Published	2023-09-13
Contact	n/a
Keywords	n/a
Data collection years	2023
Regions, the data was collected in	Algeria, American Samoa, Andorra, Belarus, Bulgaria, Canada, Chile, Colombia, Costa Rica, Croatia, Cyprus, Czechia, Denmark, Ecuador, Estonia, Faroe Islands (the), Finland, France
Abstract	The dataset contains postcodes from 97 countries for internal use on the EUPH.

Table 5. Standardised metadata of the data provider EU Pollinator Hub. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Name	EU Pollinator Hub
Url
Acronym	EUPH
IRI	https://app.pollinatorhub.eu/data-providers/euph
Address
Country	Belgium
Contact	https://www.linkedin.com/company/beelife-european-beekeeping-coordination/ pollinatorhub.eu
Description	The EU Pollinator Hub (EUPH) is a data hub related to pollinators, which is provided by the European Food Safety Authority (EFSA).

Tables

Worldwide postcodes

Table 6. Standardised metadata of the table. Reported parameter (Parameter); content of the parameter (Content).

Parameter	Content
Unique identifier	PSTCD4.WRLDW72.0
Name	Worldwide postcodes
Target IRI	https://app.pollinatorhub.eu/dataset-discovery/parts/PSTCD4.WRLDW72.0
Table Type	File
Licence	CC BY 4.0
Description	The table contains 1.534.012 records (171.640.227 bytes) with postcodes from 97 distinct countries (PT, IN, JP, MX, SG, PE, PL, FR, RU, US, RO, ES, TR, KR, UA, GB, LT, AR, AT, IT, SE, AU, DE, DZ, CZ, HR, LV, BR, EE, BG, NO, LU, CH, SK, NL, ZA, ZA, CO, FI, HU, BY, MY, BE, PK, PH, UY, LK, MD, NZ, CA, BD, MA, EC, AZ, DK, RS, CY, TH, SI, GT, DO, MW, CR, CL, HT, MK, PR, RE, IS, FO, BM, GP, MQ, IM, GF, MT, NC, AX, GL, MC, SM, YT, GU, VI, GG, LI, SJ, AD, JE, FM, MP, WF, MH, PM, PW, AS, VA).

Metadata

• Column countrycode links to countries.iso3166_1_2020.alpha2code

Table 7. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Column Name	Column Description	Datatype	Descriptor	Unit
countrycode	A two-letter code that represents the country name, recommended by ISO standard 3166-1:2020.	String	iso-639:alpha-2LanguageCode [0.0.LPHLN110]	n/a
postalcode	Postal code.	String	eurostat:postcode [0.0.PSTCD378]	n/a
placename	Name of the location.	String	Text [0.0.TEXTA315]	n/a
adminname1	Name of the first-order administrative subdivision (state).	String	Text [0.0.TEXTA315]	n/a
admincode1	Code of the first-order administrative subdivision (state).	String	Text [0.0.TEXTA315]	n/a
adminname2	Name of the second-order administrative subdivision (county/province).	String	Text [0.0.TEXTA315]	n/a
admincode2	Code of the second-order administrative subdivision (county/province).	String	Text [0.0.TEXTA315]	n/a
adminname3	Name of the third-order administrative subdivision (community).	String	Text [0.0.TEXTA315]	n/a
admincode3	Code of the third-order administrative subdivision (community).	String	Text [0.0.TEXTA315]	n/a
latitude	estimated latitude (wgs84).	Decimal number	dwc:decimalLatitude [0.0.LTTDE333]	°
longitude	estimated longitude (wgs84).	Decimal number	dwc:decimalLongitude [0.0.LNGTD332]	°
accuracy	accuracy of lat/lng from 1=estimated, 4=geonameid, 6=centroid of addresses or shape.	Integer number	Integer [0.0.NTGER313]	n/a
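As an illustration of the structure in Table 7, the sketch below maps one csv row onto a typed record. It is an assumption-laden example, not EUPH code: the class `PostcodeRecord`, the function `parse_record` and the `null_token` parameter are hypothetical, and the postcode in the sample row is a placeholder. The coordinate bounds are those stated in the Darwin Core descriptor descriptions in Annex 1.

```python
from dataclasses import dataclass
from typing import List, Optional

COLUMNS = [
    "countrycode", "postalcode", "placename",
    "adminname1", "admincode1", "adminname2", "admincode2",
    "adminname3", "admincode3", "latitude", "longitude", "accuracy",
]


@dataclass(frozen=True)
class PostcodeRecord:
    countrycode: str
    postalcode: str
    placename: Optional[str]
    adminname1: Optional[str]
    admincode1: Optional[str]
    adminname2: Optional[str]
    admincode2: Optional[str]
    adminname3: Optional[str]
    admincode3: Optional[str]
    latitude: float            # decimal degrees
    longitude: float           # decimal degrees
    accuracy: Optional[int]    # 1=estimated, 4=geonameid, 6=centroid


def parse_record(fields: List[str], null_token: str = "NULL") -> PostcodeRecord:
    """Build a typed record from one row, treating the NULL token as missing."""
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    raw = {name: (None if value == null_token else value)
           for name, value in zip(COLUMNS, fields)}
    latitude = float(raw["latitude"])
    longitude = float(raw["longitude"])
    if not -90.0 <= latitude <= 90.0 or not -180.0 <= longitude <= 180.0:
        raise ValueError("coordinates out of range")
    accuracy = None if raw["accuracy"] is None else int(raw["accuracy"])
    return PostcodeRecord(**{**raw, "latitude": latitude,
                             "longitude": longitude, "accuracy": accuracy})


# Sample row; the postcode "0000-000" is a placeholder value.
record = parse_record(["PT", "0000-000", "Lisboa", "NULL", "NULL", "NULL",
                       "NULL", "NULL", "NULL", "38.716700", "-9.133300", "NULL"])
print(record.placename, record.latitude, record.accuracy)
```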

Metadata of individual tables can be found in Annex 1.

Descriptive measures

Table 8. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
countrycode	2 - 2	n/a	AD	n/a	n/a	n/a	ZA	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	97 ( 0.0% )
postalcode	2 - 15	n/a	M9	n/a	n/a	n/a	78177 CITYSS…	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	721,060 ( 46.5% )
placename	0 - 161	n/a		n/a	n/a	n/a	Hamilton (So…	1,550,798	1 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	792,066 ( 51.1% )
adminname1	0 - 48	n/a		n/a	n/a	n/a	Región del L…	1,550,798	131,031 ( 8.4% )	0 ( 0.0% )	0 ( 0.0% )	1,560 ( 0.1% )
admincode1	0 - 9	n/a		n/a	n/a	n/a	L93000001	1,550,798	136,574 ( 8.8% )	0 ( 0.0% )	0 ( 0.0% )	465 ( 0.0% )
adminname2	0 - 49	n/a		n/a	n/a	n/a	Dolores Hida…	1,550,798	256,764 ( 16.6% )	0 ( 0.0% )	0 ( 0.0% )	15,043 ( 1.0% )
admincode2	0 - 9	n/a		n/a	n/a	n/a	S12000017	1,550,798	331,793 ( 21.4% )	0 ( 0.0% )	0 ( 0.0% )	12,201 ( 0.8% )
adminname3	0 - 51	n/a		n/a	n/a	n/a	San Leonardo…	1,550,798	744,704 ( 48.0% )	0 ( 0.0% )	0 ( 0.0% )	40,539 ( 2.6% )
admincode3	0 - 9	n/a		n/a	n/a	n/a	W06000011	1,550,798	1,121,361 ( 72.3% )	0 ( 0.0% )	0 ( 0.0% )	24,021 ( 1.5% )
latitude	8 - 10	30.0224925	-89.997600	19.6338	37.19295	45.0161	90.000000	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	372,372 ( 24.0% )
longitude	8 - 11	21.6569758	-179.260000	-8.8368	16.6563	81.315625	179.310000	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	509,297 ( 32.8% )
accuracy	1 - 1	3.7	1	3	4	4	6	1,550,798	274,588 ( 17.7% )	0 ( 0.0% )	0 ( 0.0% )	7 ( 0.0% )
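The report does not state how the measures in Table 8 were computed. The sketch below shows one plausible way to derive the length range and the counts of missing, blank and distinct values for a string column. The function name `profile_column` is hypothetical. Counting NULL as length 0 is an assumption, suggested by the ranges starting at 0 for columns that contain missing values.

```python
from typing import Any, Dict, List, Optional


def profile_column(values: List[Optional[str]]) -> Dict[str, Any]:
    """Summarise a string column; None stands for NULL."""
    # Assumption: a NULL value contributes a length of 0 to the range.
    lengths = [0 if value is None else len(value) for value in values]
    return {
        "range": (min(lengths), max(lengths)),
        "total": len(values),
        "missing": sum(value is None for value in values),
        # Blank: non-empty values made up exclusively of whitespace.
        "blank": sum(value is not None and value != "" and value.strip() == ""
                     for value in values),
        # Distinct values, counting NULL as one value of its own.
        "distinct": len(set(values)),
    }


stats = profile_column(["Lisboa", "Encamp", None, "Lisboa"])
print(stats)
```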

Quality measures

Table 9. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
countrycode	100.00%	0.01%	PT	AS
postalcode	100.00%	46.50%	21825	AD100
placename	100.00%	51.07%	Lisboa	Encamp
adminname1	91.55%	0.10%	null	Canillo
admincode1	91.19%	0.03%	null	L93000001
adminname2	83.44%	0.97%	null	Rust Stadt
admincode2	78.61%	0.79%	null	941
adminname3	51.98%	2.61%	null	Rust
admincode3	27.69%	1.55%	null	10320
latitude	100.00%	24.01%	38.716700	-24.733300
longitude	100.00%	32.84%	-9.133300	-63.772200
accuracy	82.29%	0.00%	4	2
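The percentages in Table 9 appear consistent with two simple ratios applied to the counts in Table 8: completeness as the share of non-missing values, and uniqueness as the share of distinct values. The report does not give the formulas, so the sketch below is an assumption that happens to reproduce the reported figures. The function names are hypothetical.

```python
def completeness(total: int, missing: int) -> float:
    """Percentage of records whose value is not NULL."""
    return 100.0 * (total - missing) / total


def uniqueness(total: int, distinct: int) -> float:
    """Percentage of distinct values relative to the number of records."""
    return 100.0 * distinct / total


TOTAL = 1_550_798  # "Total" column of Table 8

# adminname1: 131,031 missing values
print(round(completeness(TOTAL, 131_031), 2))  # prints 91.55
# accuracy: 274,588 missing values
print(round(completeness(TOTAL, 274_588), 2))  # prints 82.29
# postalcode: 721,060 distinct values
print(round(uniqueness(TOTAL, 721_060), 2))    # prints 46.5
```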

Changes made to preparatory file

Empty data fields were replaced with NULL in the file allCountries_postcode_PREP_MR_230913.csv. The replacement is performed by the script ConvertTsv2Csv.py, which converts files to csv format according to WI-002 (Raw data preparation) of SOP-006 (Dataset preparation).

A total of 788 records were duplicates of 320 records. Most duplicated records occurred in Japan (73%), followed by Switzerland (13%), Mexico (7%), India (4%), Lithuania (2%), and Peru, Belarus and France (<1% each). Duplicates were removed from the file allCountries_postcode_PREP_MR_230913.csv.
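The report does not describe how duplicates were identified. A minimal sketch, assuming that a duplicate is a row identical to an earlier row in all columns and that the first occurrence is kept, could look as follows. The function names and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def drop_duplicates(rows: List[List[str]]) -> Tuple[List[List[str]], Counter]:
    """Keep the first occurrence of each row; count removed rows per country.

    Assumes the country code is the first field of every row.
    """
    seen = set()
    unique: List[List[str]] = []
    removed: Counter = Counter()
    for row in rows:
        key = tuple(row)
        if key in seen:
            removed[row[0]] += 1
        else:
            seen.add(key)
            unique.append(row)
    return unique, removed


def shares(removed: Counter) -> Dict[str, float]:
    """Percentage of removed rows per country code."""
    total = sum(removed.values())
    return {code: 100.0 * count / total for code, count in removed.items()}


# Made-up sample rows with placeholder postcodes.
sample_rows = [
    ["JP", "0000001", "Sample A"],
    ["JP", "0000001", "Sample A"],
    ["CH", "0000", "Sample B"],
    ["JP", "0000001", "Sample A"],
]
unique_rows, removed_rows = drop_duplicates(sample_rows)
print(len(unique_rows), dict(removed_rows), shares(removed_rows))
```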

Changes made to data

n/a

Unresolved issues

n/a

References

Anonymous. GeoNames. (En) GeoNames. [2025-09-30] www.geonames.org

Annex 1: Table column reports

Table: Worldwide postcodes

Column: countrycode

Table 10. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	countrycode
Description	A two-letter code that represents the country name, recommended by ISO standard 3166-1:2020.
Data type	String
Descriptor	iso-639:alpha-2LanguageCode [UID:0.0.LPHLN110]
Descriptor description	ISO 639-2 is the alpha-3 code in Codes for the representation of names of languages-- Part 2. There are 21 languages that have alternative codes for bibliographic or terminology purposes. In those cases, each is listed separately and they are designated as "B" (bibliographic) or "T" (terminology). In all other cases there is only one ISO 639-2 code. Multiple codes assigned to the same language are to be considered synonyms. ISO 639-1 is the alpha-2 code.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.LPHLN110
Unit	n/a

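Table 11. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).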

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
countrycode	2 - 2	n/a	AD	n/a	n/a	n/a	ZA	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	97 ( 0.0% )

Table 12. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
countrycode	100.00%	0.01%	PT	AS

Data Distribution Top 20

Figure 1. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 2. Distribution of 20 least common values, from lowest to highest.

Completeness

Figure 3. Visualization of completeness of the data in the column.

Uniqueness

Figure 4. Visualization of uniqueness of the data in the column.

Column: postalcode

Table 13. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	postalcode
Description	Postal code.
Data type	String
Descriptor	eurostat:postcode [UID:0.0.PSTCD378]
Descriptor description	A postal code (also known locally in various English-speaking countries throughout the world as a postcode, post code, PIN or ZIP Code) is a series of letters or digits or both, sometimes including spaces or punctuation, included in a postal address for the purpose of sorting mail.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.PSTCD378
Unit	n/a

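Table 14. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).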

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
postalcode	2 - 15	n/a	M9	n/a	n/a	n/a	78177 CITYSS…	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	721,060 ( 46.5% )

Table 15. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
postalcode	100.00%	46.50%	21825	AD100

Completeness

Figure 5. Visualization of completeness of the data in the column.

Uniqueness

Figure 6. Visualization of uniqueness of the data in the column.

Column: placename

Table 16. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	placename
Description	Name of the location.
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

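Table 17. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).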

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
placename	0 - 161	n/a		n/a	n/a	n/a	Hamilton (So…	1,550,798	1 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	792,066 ( 51.1% )

Table 18. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
placename	100.00%	51.07%	Lisboa	Encamp

Completeness

Figure 7. Visualization of completeness of the data in the column.

Uniqueness

Figure 8. Visualization of uniqueness of the data in the column.

Column: adminname1

Table 19. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	adminname1
Description	Name of the first-order administrative subdivision (state).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

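Table 20. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).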

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
adminname1	0 - 48	n/a		n/a	n/a	n/a	Región del L…	1,550,798	131,031 ( 8.4% )	0 ( 0.0% )	0 ( 0.0% )	1,560 ( 0.1% )

Table 21. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
adminname1	91.55%	0.10%	null	Canillo

Completeness

Figure 9. Visualization of completeness of the data in the column.

Uniqueness

Figure 10. Visualization of uniqueness of the data in the column.

Column: admincode1

Table 22. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	admincode1
Description	Code of the first-order administrative subdivision (state).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

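Table 23. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).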

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
admincode1	0 - 9	n/a		n/a	n/a	n/a	L93000001	1,550,798	136,574 ( 8.8% )	0 ( 0.0% )	0 ( 0.0% )	465 ( 0.0% )

Table 24. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
admincode1	91.19%	0.03%	null	L93000001

Data Distribution Top 20

Figure 11. Distribution of 20 most common values, from highest to lowest.

Data Distribution Bottom 20

Figure 12. Distribution of 20 least common values, from lowest to highest.

Completeness

Figure 13. Visualization of completeness of the data in the column.

Uniqueness

Figure 14. Visualization of uniqueness of the data in the column.

Column: adminname2

Table 25. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	adminname2
Description	Name of the second-order administrative subdivision (county/province).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

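Table 26. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).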

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
adminname2	0 - 49	n/a		n/a	n/a	n/a	Dolores Hida…	1,550,798	256,764 ( 16.6% )	0 ( 0.0% )	0 ( 0.0% )	15,043 ( 1.0% )

Table 27. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
adminname2	83.44%	0.97%	null	Rust Stadt

Completeness

Figure 15. Visualization of completeness of the data in the column.

Uniqueness

Figure 16. Visualization of uniqueness of the data in the column.

Column: admincode2

Table 28. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	admincode2
Description	Code of the second-order administrative subdivision (county/province).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

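Table 29. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).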

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
admincode2	0 - 9	n/a		n/a	n/a	n/a	S12000017	1,550,798	331,793 ( 21.4% )	0 ( 0.0% )	0 ( 0.0% )	12,201 ( 0.8% )

Table 30. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
admincode2	78.61%	0.79%	null	941

Completeness

Figure 17. Visualization of completeness of the data in the column.

Uniqueness

Figure 18. Visualization of uniqueness of the data in the column.

Column: adminname3

Table 31. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	adminname3
Description	Name of the third-order administrative subdivision (community).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

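Table 32. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).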

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
adminname3	0 - 51	n/a		n/a	n/a	n/a	San Leonardo…	1,550,798	744,704 ( 48.0% )	0 ( 0.0% )	0 ( 0.0% )	40,539 ( 2.6% )

Table 33. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
adminname3	51.98%	2.61%	null	Rust

Completeness

Figure 19. Visualization of completeness of the data in the column.

Uniqueness

Figure 20. Visualization of uniqueness of the data in the column.

Column: admincode3

Table 34. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	admincode3
Description	Code of the third-order administrative subdivision (community).
Data type	String
Descriptor	Text [UID:0.0.TEXTA315]
Descriptor description	In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.TEXTA315
Unit	n/a

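Table 35. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).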

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
admincode3	0 - 9	n/a		n/a	n/a	n/a	W06000011	1,550,798	1,121,361 ( 72.3% )	0 ( 0.0% )	0 ( 0.0% )	24,021 ( 1.5% )

Table 36. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
admincode3	27.69%	1.55%	null	10320

Completeness

Figure 21. Visualization of completeness of the data in the column.

Uniqueness

Figure 22. Visualization of uniqueness of the data in the column.

Column: latitude

Table 37. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	latitude
Description	estimated latitude (wgs84).
Data type	Decimal number
Descriptor	dwc:decimalLatitude [UID:0.0.LTTDE333]
Descriptor description	The geographic latitude (in decimal degrees, using the spatial reference system given in dwc:geodeticDatum) of the geographic center of a dcterms:Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
Descriptor target IRI	http://rs.tdwg.org/dwc/terms/decimalLatitude
Unit	°

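Table 38. Content analysis of the table. Column name (Name); range of length of characters (Length); arithmetic mean of values in column (Mean); lowest value in column (Min); first quartile of values in column (Q1); median of values in column (Median); third quartile of values in column (Q3); highest value in column (Max); number of records (Total); number and percentage (between brackets) of all values containing NULL (Missing), the value 0 (Zero), exclusively blank characters (Blank), and of distinct values including NULL, Zero and blank (Distinct).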

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
latitude	8 - 10	30.0224925	-89.997600	19.6338	37.19295	45.0161	90.000000	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	372,372 ( 24.0% )

Table 39. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
latitude	100.00%	24.01%	38.716700	-24.733300

Continuous Data Distribution

Figure 23. Distribution of values in the column.

Outliers

Figure 24. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 25. Visualization of completeness of the data in the column.

Uniqueness

Figure 26. Visualization of uniqueness of the data in the column.

Column: longitude

Table 40. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	longitude
Description	Estimated longitude (WGS84).
Data type	Decimal number
Descriptor	dwc:decimalLongitude [UID:0.0.LNGTD332]
Descriptor description	The geographic longitude (in decimal degrees, using the spatial reference system given in dwc:geodeticDatum) of the geographic center of a dcterms:Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
Descriptor target IRI	http://rs.tdwg.org/dwc/terms/decimalLongitude
Unit	°

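Table 41. Statistical analysis of the column. Column name (Column Name); range (Range); mean (Mean); minimum (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); maximum (Maximum); total number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).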

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
longitude	8 - 11	21.6569758	-179.260000	-8.8368	16.6563	81.315625	179.310000	1,550,798	0 ( 0.0% )	0 ( 0.0% )	0 ( 0.0% )	509,297 ( 32.8% )

Table 42. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
longitude	100.00%	32.84%	-9.133300	-63.772200
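
The descriptors for the latitude and longitude columns define legal ranges of -90 to 90 and -180 to 180 degrees, inclusive. A minimal sketch of a corresponding range check is given below, assuming coordinates are available as pairs of decimal numbers; the function name and the out-of-range sample are illustrative, while the first two samples use the minima and maxima reported for the two columns.

```python
def is_valid_coordinate(latitude: float, longitude: float) -> bool:
    """Check a coordinate pair against the legal Darwin Core ranges (inclusive)."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


samples = [
    (-89.9976, -179.26),  # reported minima of latitude and longitude
    (90.0, 179.31),       # reported maxima of latitude and longitude
    (91.0, 0.0),          # hypothetical out-of-range latitude
]

for lat, lon in samples:
    print(lat, lon, is_valid_coordinate(lat, lon))
```

Both reported extremes pass the check, which is consistent with the minimum and maximum values listed for the two columns.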

Continuous Data Distribution

Figure 27. Distribution of values in the column.

Outliers

Figure 28. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 29. Visualization of completeness of the data in the column.

Uniqueness

Figure 30. Visualization of uniqueness of the data in the column.

Column: accuracy

Table 43. Structural analysis of the table. Column name (Name); concise description of the column (Description); data type in which values are stored (Data type); EUPH-Descriptor (Descriptor); unit in which the values are provided (Unit).

Parameter	Content
Column name	accuracy
Description	Accuracy of latitude/longitude on a scale from 1 to 6 (1 = estimated, 4 = geonameid, 6 = centroid of addresses or shape).
Data type	Integer number
Descriptor	Integer [UID:0.0.NTGER313]
Descriptor description	A number with no fractional part, including the negative and positive numbers as well as zero.
Descriptor target IRI	https://app.pollinatorhub.eu/vocabulary/descriptors/0.0.NTGER313
Unit	n/a

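Table 44. Statistical analysis of the column. Column name (Column Name); range (Range); mean (Mean); minimum (Minimum); first quartile (Q₁); median (Median); third quartile (Q₃); maximum (Maximum); total number of records (Total); number and share of missing values (Missing); number and share of zero values (Zero); number and share of blank values (Blank); number and share of distinct values (Distinct).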

Column Name	Range	Mean	Minimum	Q₁	Median	Q₃	Maximum	Total	Missing	Zero	Blank	Distinct
accuracy	1 - 1	3.7	1	3	4	4	6	1,550,798	274,588 ( 17.7% )	0 ( 0.0% )	0 ( 0.0% )	7 ( 0.0% )

Table 45. Quality analysis of the table. Column name (Name); completeness of the column (Completeness); uniqueness of the column (Uniqueness); most common value in the column (Most Common Value); least common value in the column (Least Common Value).

Column Name	Completeness	Uniqueness	Most Common Value	Least Common Value
accuracy	82.29%	0.00%	4	2
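
The column description names only three accuracy codes (1, 4 and 6), while the column also contains other values (for example 2) and 17.7% missing entries. A minimal sketch of how the codes might be decoded is given below; the mapping and function name are illustrative, and codes without a documented meaning are deliberately reported as undocumented rather than guessed.

```python
from typing import Optional

# Only the codes named in the column description; other codes are not documented here.
ACCURACY_LABELS = {
    1: "estimated",
    4: "geonameid",
    6: "centroid of addresses or shape",
}


def describe_accuracy(code: Optional[int]) -> str:
    """Return a readable label for an accuracy code, handling missing values."""
    if code is None:
        return "missing"
    return ACCURACY_LABELS.get(code, f"undocumented code {code}")


for value in (4, 2, None):
    print(value, "->", describe_accuracy(value))
```

Keeping missing and undocumented codes as separate outcomes avoids silently treating the 274,588 missing records as low-accuracy coordinates.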

Data Distribution Top 20

Figure 31. Distribution of 20 most common values, from highest to lowest.

Continuous Data Distribution

Figure 32. Distribution of values in the column.

Outliers

Figure 33. Visualization of median, min, max, and outliers in the column.

Completeness

Figure 34. Visualization of completeness of the data in the column.

Uniqueness

Figure 35. Visualization of uniqueness of the data in the column.