%load_ext watermark
import pandas as pd
import numpy as np
from typing import Type, Optional, Callable
from typing import List, Dict, Union

from review_methods_tests import collect_vitals, find_missing, find_missing_loc_dates
from review_methods_tests import use_gfrags_gfoams_gcaps, make_a_summary, combine_survey_files

import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.colors
from matplotlib.colors import LinearSegmentedColormap, ListedColormap

from setvariables import *

Testing data models#

The methods used in the version of the federal report were tested, but there was no specific set of validation criteria defined beforehand. Tests were done as the work progressed, which wasted a lot of time.

Here we test the land use and survey data models.

  1. is the land use data complete for each survey location?

  2. does the survey data aggregate correctly to sample level?

    • what happens to objects with a quantity of zero?

    • aggregating to cantonal, municipal or survey area

      • are all locations included?

      • are lakes and rivers distinguished?

  3. Does the aggregated data for IQAASL match the federal report?

Gfoams, Gfrags, Gcaps#

These are aggregate groups. It is difficult to infer how well a participant differentiates between the size or use of the objects covered by the following codes.

  1. Gfrags: G79, G78, G75

  2. Gfoams: G81, G82, G76

  3. Gcaps: G21, G22, G23, G24

These aggregate groups are used when comparing values between sampling campaigns.
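
The conversion from individual codes to these group codes is handled by the imported use_gfrags_gfoams_gcaps; its implementation is not shown here. A minimal sketch of the idea, assuming a survey dataframe with a code column and using the code lists above, is a replacement map applied before aggregating:

import pandas as pd

# hypothetical mapping from individual codes to the aggregate group codes
code_to_group = {
    "G79": "Gfrags", "G78": "Gfrags", "G75": "Gfrags",
    "G81": "Gfoams", "G82": "Gfoams", "G76": "Gfoams",
    "G21": "Gcaps", "G22": "Gcaps", "G23": "Gcaps", "G24": "Gcaps",
}

def to_aggregate_codes(df: pd.DataFrame) -> pd.DataFrame:
    # replace the individual codes with their group label; other codes pass through unchanged
    out = df.copy()
    out["code"] = out["code"].replace(code_to_group)
    return out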

Sampling campaigns#

The dates of the sampling campaigns are expanded to include the surveys that happened between large organized campaigns. The start and end dates are defined below.

Attention! The codes used for each survey campaign are different. Different groups organized and conducted surveys using the MLW protocol; the data was then sent to us.

MCBP: November 2015 - November 2016. The initial sampling campaign. Fragmented plastics (Gfrags/G79/G78/G76) were not sorted by size. All unidentified hard plastic items were classified in this manner.

  • start_date = 2015-11-15

  • end_date = 2017-03-31

SLR: April 2017 - May 2018. Sampling campaign by the WWF. Objects less than 2.5 cm were not counted.

  • start_date = 2017-04-01

  • end_date = 2020-03-31

IQAASL: April 2020 - May 2021. Sampling campaign mandated by the Swiss Confederation. Additional codes were added for regional objects.

  • start_date = 2020-04-01

  • end_date = 2021-05-31

Plastock (not added yet): January 2022 - December 2022. Sampling campaign from the Association pour la Sauvegarde du Léman. Not all objects were counted; only a limited number of objects were identified.
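
The start and end dates above are carried in a period_dates mapping, which the cells below expect from the setvariables star import. A minimal sketch of that mapping, using the dates listed above (Plastock omitted because it is not added yet):

# sketch of the campaign boundaries; the working value comes from setvariables
period_dates = {
    "mcbp": ("2015-11-15", "2017-03-31"),
    "slr": ("2017-04-01", "2020-03-31"),
    "iqaasl": ("2020-04-01", "2021-05-31"),
}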

Feature name#

The feature name is the name of a river, lake, or other regional label that you would find on a map. People in the region know the name.

Feature type#

The feature type is a label that applies to the general conditions of use for the location and other locations in the region.

  • r: rivers: surveys on river banks

  • l: lake: surveys on the lake shore

  • p: parcs: surveys in recreational areas
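
A minimal lookup of these labels, as a hypothetical sketch of how the feature_type column can be read:

# hypothetical lookup for the feature-type labels described above
feature_types = {
    "r": "river: surveys on river banks",
    "l": "lake: surveys on the lake shore",
    "p": "parc: surveys in recreational areas",
}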

Parent boundary#

Designates the larger geographic region of the survey location. For lakes and rivers it is the name of the catchment area or river basin. For parcs it is the type of park, e.g. les Alpes. Recall that each feature has a name; for example, Alpes Lépontines is the name of a feature in the geographic region of Les Alpes.

Aggregate a set of data by sample (location and date)#

Use the loc_date column in the survey data. Use the IQAASL period and the four river basins to test against the federal report.
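
The date slicing below is done with slice_data_by_date; its definition is not shown here. A minimal sketch of an equivalent filter, assuming the survey data carries a date column named "date":

import pandas as pd

def slice_by_date(df: pd.DataFrame, start: str, end: str, date_column: str = "date") -> pd.DataFrame:
    # keep only the samples whose date falls between the campaign boundaries (inclusive)
    dates = pd.to_datetime(df[date_column])
    mask = (dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))
    return df[mask].copy()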

Before aggregating, do the number of locations, cities, samples and the quantity match the federal report?#

The feature types include lakes and rivers; the Alpes were considered separately.

From https://hammerdirt-analyst.github.io/IQAASL-End-0f-Sampling-2021/lakes_rivers.html#

  1. cities = yes

  2. samples = yes

  3. locations = yes

  4. quantity = No, it is short by 50 pieces

  5. start and end date = yes

# starting variables
period = "iqaasl"
survey_areas = ["rhone", "ticino", "linth", "aare"]
start, end = [*period_dates[period]]
# the survey data sliced by the start and end date
survey_data = slice_data_by_date(surveys.copy(), start, end)

# the surveys from the survey areas of interest
feature_d= survey_data[survey_data.parent_boundary.isin(survey_areas)].copy()

# convert codes to gfrags, gcaps and gfoams
feature_data = use_gfrags_gfoams_gcaps(feature_d.copy(), codes)

# check the numbers
feature_vitals = collect_vitals(feature_d)
print(make_a_summary(feature_vitals))
    Number of objects: 54694
    
    Median pieces/meter: 0.0
    
    Number of samples: 386
    
    Number of unique codes: 235
    
    Number of sample locations: 143
    
    Number of features: 28
    
    Number of cities: 77
    
    Start date: 2020-03-08
    
    End date: 2021-05-12
    
    
# when the codes are changed to gfrags, gfoams and gcaps there can be
# multiple results for the same code in the same sample
# note that code_result_columns does not include the groupname column
# because the code is changed, not the groupname
code_result_df = aggregate_dataframe(feature_data.copy(), code_result_columns, unit_agg)
code_result_df = code_result_df.merge(codes.groupname, left_on="code", right_index=True)
code_result_df = code_result_df.merge(beaches[["canton","feature_type"]], left_on='slug', right_index=True, validate="many_to_one")
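
aggregate_dataframe does the grouping in the cell above; its implementation is not shown here. A minimal sketch of the same idea, assuming code_result_columns is a list of grouping columns and unit_agg is a column-to-function mapping such as {"quantity": "sum", "pcs_m": "sum"}:

import pandas as pd

def aggregate_by_columns(df: pd.DataFrame, group_columns: list, agg_map: dict) -> pd.DataFrame:
    # group the survey records and apply the aggregation mapping to each column
    return df.groupby(group_columns, as_index=False).agg(agg_map)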

Number of lakes, rivers, parcs, cities and cantons#

  Échantillons Municipalités Lacs Rivières Parcs Quantité
St. Gallen 38 5 2 3 0 3'614
Aargau 4 4 0 2 0 101
Bern 88 21 4 4 0 8'786
Solothurn 3 2 0 1 0 66
Vaud 87 14 2 2 0 17'414
Tessin 28 7 2 3 0 3'023
Genève 20 2 1 1 0 4'962
Neuchâtel 16 4 2 0 0 2'375
Glarus 16 2 1 2 0 1'016
Valais 15 5 1 1 0 7'638
Zürich 49 6 1 3 0 4'543
Fribourg 14 2 1 0 0 930
Schwyz 1 1 1 0 0 104
Zug 4 2 1 1 0 64
Luzern 3 1 1 0 0 58

Aggregate to sample#

The assessments are made on a per-sample basis. That means we can look at the value of an individual object at each sample. The sum of all the individual objects in a survey is the total for that survey. Dividing the total by the length of the survey gives the assessment metric: pieces of trash per meter.
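
A minimal worked example of that metric, with hypothetical numbers:

# hypothetical sample: the object counts from one survey and the surveyed length
sample_quantities = [12, 3, 7, 0, 5]    # counts of the individual objects
survey_length_m = 50                    # length of shoreline surveyed, in meters

sample_total = sum(sample_quantities)           # 27 pieces in total
pcs_per_meter = sample_total / survey_length_m  # 27 / 50 = 0.54 pcs/m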

  1. Are the quantiles of the current data = to the federal report? Yes

  2. Are the material totals = to the federal report? No, plastic is off by 50 pcs

  3. Are the fail rates of the most common objects = to the federal report? Yes

  4. Is the % of total of the most common objects = to the federal report? Yes

  5. Is the median pieces/meter of the most common objects = to the federal report? Yes

  6. Is the quantity of the most common objects = to the federal report? Yes

The summary of survey totals#

fig 1.5 in IQAASL

# the sample is the basic unit
# loc_date is the unique identifier for each sample
unit_columns = ["loc_date", "slug", "parent_boundary"]

# the quantiles of the sample-total pcs/m  
vector_summary = a_summary_of_one_vector(code_result_df.copy(), unit_columns, unit_agg, describe='pcs_m')

translated_and_style_for_display(vector_summary,l_mapi, lang, gradient=False)
  Pcs/M
Échantillons 386
Moyenne 3,95
Écart-Type 7,06
Min 0,02
25% 0,82
50% 1,90
75% 3,87
Max 66,17
Total 54'694

Material totals and proportions#

fig 1.5 in IQAASL

# add the material label to each code
merged_result = merge_dataframes_on_column_and_index(code_result_df.copy(), codes["material"], 'code', how='inner', validate=True)

# sum the materials for the data frame
materials = aggregate_dataframe(merged_result.copy(), ["material"], {"quantity":"sum"})

# add % of total for display
materials["%"] = materials.quantity/materials.quantity.sum()

translated_and_style_for_display(materials.set_index('material', drop=True),l_mapi, lang, gradient=False)
  Quantité % Du Total
Chimique 140 0,00
Tissu 343 0,01
Verre 2'919 0,05
Métal 1'874 0,03
Papier 1'527 0,03
Plastique 47'093 0,86
Caoutchouc 390 0,01
Non-Identifié 2 0,00
Bois 406 0,01

Quantity, median pcs/m, fail rate, and % of total#

Summary results for all the codes in the parent_boundary

# sum the cumulative quantity for each code and calculate the median pcs/meter
code_totals = aggregate_dataframe(code_result_df.copy(), ["code"], {"quantity":"sum", "pcs_m":"median"})

# collect all the codes ranked by total quantity
abundant = get_top_x_records_with_max_quantity(code_totals.copy(), "quantity", "code", len(code_totals))

# identify the objects that were found in at least 50% of the samples
# calculate the quantity per sample for each code and sample
occurrences = aggregate_dataframe(code_result_df, ["loc_date", "code"], {"quantity":"sum"})

# count the number of times that an object was counted > 0
# and divide it by the total number of samples 
event_counts  = count_objects_with_positive_quantity(occurrences)

# calculate the rate of occurrence per unit of measure
rates = calculate_rate_per_unit(code_result_df, code_result_df.code.unique())

# add the unit rates and fail rates
abundance = merge_dataframes_on_column_and_index(abundant, rates["pcs_m"], left_column="code", validate="one_to_one")
abundance["fail rate"] = abundance.code.apply(lambda x: event_counts.loc[x])

# this is the complete inventory with summary
# statistics for each object
abundance.sort_values(by="quantity", inplace=True, ascending=False)
abundance.reset_index(inplace=True, drop=True)
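
count_objects_with_positive_quantity and calculate_rate_per_unit come from the project's helper code; their definitions are not shown here. A minimal sketch of the fail-rate calculation described in the comments above, assuming one row per (loc_date, code) with the summed quantity:

import pandas as pd

def fail_rate(per_sample: pd.DataFrame) -> pd.Series:
    # share of samples in which each code was counted at least once
    n_samples = per_sample["loc_date"].nunique()
    found = per_sample[per_sample["quantity"] > 0]
    return found.groupby("code")["loc_date"].nunique() / n_samples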

The most common objects#

fig 1.6 in IQAASL

# arguments to slice the data by column
column_one = {
    'column': 'quantity',
    'val': abundance.loc[10, 'quantity']
}

column_two = {
    'column':'fail rate',
    'val': 0.5
}

# use the inventory to find the most common objects
the_most_common = display_tabular_data_by_column_values(abundance.copy(), column_one, column_two, 'code')

translated_and_style_for_display(the_most_common.copy(),l_mapi, lang, gradient=False)
  Quantité % Du Total Pcs/M Taux D'Échec
Mégots Et Filtres À Cigarettes 8'485 0,16 0,20 0,88
Fragments De Plastique: G80, G79, G78, G75 7'400 0,14 0,18 0,86
Fragments De Polystyrène Expansé: G76, G81, G82, G83 5'559 0,10 0,05 0,69
Emballages De Bonbons, De Snacks 3'325 0,06 0,09 0,85
Bâche, Feuille Plastique Industrielle 2'534 0,05 0,05 0,70
Verre Brisé 2'136 0,04 0,03 0,65
Pellets Industriels (Gpi) 1'968 0,04 0,00 0,31
Couvercles En Plastique Bouteille: G21, G22, G23, G24 1'844 0,03 0,03 0,65
Mousse De Plastique Pour L'Isolation Thermique 1'656 0,03 0,01 0,53
Coton-Tige 1'406 0,03 0,01 0,51
Polystyrène < 5Mm 1'209 0,02 0,00 0,26
Déchets De Construction En Plastique 992 0,02 0,01 0,52
Bouchons De Bouteilles En Métal, Couvercles Et Tirettes 700 0,01 0,01 0,52
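
The selection above uses a quantity cutoff (column_one) and a fail-rate threshold of 0.5 (column_two). A minimal sketch of that filter, assuming the two criteria are combined as a union, i.e. an object qualifies if it meets either one:

import pandas as pd

def most_common(inventory: pd.DataFrame, quantity_cutoff: float, rate_cutoff: float = 0.5) -> pd.DataFrame:
    # objects with a large total quantity or found in at least half of the samples
    qualifies = (inventory["quantity"] >= quantity_cutoff) | (inventory["fail rate"] >= rate_cutoff)
    return inventory[qualifies].copy()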

Results by groupname and feature boundary#

cumulative_columns = ["loc_date", "groupname"]
unit_columns = ["parent_boundary", "loc_date", "groupname"]
object_labels = code_result_df.groupname.unique()
object_columns = ["groupname"]
boundary_labels = code_result_df.parent_boundary.unique()

args = {
    'cumulative_columns':cumulative_columns,
    'object_labels':object_labels,
    'boundary_labels':boundary_labels,
    'object_columns':object_columns,
    'unit_agg':unit_agg,
    'unit_columns':unit_columns,
    'agg_groups':agg_groups
}

tix = summary_of_parent_and_child_features(code_result_df.copy(), **args)
translated_and_style_for_display(tix,l_mapi, lang, gradient=True)
  Linth Aare Rhône Ticino Cumulé
Agriculture 0,03 0,06 0,14 0,06 0,07
Nourriture Et Boissons 0,28 0,25 0,70 0,28 0,34
Infrastructures 0,12 0,14 0,55 0,21 0,20
Micro-Plastiques (< 5Mm) 0,00 0,01 0,11 0,00 0,01
Emballage Non Alimentaire 0,13 0,09 0,21 0,08 0,13
Articles Personnels 0,04 0,04 0,10 0,07 0,06
Morceaux De Plastique 0,11 0,18 0,48 0,10 0,18
Loisirs 0,04 0,06 0,17 0,04 0,06
Tabac 0,27 0,15 0,50 0,18 0,25
Non Classé 0,00 0,00 0,02 0,00 0,00
Eaux Usées 0,01 0,03 0,19 0,02 0,03

Most common codes by feature boundary#

cumulative_columns = ["loc_date", "code"]
unit_columns = ["parent_boundary", "loc_date", "code"]
codes_of_interest = the_most_common.index
object_columns = ["code"]
boundary_labels = code_result_df.parent_boundary.unique()

data = code_result_df[code_result_df.code.isin(codes_of_interest)].copy()

args = {
    'cumulative_columns':cumulative_columns,
    'object_labels':codes_of_interest,
    'boundary_labels':boundary_labels,
    'object_columns':object_columns,
    'unit_agg':unit_agg,
    'unit_columns':unit_columns,
    'agg_groups':agg_groups
}

tix = summary_of_parent_and_child_features(data.copy(), **args)

translated_and_style_for_display(tix,l_mapi, lang, gradient=True)
  Linth Aare Rhône Ticino Cumulé
Pellets Industriels (Gpi) 0,00 0,00 0,00 0,00 0,00
Polystyrène < 5Mm 0,00 0,00 0,00 0,00 0,00
Bouchons De Bouteilles En Métal, Couvercles Et Tirettes 0,01 0,00 0,03 0,01 0,01
Verre Brisé 0,04 0,03 0,02 0,08 0,03
Mégots Et Filtres À Cigarettes 0,23 0,11 0,42 0,15 0,20
Emballages De Bonbons, De Snacks 0,06 0,08 0,19 0,04 0,09
Bâche, Feuille Plastique Industrielle 0,02 0,05 0,09 0,04 0,05
Mousse De Plastique Pour L'Isolation Thermique 0,00 0,00 0,07 0,03 0,01
Déchets De Construction En Plastique 0,00 0,00 0,06 0,03 0,01
Coton-Tige 0,00 0,00 0,11 0,00 0,01
Couvercles En Plastique Bouteille: G21, G22, G23, G24 0,03 0,02 0,10 0,00 0,03
Fragments De Polystyrène Expansé: G76, G81, G82, G83 0,03 0,04 0,17 0,05 0,05
Fragments De Plastique: G80, G79, G78, G75 0,11 0,18 0,48 0,10 0,18

Most common codes by canton#

unit_columns = ["canton", "loc_date", "code"]
object_columns = ["code"]
boundary_labels = code_result_df.canton.unique()

data = code_result_df[code_result_df.code.isin(codes_of_interest)].copy()

args = {
    'cumulative_columns':cumulative_columns,
    'object_labels':codes_of_interest,
    'boundary_labels':boundary_labels,
    'object_columns':object_columns,
    'unit_agg':unit_agg,
    'unit_columns':unit_columns,
    'agg_groups':agg_groups
}

tix = summary_of_parent_and_child_features(data, **args)

translated_and_style_for_display(tix.T,l_mapi, lang, gradient=True)
  Aargau Bern Fribourg Genève Glarus Luzern Neuchâtel Schwyz Solothurn St. Gallen Tessin Valais Vaud Zug Zürich Cumulé
Pellets Industriels (Gpi) 0,00 0,00 0,00 0,03 0,00 0,00 0,00 0,16 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
Polystyrène < 5Mm 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
Bouchons De Bouteilles En Métal, Couvercles Et Tirettes 0,00 0,00 0,00 0,07 0,01 0,00 0,05 0,00 0,03 0,00 0,01 0,02 0,02 0,00 0,05 0,01
Verre Brisé 0,00 0,03 0,03 0,01 0,00 0,00 0,21 0,00 0,00 0,05 0,08 0,00 0,03 0,00 0,12 0,03
Mégots Et Filtres À Cigarettes 0,01 0,11 0,11 0,17 0,17 0,08 0,33 0,00 0,06 0,23 0,15 0,06 0,47 0,07 0,46 0,20
Emballages De Bonbons, De Snacks 0,03 0,08 0,06 0,16 0,07 0,05 0,12 0,20 0,02 0,10 0,04 0,50 0,12 0,01 0,07 0,09
Bâche, Feuille Plastique Industrielle 0,00 0,08 0,00 0,00 0,04 0,00 0,07 0,15 0,00 0,11 0,04 0,53 0,08 0,00 0,00 0,05
Mousse De Plastique Pour L'Isolation Thermique 0,00 0,01 0,00 0,01 0,01 0,00 0,01 0,00 0,00 0,00 0,03 0,28 0,06 0,00 0,00 0,01
Déchets De Construction En Plastique 0,00 0,01 0,00 0,00 0,01 0,00 0,04 0,00 0,00 0,01 0,03 0,44 0,04 0,01 0,00 0,01
Coton-Tige 0,00 0,00 0,00 0,02 0,00 0,00 0,02 0,01 0,03 0,00 0,00 0,68 0,08 0,00 0,00 0,01
Couvercles En Plastique Bouteille: G21, G22, G23, G24 0,00 0,03 0,00 0,04 0,01 0,00 0,02 0,08 0,00 0,05 0,00 0,96 0,09 0,03 0,03 0,03
Fragments De Polystyrène Expansé: G76, G81, G82, G83 0,03 0,06 0,04 0,01 0,08 0,04 0,02 0,01 0,00 0,08 0,05 1,06 0,18 0,00 0,00 0,05
Fragments De Plastique: G80, G79, G78, G75 0,04 0,25 0,06 0,20 0,07 0,00 0,21 0,08 0,06 0,22 0,10 2,04 0,42 0,00 0,10 0,18

Most common codes: canton-municipal#

Bern#

canton = "Bern"

with_cantons = code_result_df[code_result_df.canton == canton].copy()

unit_columns = ["city", "loc_date", "code"]
# the column that holds the labels of interest
object_columns = ["code"]
# the labels of interest for the boundary conditions
boundary_labels = with_cantons.city.unique()

ddata = with_cantons[(with_cantons.code.isin(codes_of_interest)) & (with_cantons.canton == "Bern")].copy()

args = {
    'cumulative_columns':cumulative_columns,
    'object_labels':codes_of_interest,
    'boundary_labels':boundary_labels,
    'object_columns':object_columns,
    'unit_agg':unit_agg,
    'unit_columns':unit_columns,
    'agg_groups':agg_groups
}

tix = summary_of_parent_and_child_features(ddata, **args)
translated_and_style_for_display(tix.T,l_mapi, lang, gradient=True)
  Beatenberg Bern Biel/Bienne Brienz (Be) Brügg Burgdorf Bönigen Erlach Gals Kallnach Köniz Ligerz Lüscherz Nidau Port Rubigen Spiez Thun Unterseen Vinelz Walperswil Cumulé
Pellets Industriels (Gpi) 0,03 0,00 0,04 0,00 0,00 0,00 0,04 0,00 0,00 0,10 0,00 0,00 0,00 0,08 0,01 0,00 0,00 0,00 0,00 0,10 0,00 0,00
Polystyrène < 5Mm 0,00 0,00 0,06 0,00 0,00 0,00 0,00 0,00 0,17 0,00 0,00 0,00 0,00 0,04 0,00 0,00 0,00 0,00 0,02 0,00 0,00 0,00
Bouchons De Bouteilles En Métal, Couvercles Et Tirettes 0,00 0,00 0,02 0,02 0,00 0,00 0,00 0,02 0,06 0,03 0,00 0,07 0,00 0,00 0,03 0,00 0,00 0,00 0,00 0,00 0,00 0,00
Verre Brisé 0,02 0,00 0,05 0,00 0,36 0,00 0,00 0,02 0,07 0,00 0,00 1,00 0,20 0,12 0,01 0,00 0,13 0,00 0,02 0,08 0,00 0,03
Mégots Et Filtres À Cigarettes 0,55 0,01 0,81 0,00 0,28 0,07 1,19 0,44 0,09 0,00 0,09 0,76 0,04 0,00 0,63 0,06 0,04 0,23 0,55 0,04 0,00 0,11
Emballages De Bonbons, De Snacks 0,12 0,01 0,34 0,39 0,00 0,02 0,06 0,05 0,12 0,08 0,00 0,78 0,02 0,60 0,07 0,11 0,00 0,09 0,11 0,18 0,00 0,08
Bâche, Feuille Plastique Industrielle 0,03 0,01 0,17 0,67 0,00 0,22 0,15 0,00 0,03 0,32 0,00 1,19 0,05 0,40 0,00 0,00 0,01 0,13 0,13 0,37 0,00 0,08
Mousse De Plastique Pour L'Isolation Thermique 0,42 0,00 0,05 0,02 0,00 0,00 0,10 0,00 0,00 0,00 0,00 0,21 0,00 0,00 0,01 0,00 0,00 0,00 0,09 0,00 0,19 0,01
Déchets De Construction En Plastique 0,00 0,00 0,06 0,00 0,00 0,00 0,04 0,02 0,00 0,00 0,00 0,00 0,00 0,04 0,00 0,00 0,00 0,00 0,01 0,07 0,00 0,01
Coton-Tige 0,04 0,00 0,05 0,06 0,00 0,00 0,06 0,12 0,06 0,07 0,00 0,12 0,00 0,00 0,00 0,00 0,00 0,01 0,02 0,00 0,00 0,00
Couvercles En Plastique Bouteille: G21, G22, G23, G24 0,12 0,00 0,08 0,28 0,00 0,00 0,04 0,06 0,00 0,06 0,00 0,06 0,00 0,04 0,03 0,06 0,00 0,02 0,04 0,08 0,03 0,03
Fragments De Polystyrène Expansé: G76, G81, G82, G83 0,18 0,00 0,16 0,22 0,00 0,00 0,07 0,02 0,00 0,14 0,00 0,06 0,00 0,04 0,00 0,00 0,07 0,16 0,11 0,00 0,00 0,06
Fragments De Plastique: G80, G79, G78, G75 0,44 0,01 0,48 0,39 0,03 0,02 1,01 0,49 0,28 0,12 0,00 1,94 0,14 0,64 0,22 0,00 0,04 0,24 0,21 1,28 0,00 0,25

Valais#

canton = "Valais"

with_cantons = code_result_df[code_result_df.canton == canton].copy()

unit_columns = ["city", "loc_date", "code"]
# the column that holds the labels of interest
object_columns = ["code"]
# the labels of interest for the boundary conditions
boundary_labels = with_cantons.city.unique()

ddata = with_cantons[(with_cantons.code.isin(codes_of_interest))].copy()

args = {
    'cumulative_columns':cumulative_columns,
    'object_labels':codes_of_interest,
    'boundary_labels':boundary_labels,
    'object_columns':object_columns,
    'unit_agg':unit_agg,
    'unit_columns':unit_columns,
    'agg_groups':agg_groups
}

tix = summary_of_parent_and_child_features(ddata, **args)
translated_and_style_for_display(tix,l_mapi, lang, gradient=True)
  Saint-Gingolph Riddes Sion Leuk Salgesch Cumulé
Pellets Industriels (Gpi) 0,03 0,00 0,00 0,00 0,00 0,00
Polystyrène < 5Mm 0,18 0,00 0,00 0,00 0,00 0,00
Bouchons De Bouteilles En Métal, Couvercles Et Tirettes 0,03 0,03 0,00 0,00 0,00 0,02
Verre Brisé 0,03 0,00 0,00 0,00 0,00 0,00
Mégots Et Filtres À Cigarettes 0,13 0,00 0,04 0,00 0,00 0,06
Emballages De Bonbons, De Snacks 0,77 0,00 0,02 0,00 0,00 0,50
Bâche, Feuille Plastique Industrielle 0,92 0,00 0,00 0,06 0,30 0,53
Mousse De Plastique Pour L'Isolation Thermique 0,58 0,00 0,00 0,00 0,00 0,28
Déchets De Construction En Plastique 0,85 0,00 0,00 0,08 0,00 0,44
Coton-Tige 1,25 0,00 0,00 0,00 0,00 0,68
Couvercles En Plastique Bouteille: G21, G22, G23, G24 1,66 0,00 0,00 0,00 0,00 0,96
Fragments De Polystyrène Expansé: G76, G81, G82, G83 3,34 0,00 0,02 0,02 0,00 1,06
Fragments De Plastique: G80, G79, G78, G75 2,56 0,00 0,06 0,02 0,00 2,04
Author: hammerdirt-analyst

conda environment: cantonal_report

numpy     : 1.25.2
matplotlib: 3.7.1
pandas    : 2.0.3