Version 13 (modified by cedric, 14 years ago) (diff) |
---|
back to first page ..
back to Recipes for EDA CookBook Eda
How to join the CCM and ROE database and avoid projection pb, this is version 2 of the script
- How to join the CCM and ROE database and avoid projection pb, this is …
see Joining ROE and the CCM layer CookBook join ROE_CCM
for the first version.
The trouble is that some dams or electrofishing stations are sometimes badly projected
We should start at step 6
1. putting bd_carthage into postgis
hylcov_arc and hylcov_node are created in a database bd_carthage see CookBook join BDMAP_CCM v2
2. getting distance sea and distance source for bd_carthage
This was also done , though imperfectly (some dist source are missing and there are differerences between ccm and bd_carthage)
CookBook join BDMAP_CCM v2
Finally we have a table called hylcov_arc_dist2 where distance_source are included from Cédric calculataions (noeuds), distance_sea correspond to eda1.0 and strahler correpond to Laurent calculations.
3. Get distance into the roe table
31. Saving table geobs2010.obstacle_referentiel from bd_carthage to eda2.0
cd C:\eda\backup C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump -U postgres -p 5433 -t geobs2010.obstacle_referentiel eda2.0> obstacle_referentiel.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d bd_carthage -p 5433 -U postgres -f obstacle_referentiel.sql REM une erreur quand il essaye de mettre le schéma bd_map mais là on a pas besoin d'avoir les données dans un schéma et d'autres liées à l'absence des autres tables..; Voir par la suite si ce n'est pas handicapant mais normalement non car on ne sort qu'une table faisant la jointure...
at this stage it is usefull to check the geometry column table and correct it eventually
32. project stationsp2 again and get table of all projections within 300
Some stations were missing...
- first create two index to speed up queries
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d bd_carthage -p 5433 -U postgres -c"CREATE INDEX indexobstacle_referentiel ON obstacle_referentiel USING GIST(ref_position_etrs89 GIST_GEOMETRY_OPS);
-- Here we project dams on the bd_carthage -- The result is stored in table roe_bd_carthage -- I'm changing ref_position_etrs89 => the_geom DROP TABLE IF EXISTS roe_bd_carthage; CREATE TABLE roe_bd_carthage as ( SELECT distinct on (ref_id) ref_id, dist_source,dist_sea,strahler,id_trhyd, min(distance) as distance, the_geom FROM ( SELECT ob.ref_id, n.distance_mer as dist_sea,n.strahler ,d.id_trhyd ,d.dist_source_max as dist_source, CAST(distance(d.the_geom, ob.ref_position_etrs89) as decimal(15,1)) as distance,ob.ref_position_etrs89 as the_geom FROM obstacle_referentiel As ob INNER JOIN hylcov_arc_dist2 d ON ST_DWithin(d.the_geom, ob.the_geom,300) LEFT JOIN noeud_troncon_final n ON d.id_som_f = n.id_bdcarthage -- pour récupérer les strahler ORDER BY ob.ref_id) as sub GROUP BY ref_id, distance, dist_source,dist_sea,strahler, id_trhyd, the_geom ); alter table roe_bd_carthage add column id serial; -- mise à jour de la table geometry_columns INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, "type") SELECT '', 'public', 'roe_bd_carthage', 'the_geom', ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom) FROM roe_bd_carthage LIMIT 1; -- creation d'index, clé primaire, et constraintes qui vont bien alter table roe_bd_carthage add CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2); alter table roe_bd_carthage add CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL); alter table roe_bd_carthage add CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 3035); alter table roe_bd_carthage ADD CONSTRAINT pk_id_roe_bd_carthage PRIMARY KEY(id); CREATE INDEX indexroe_bd_carthage ON roe_bd_carthage USING GIST ( the_geom GIST_GEOMETRY_OPS );
33. Saving back table stationsp2 from bd_carthage to eda
cd C:\eda\backup C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump -U postgres -p 5433 -t roe_bd_carthage bd_carthage> roe_bd_carthage.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "drop table if exists geobs2010.roe_bd_carthage" C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -f roe_bd_carthage.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "alter table roe_bd_carthage set schema geobs2010"
4. define criteria helping to do the projection
-- the table geobs2010.roe_ccm_300 was created in CookBook join ROE_CCM but we need to add some columns
-- launch in eda2.0 DROP TABLE IF EXISTS geobs2010.roe_ccm_300_2; CREATE TABLE geobs2010.roe_ccm_300_2 as ( SELECT ref_id, r.cum_len_sea as dist_sea_ccm, r.distance_source as dist_source_ccm, r.strahler as strahler_ccm,gid ,CAST(distance(r.the_geom, s.ref_position_etrs89) as decimal(15,1)) as distance,s.ref_position_etrs89 FROM geobs2010.obstacle_referentiel As s INNER JOIN ccm21.riversegments r ON ST_DWithin(r.the_geom, s.ref_position_etrs89,300) ORDER BY ref_id ); alter table geobs2010.roe_ccm_300_2 add column id serial;
select * from geobs2010.roe_bd_carthage b full outer join geobs2010.roe_ccm_300_2 c on b.ref_id=c.ref_id;
The table from the request joining geobs2010.roe_bd_carthage with the geobs2010.roe_ccm_300 is exported there (use .csv exort to C:\Documents and Settings\cedric\Mes documents\Migrateur\programmes\workspace3.5\EDAdata\dataEDAccm\export_jointure_roe_ccm.csv
Some processing with R and some results in carto_trial.R
We are trying to see if distance_source distance_sea and strahler are consistent between roe and ccm
#--------------------------------------------------------------------------------------------- #--------------------------------------------------------------------------------------------- # analyse des relations entre distance source et distance mer pour les projections sur la ccm # II cas des barrages #--------------------------------------------------------------------------------------------- #--------------------------------------------------------------------------------------------- source("EDACCM/init.r") library(lattice) library(ggplot2) ccm_bd<-read.csv(paste(datawd,"/dataEDAccm/export_jointure_roe_ccm.csv",sep=""),sep=";",stringsAsFactors = FALSE) str(ccm_bd) ccm_bd$dist_source_ccm=ccm_bd$dist_source_ccm/1000 ccm_bd$dist_sea_ccm=ccm_bd$dist_sea_ccm/1000 ccm_bd$pbstrahler=ccm_bd$strahler_ccm>(ccm_bd$strahler+1) # which strahler are larger in ccm than in bd_carthage, allowing diff of one ratio_dist_sea=ccm_bd$dist_sea_ccm/ccm_bd$dist_sea ccm_bd$pb_ratio_dist_sea<-ratio_dist_sea>2&ccm_bd$dist_sea_ccm>100 # la distance mer est le tripe pour la ccm (peut arriver pour les faibles valeurs) layout(matrix(c(1,2), 2, 1, byrow = TRUE)) hist(ccm_bd$dist_source,100) hist(ccm_bd$dist_source,100,ylim=c(0,50)) diffsource<-ccm_bd$dist_source_ccm-ccm_bd$dist_source ratiosource<-ccm_bd$dist_source_ccm/ccm_bd$dist_source ccm_bd$pbdiffsource<-!((diffsource>-20&diffsource< 20)|(ratiosource>0.7&ratiosource<1.3)|ccm_bd$dist_source>100&ccm_bd$dist_source_ccm>100) layout(matrix(c(1,2), 1, 2, byrow = TRUE)) hist(ratio_dist_sea[ratio_dist_sea<4],100,main="ratio dist. sea") abline(v=2,col="red") mtext(paste(round(sum(ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd),2)*100,"% out"),col="red") hist(diffsource,100,main="ratio dist source",100) abline(v=-20,col="red") abline(v=20,col="red") mtext(paste(round(sum(ccm_bd$pbdiffsource,na.rm=TRUE)/nrow(ccm_bd),3)*100,"% out"),col="red") # plots g<-ggplot(ccm_bd) breaks=as.vector(c(1, 2, 5) %o% 10^(-1:3)) g+geom_point(aes(x=dist_source,y=dist_source_ccm,colour=pbdiffsource),alpha=0.5,size=0.3)+ geom_abline(slope=1.3,colour="blue",lty=2)+ geom_abline(slope=0.7,colour="blue",lty=2)+ geom_hline(yintercept=100,colour="green")+ geom_vline(xintercept=100,colour="green")+ #scale_x_log10(name="distance source bd carthage (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000))+ #scale_y_log10(name="distance source ccm (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000)) + geom_abline(slope=1,colour="brown") ### same with a log scale breaks=as.vector(c(1, 2, 5) %o% 10^(-1:3)) g1<-g+geom_point(aes(x=dist_source,y=dist_source_ccm,colour=pbdiffsource,alpha=0.5,size=0.3))+ geom_hline(yintercept=2,colour="purple")+ geom_vline(xintercept=2,colour="purple")+ scale_x_log10(name="distance source bd carthage (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000))+ scale_y_log10(name="distance source ccm (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000)) + geom_abline(slope=1,colour="brown") ccm_bd1<-ccm_bd[ccm_bd$pbstrahler,] g1+geom_point(aes(x=dist_source,y=dist_source_ccm),size=2,data=ccm_bd1,pch=1) g+geom_point(aes(dist_sea,dist_sea_ccm,colour=pb_ratio_dist_sea))+geom_abline(slope=1,colour="red") +geom_abline(slope=1,intercept=1,colour="green") #g+geom_point(aes(dist_sea,dist_sea_ccm,colour=pbdiffsource))+geom_abline(slope=1,colour="red") +geom_abline(slope=1,intercept=1,colour="green") g+geom_jitter(aes(x=strahler,y=strahler_ccm,colour=pbstrahler),alpha=0.3) sum(ccm_bd$pbdiffsource,na.rm=TRUE)/nrow(ccm_bd) # 12 % 7901 sum(ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # 0.0022 148 sum(ccm_bd$pbstrahler,na.rm=TRUE)/nrow(ccm_bd) #2% 1702 #sum(!ccm_bd$pbdiffsource&!ccm_bd$pbstrahler&!ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # 64 % si on garde strahler # finally I don't trust there is a strahler rank pb sum(!ccm_bd$pbdiffsource&!ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # on garde 64% des données # histograms
5. query to select the final data from geobs2010.bdmap_ccm2
select count(*) from geobs2010.bdmap_ccm2; --13695 this one is with repeated projection on ccm riversegments within 300 m select count(*) from geobs2010.bdmap_ccm; --8884 this is without selection (first treatment) for the riversegement corresponding to the lowest proj dist /* REQUEST WITH SELECTION OF CORRECT CRITERIA FOR dist SEA, dist SOURCE, (STRAHLER RANK not taken) */ select * from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0) as sub WHERE (dist_source_ratio is TRUE OR dist_source_ratio IS NULL) AND dist_sea_ratio is TRUE --AND strahler_diff is TRUE; --9837 lines this is with repeated stations but selection has removed ~ 3000 lines /* FINAL REQUEST */ drop table if exists geobs2010.bdmap_ccm_final; create table geobs2010.bdmap_ccm_final as ( select distinct on(st_codecsp) st_codecsp, gid, id_trhyd, dist_source_bdcar, dist_source_ccm, dist_sea_bdcar, dist_sea_ccm, strahler_bdcar, strahler_ccm, distproj_bdcar, min(distproj_ccm) as distproj_ccm, the_geom from( select * from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0 ) as sub WHERE (dist_source_ratio is TRUE or dist_source_ratio is NULL) AND dist_sea_ratio is TRUE --AND strahler_diff is TRUE ) as sub1 group by st_codecsp, gid,id_trhyd,dist_source_bdcar,dist_source_ccm,dist_sea_bdcar,dist_sea_ccm,strahler_bdcar,strahler_ccm,distproj_bdcar, distproj_ccm, the_geom order by st_codecsp ) ; /* * ANALYSIS OF WITHDRAWN NUMBERS */ select count(*) from geobs2010.roe_bd_carthage where dist_source=0 ;-- 0 select count(*) from geobs2010.roe_bd_carthage where dist_sea=0; -- 5 -- voir comment on gère ces cas particuliers par la suite.... select sum(cast(not(dist_sea_ratio) as integer)) as sum_pb_sea, sum(cast(not(dist_source_ratio) as integer)) as sum_pb_source, sum(cast((dist_source_ratio is NULL) as integer)) as sum_pb_source_NULL, sum(cast(not(strahler_diff) as integer)) as sum_pb_strahler from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0 ) as sub; -- 12 (R=12), 1814-169 NULLS (R=2027 including zero), 457 (R=457) OK /* * TABLE TO SHOW THE RESULTS IN A MAP FOR THOSE THAT WERE NOT SELECTED TOO */ drop table if exists geobs2010.bdmap_ccm_full; create table geobs2010.bdmap_ccm_full as ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0);
v0.1 There is a problem of distance source too short for bd_carthage this trouble was already seen on the first figure on this page
Reference for the qgis project for Cédric c/eda/bdmpa/projection_bdmap.qgs En normandie, pas mal de segments manquants.. Pourquoi ? Le ratio distance source est null, l'autoriser Problèmes de calculs bd_carthage sur les rivières en réseau...
10896 station sp2=> 8884 projetées à 300m => 6896 après projection final Après examen détaillé il s'agit plus de calculs de distances sources que de réels problèmes de projection. A vérifier sur les pb de rang de strahler....