back to first page [..][[BR]] back to Recipes for EDA ["CookBook Eda"] = How to join the CCM and ROE database and avoid projection pb, this is version 2 of the script = [[PageOutline]] see Joining ROE and the CCM layer ["CookBook join ROE_CCM"][[BR]]for the first version. [[BR]] The trouble is that some dams or electrofishing stations are sometimes badly projected [[BR]] We should start at step 6 [[BR]] == 1. putting bd_carthage into postgis == hylcov_arc and hylcov_node are created in a database bd_carthage see [wiki:"CookBook join BDMAP_CCM v2"] [[BR]] == 2. getting distance sea and distance source for bd_carthage == This was also done , though imperfectly (some dist source are missing and there are differerences between ccm and bd_carthage) [wiki:"CookBook join BDMAP_CCM v2"] [[BR]] Finally we have a table called ''' hylcov_arc_dist2 ''' where distance_source are included from Cédric calculataions (noeuds), distance_sea correspond to eda1.0 and strahler correpond to Laurent calculations. == 3. Get distance into the roe table === 31. Saving table geobs2010.obstacle_referentiel from bd_carthage to eda2.0 === {{{ cd C:\eda\backup C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump -U postgres -p 5433 -t geobs2010.obstacle_referentiel eda2.0> obstacle_referentiel.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d bd_carthage -p 5433 -U postgres -f obstacle_referentiel.sql REM une erreur quand il essaye de mettre le schéma bd_map mais là on a pas besoin d'avoir les données dans un schéma et d'autres liées à l'absence des autres tables..; Voir par la suite si ce n'est pas handicapant mais normalement non car on ne sort qu'une table faisant la jointure... }}} at this stage it is usefull to check the geometry column table and correct it eventually === 32. project stationsp2 again and get table of all projections within 300 === Some stations were missing... * first create two index to speed up queries {{{ C:\"Program Files"\PostgreSQL\8.4\bin\psql -d bd_carthage -p 5433 -U postgres -c"CREATE INDEX indexobstacle_referentiel ON obstacle_referentiel USING GIST(ref_position_etrs89 GIST_GEOMETRY_OPS); }}} {{{ #!sql -- Here we project dams on the bd_carthage -- The result is stored in table roe_bd_carthage -- I'm changing ref_position_etrs89 => the_geom DROP TABLE IF EXISTS roe_bd_carthage; CREATE TABLE roe_bd_carthage as ( SELECT distinct on (ref_id) ref_id, dist_source,dist_sea,strahler,id_trhyd, min(distance) as distance, the_geom FROM ( SELECT ob.ref_id, n.distance_mer as dist_sea,n.strahler ,d.id_trhyd ,d.dist_source_max as dist_source, CAST(distance(d.the_geom, ob.ref_position_etrs89) as decimal(15,1)) as distance,ob.ref_position_etrs89 as the_geom FROM obstacle_referentiel As ob INNER JOIN hylcov_arc_dist2 d ON ST_DWithin(d.the_geom, ob.the_geom,300) LEFT JOIN noeud_troncon_final n ON d.id_som_f = n.id_bdcarthage -- pour récupérer les strahler ORDER BY ob.ref_id) as sub GROUP BY ref_id, distance, dist_source,dist_sea,strahler, id_trhyd, the_geom ); alter table roe_bd_carthage add column id serial; -- mise à jour de la table geometry_columns INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, "type") SELECT '', 'public', 'roe_bd_carthage', 'the_geom', ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom) FROM roe_bd_carthage LIMIT 1; -- creation d'index, clé primaire, et constraintes qui vont bien alter table roe_bd_carthage add CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2); alter table roe_bd_carthage add CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL); alter table roe_bd_carthage add CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 3035); alter table roe_bd_carthage ADD CONSTRAINT pk_id_roe_bd_carthage PRIMARY KEY(id); CREATE INDEX indexroe_bd_carthage ON roe_bd_carthage USING GIST ( the_geom GIST_GEOMETRY_OPS ); }}} === 33. Saving back table stationsp2 from bd_carthage to eda === {{{ cd C:\eda\backup C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump -U postgres -p 5433 -t roe_bd_carthage bd_carthage> roe_bd_carthage.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "drop table if exists geobs2010.roe_bd_carthage" C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -f roe_bd_carthage.sql C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "alter table roe_bd_carthage set schema geobs2010" }}} == 4. define criteria helping to do the projection == -- the table geobs2010.roe_ccm_300 was created in ["CookBook join ROE_CCM"] but we need to add some columns {{{ -- launch in eda2.0 DROP TABLE IF EXISTS geobs2010.roe_ccm_300_2; CREATE TABLE geobs2010.roe_ccm_300_2 as ( SELECT ref_id, r.cum_len_sea as dist_sea_ccm, r.distance_source as dist_source_ccm, r.strahler as strahler_ccm,gid ,CAST(distance(r.the_geom, s.ref_position_etrs89) as decimal(15,1)) as distance,s.ref_position_etrs89 FROM geobs2010.obstacle_referentiel As s INNER JOIN ccm21.riversegments r ON ST_DWithin(r.the_geom, s.ref_position_etrs89,300) ORDER BY ref_id ); alter table geobs2010.roe_ccm_300_2 add column id serial; }}} {{{ #!sql select * from geobs2010.roe_bd_carthage b full outer join geobs2010.roe_ccm_300_2 c on b.ref_id=c.ref_id; }}} The table from the request joining geobs2010.roe_bd_carthage with the geobs2010.roe_ccm_300 is exported there (use .csv exort to C:\Documents and Settings\cedric\Mes documents\Migrateur\programmes\workspace3.5\EDAdata\dataEDAccm\export_jointure_roe_ccm.csv[[BR]] source:"data/dataEDAccm/export_jointure_roe_ccm.csv" [[BR]] Some processing with R and some results in carto_trial.R [[BR]] We are trying to see if distance_source distance_sea and strahler are consistent between roe and ccm {{{ #--------------------------------------------------------------------------------------------- #--------------------------------------------------------------------------------------------- # analyse des relations entre distance source et distance mer pour les projections sur la ccm # II cas des barrages #--------------------------------------------------------------------------------------------- #--------------------------------------------------------------------------------------------- source("EDACCM/init.r") library(lattice) library(ggplot2) ccm_bd<-read.csv(paste(datawd,"/dataEDAccm/export_jointure_roe_ccm.csv",sep=""),sep=";",stringsAsFactors = FALSE) str(ccm_bd) ccm_bd$dist_source_ccm=ccm_bd$dist_source_ccm/1000 ccm_bd$dist_sea_ccm=ccm_bd$dist_sea_ccm/1000 ccm_bd$pbstrahler=ccm_bd$strahler_ccm>(ccm_bd$strahler+1) # which strahler are larger in ccm than in bd_carthage, allowing diff of one ratio_dist_sea=ccm_bd$dist_sea_ccm/ccm_bd$dist_sea ccm_bd$pb_ratio_dist_sea<-ratio_dist_sea>2&ccm_bd$dist_sea_ccm>100 # la distance mer est le tripe pour la ccm (peut arriver pour les faibles valeurs) layout(matrix(c(1,2), 2, 1, byrow = TRUE)) hist(ccm_bd$dist_source,100) hist(ccm_bd$dist_source,100,ylim=c(0,50)) diffsource<-ccm_bd$dist_source_ccm-ccm_bd$dist_source ratiosource<-ccm_bd$dist_source_ccm/ccm_bd$dist_source ccm_bd$pbdiffsource<-!((diffsource>-20&diffsource< 20)|(ratiosource>0.7&ratiosource<1.3)|ccm_bd$dist_source>100&ccm_bd$dist_source_ccm>100) layout(matrix(c(1,2), 1, 2, byrow = TRUE)) hist(ratio_dist_sea[ratio_dist_sea<4],100,main="ratio dist. sea") abline(v=2,col="red") mtext(paste(round(sum(ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd),2)*100,"% out"),col="red") hist(diffsource,100,main="ratio dist source",100) abline(v=-20,col="red") abline(v=20,col="red") mtext(paste(round(sum(ccm_bd$pbdiffsource,na.rm=TRUE)/nrow(ccm_bd),3)*100,"% out"),col="red") # plots g<-ggplot(ccm_bd) breaks=as.vector(c(1, 2, 5) %o% 10^(-1:3)) g+geom_point(aes(x=dist_source,y=dist_source_ccm,colour=pbdiffsource),alpha=0.5,size=0.3)+ geom_abline(slope=1.3,colour="blue",lty=2)+ geom_abline(slope=0.7,colour="blue",lty=2)+ geom_hline(yintercept=100,colour="green")+ geom_vline(xintercept=100,colour="green")+ #scale_x_log10(name="distance source bd carthage (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000))+ #scale_y_log10(name="distance source ccm (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000)) + geom_abline(slope=1,colour="brown") ### same with a log scale breaks=as.vector(c(1, 2, 5) %o% 10^(-1:3)) g1<-g+geom_point(aes(x=dist_source,y=dist_source_ccm,colour=pbdiffsource,alpha=0.5,size=0.3))+ geom_hline(yintercept=2,colour="purple")+ geom_vline(xintercept=2,colour="purple")+ scale_x_log10(name="distance source bd carthage (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000))+ scale_y_log10(name="distance source ccm (log scale)",breaks = breaks, labels = breaks,limits=c(1,1000)) + geom_abline(slope=1,colour="brown") ccm_bd1<-ccm_bd[ccm_bd$pbstrahler,] g1+geom_point(aes(x=dist_source,y=dist_source_ccm),size=2,data=ccm_bd1,pch=1) g+geom_point(aes(dist_sea,dist_sea_ccm,colour=pb_ratio_dist_sea))+geom_abline(slope=1,colour="red") +geom_abline(slope=1,intercept=1,colour="green") #g+geom_point(aes(dist_sea,dist_sea_ccm,colour=pbdiffsource))+geom_abline(slope=1,colour="red") +geom_abline(slope=1,intercept=1,colour="green") g+geom_jitter(aes(x=strahler,y=strahler_ccm,colour=pbstrahler),alpha=0.3) sum(ccm_bd$pbdiffsource,na.rm=TRUE)/nrow(ccm_bd) # 12 % 7901 sum(ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # 0.0022 148 sum(ccm_bd$pbstrahler,na.rm=TRUE)/nrow(ccm_bd) #2% 1702 #sum(!ccm_bd$pbdiffsource&!ccm_bd$pbstrahler&!ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # 64 % si on garde strahler # finally I don't trust there is a strahler rank pb sum(!ccm_bd$pbdiffsource&!ccm_bd$pb_ratio_dist_sea,na.rm=TRUE)/nrow(ccm_bd) # on garde 64% des données # histograms }}} == 5. query to select the final data from geobs2010.bdmap_ccm2 == {{{ #!sql select count(*) from geobs2010.bdmap_ccm2; --13695 this one is with repeated projection on ccm riversegments within 300 m select count(*) from geobs2010.bdmap_ccm; --8884 this is without selection (first treatment) for the riversegement corresponding to the lowest proj dist /* REQUEST WITH SELECTION OF CORRECT CRITERIA FOR dist SEA, dist SOURCE, (STRAHLER RANK not taken) */ select * from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0) as sub WHERE (dist_source_ratio is TRUE OR dist_source_ratio IS NULL) AND dist_sea_ratio is TRUE --AND strahler_diff is TRUE; --9837 lines this is with repeated stations but selection has removed ~ 3000 lines /* FINAL REQUEST */ drop table if exists geobs2010.bdmap_ccm_final; create table geobs2010.bdmap_ccm_final as ( select distinct on(st_codecsp) st_codecsp, gid, id_trhyd, dist_source_bdcar, dist_source_ccm, dist_sea_bdcar, dist_sea_ccm, strahler_bdcar, strahler_ccm, distproj_bdcar, min(distproj_ccm) as distproj_ccm, the_geom from( select * from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0 ) as sub WHERE (dist_source_ratio is TRUE or dist_source_ratio is NULL) AND dist_sea_ratio is TRUE --AND strahler_diff is TRUE ) as sub1 group by st_codecsp, gid,id_trhyd,dist_source_bdcar,dist_source_ccm,dist_sea_bdcar,dist_sea_ccm,strahler_bdcar,strahler_ccm,distproj_bdcar, distproj_ccm, the_geom order by st_codecsp ) ; /* * ANALYSIS OF WITHDRAWN NUMBERS */ select count(*) from geobs2010.roe_bd_carthage where dist_source=0 ;-- 0 select count(*) from geobs2010.roe_bd_carthage where dist_sea=0; -- 5 -- voir comment on gère ces cas particuliers par la suite.... select sum(cast(not(dist_sea_ratio) as integer)) as sum_pb_sea, sum(cast(not(dist_source_ratio) as integer)) as sum_pb_source, sum(cast((dist_source_ratio is NULL) as integer)) as sum_pb_source_NULL, sum(cast(not(strahler_diff) as integer)) as sum_pb_strahler from ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0 ) as sub; -- 12 (R=12), 1814-169 NULLS (R=2027 including zero), 457 (R=457) OK /* * TABLE TO SHOW THE RESULTS IN A MAP FOR THOSE THAT WERE NOT SELECTED TOO */ drop table if exists geobs2010.bdmap_ccm_full; create table geobs2010.bdmap_ccm_full as ( select b.st_codecsp, b.dist_source as dist_source_bdcar, c.dist_source_ccm/1000 as dist_source_ccm, (((c.dist_source_ccm/(1000*b.dist_source))<1.3) AND (c.dist_source_ccm/(1000*b.dist_source))>0.7) OR ((c.dist_source_ccm/1000-b.dist_source)> -20 AND (c.dist_source_ccm/1000-b.dist_source)<20) AS dist_source_ratio, b.dist_sea as dist_sea_bdcar, c.dist_sea_ccm/1000 as dist_sea_ccm, NOT((c.dist_sea_ccm/(1000*b.dist_sea))>2 and c.dist_sea_ccm>100000) as dist_sea_ratio, b.strahler as strahler_bdcar, c.strahler_ccm as strahler_ccm, (c.strahler_ccm-b.strahler)<=1 as strahler_diff, b.distance as distproj_bdcar, c.distance as distproj_ccm, id_trhyd, gid, c.the_geom from geobs2010.roe_bd_carthage b join geobs2010.bdmap_ccm2 c on b.st_codecsp=c.st_codecsp where b.dist_sea>0); }}} v0.1 There is a problem of distance source too short for bd_carthage this trouble was already seen on the first figure on this page [[BR]] [[Image(source:data/Docs/trac/BDMAP/stations_selection_1.jpg,700px)]] [[BR]] Reference for the qgis project for Cédric ''' c/eda/bdmpa/projection_bdmap.qgs ''' En normandie, pas mal de segments manquants.. Pourquoi ? Le ratio distance source est null, l'autoriser Problèmes de calculs bd_carthage sur les rivières en réseau... 10896 station sp2=> 8884 projetées à 300m => 6896 après projection final Après examen détaillé il s'agit plus de calculs de distances sources que de réels problèmes de projection. A vérifier sur les pb de rang de strahler....