wiki:CookBook join BDMAP_CCM

Version 43 (modified by cedric, 15 years ago) (diff)

--

How to join the BDMAP and CCM

back to CookBook Eda ..

  1. Load the BDMAP database, it will be saved by Laurent using a Talend work
  2. load the codier_csp table (it is also necessary for R treatments),the following code will create the table and copy it from the csv file (here located in base) I had to make a utf8 file
    • attachment:ticket:20:BDMAP_CODIER.csv
    • ticket #20
      create schema csp;
      drop table if exists csp.codier;
      create table csp.codier ("cd_id" integer,
      
          "cd_fc_id" integer,
          "cd_code" varchar(30),
          "cd_libc" varchar(30),
          "cd_libl" text,
          "cd_chaine1" varchar(60),
          "cd_chaine2" varchar(30),
          "cd_num1" integer,
          "cd_num2" integer,
          "cd_date1" timestamp,
          "cd_date2" timestamp,
          "cd_dt_cre"timestamp,
          "cd_dt_maj" timestamp,
          "cd_qi_maj" varchar(20),
          CONSTRAINT c_pk_cd_id PRIMARY KEY (cd_id)
          );
      
      copy csp.codier from 'c:/base/BDMAP_CODIER.csv'
      WITH DELIMITER ';'
          CSV HEADER ;
      
  3. Import the shape file with correctly located electrofishing stations
    • here we set the directory where shapes are stored
      cd C:\Documents and Settings\cedric\Mes documents\Migrateur\eda\BDMAP\station arcgis
      
    • we use shp2pgsql to create sql file, see also CookBook shptopostgres for more details and links on the procedure. Gdal does not work as easily so just stick to this recipe.
      C:\"Program Files"\PostgreSQL\8.4\bin\shp2pgsql -s 3035 bdmapv2_vf2 stationsp > stationsp.sql
      
    • We have to convert it into utf8 (as the database is posgis), in notepad, open the stationsp.sql file and convert it to utf8

If you don't have notepad, download it at http://sourceforge.net/projects/notepad-plus/files/notepad%2B%2B%20releases%20binary/

Encodage>convertir en utf8

And save the file as : stationsputf8

  • then we use psql to restore the file while connecting to BDMAP ... change the port if necessary (celine : 5432)
    C:\"Program Files"\PostgreSQL\8.4\bin\psql -d BDMAP -h localhost -U postgres -p 5433 -f stationsputf8.sql 
    
    notes
  • use SRID 27572 if you want to keep the file in the current (LambertII) projection
  • also I'm using 5433 port and you probably use 5432...
  • this was ticket : #21

Press F5 on BDMAP in pgAdminIII, the table stationsp has been created in BDMAP-Schémas-public-Tables (not in BDMAP-Schémas-bdmap-tables ? is it normal ?) > stationsp (20 Colonnes and 4 contraintes)

  1. Now we will join the new station table (with wrong geom but new data) to the old table stationsp ( which has been properly projected by Hélène)
    • We first create a table using the AS SELECT command, it is populated with colums from station, and with columns from stationsp2, note the CASE WHEN syntax, which allow to use station.the_geom when stationsp.the_geom is not present in the table. So in practise we use all the old coordinates plus the new ones which have not been verified. Also noteworthy in the script is the CREATE TABLE ... WITH OIDS. The OIDS are mandatory if you want to display the table in Qgis.
      DROP TABLE IF EXISTS stationsp2;
      CREATE TABLE stationsp2 WITH OIDS AS(
        select 
        station.st_altitude,
        station.st_abcisse,
        station.st_codecsp,
        station.st_codesei,
        station.st_datearret,
        station.st_datecreation,
        station.st_distancesource,
        station.st_distancemer,
        station.st_finalite,
        station.st_imageign,
        station.st_imagedept,
        station.st_lieudit,
        station.st_limites,
        station.st_localisation,
        station.st_longueur,
        station.st_moduleia,
        station.st_cd_naturecourseau,
        station.st_ordonnee,
        station.st_penteign,
        station.st_pkaval,
        station.st_raisremp,
        station.st_sbv,
        station.st_t_janvier,
        station.st_t_juillet,
        station.st_cd_typecourseau,
        station.st_cd_tet,
        station.st_st_id,
        station.st_cm_id,
        station.st_cx_id,
        station.st_th_id,
        station.st_eh_id,
        station.st_uh_id,
        station.st_dt_cre,
        station.st_dt_maj,
        station.st_qi_maj,
        station.st_masseeau,
        stationsp.x,
        stationsp.y,
        stationsp.fnode_,
        stationsp.tnode_,
        stationsp.id_trhyd,
        stationsp.st_id,
        CASE WHEN stationsp.the_geom IS NULL THEN station.the_geom
             ELSE stationsp.the_geom
             END AS the_geom
        FROM stationsp right join bdmap.station on stationsp.st_codecsp=station.st_codecsp)
      COMMENT ON TABLE stationsp2 is 'table BDMAP extraite janvier 2010 et mise à jour avec les coordonnées reprojetées par Hélène'
      


A table has been created in BDMAP-Schémas-public-Tables > stationsp2 (43Colonnes)

  • ah, and now if you want to display the new stations using Qgis, adding a further column saying whether or not the column X is filled is usefull, the result is displayed there : [screenshot:4],
    ALTER TABLE stationsp2 ADD COLUMN newstation boolean;
    UPDATE stationsp2 set newstation=TRUE where x is null;
    UPDATE stationsp2 set newstation=FALSE where x is not null;
    
  • at this point if you have followed the instruction, you will be in the wrong projection if you want to display both the station and stationsp2 layer, you can change this :
    update stationsp2 set the_geom=ST_Transform(the_geom, 27572) ;
    
  • now some results ...
    select count(*) from stationsp2 --10203
    select count(*) from bdmap.station --10203 
    select count(*) from stationsp --8741
    select * from stationsp2 limit 1000 -- this is if you want to have a look at the data
    
  • those are the wrong stations [screenshot:5]
    select * from stationsp2 where st_codecsp IN ('03890162','062B0086','01620140','03270062','03270061','03270063','03270064','062160126','06420055','06070241')
    
    notes
  • this was ticket : #22
  • if you want to drop those stations
    delete from stationsp2 where st_codecsp IN ('03890162','062B0086','01620140','03270062','03270061','03270063','03270064','062160126','06420055','06070241')
    
  1. Changing database

Now we end up with the right file but it is not possible (or say quite complicated) to run a query between databases. So we will save the table and restore it to the new database using dump and psql Save the file to stationsp2.sql

CD where\you\want\to\place\your\table
C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump  -U postgres -p 5433 -t stationsp2 BDMAP> stationsp2.sql

Restore the file in the CCM_EDA database

C:\base>C:\"Program Files"\PostgreSQL\8.4\bin\psql -d CCM_EDA -p 5433 -U postgres -f stationsp2.sql


Don't forget to change the port into 5432 (celine)

  1. Joining the riversegments layer with the stationsp2 layer
  • first create two index to speed up queries

riversegments comes from the CCM_EDA database, so put the sql code in the sql consol : CCM_EDA sur postgres@localhost

CREATE INDEX indexriversegments ON riversegments
  USING GIST ( the_geom GIST_GEOMETRY_OPS );
CREATE INDEX indexStationsp2 ON stationsp2
  USING GIST ( the_geom GIST_GEOMETRY_OPS );
  • if not rightly set, change your geom
    update stationsp2 set the_geom=ST_Transform(the_geom,3035) ;
    
  • the query below can be run in separated parts (for instance you can run the inner SELECT statement to understand the result). This requests creates a table correspondance with 3 columns. the min(gid) is here as I have not found a way arround, but note that min(gid)=gid as there is always one segment for which the distance is minimal.
  • below the method for a bull DO NOT USE IT !

~~
DROP TABLE IF EXISTS correspondance;
CREATE TABLE correspondance as (
	SELECT distinct on (sub.st_codecsp) sub.st_codecsp , gid, sub.distance FROM (-- la clause distinct on car 27 /9299 lignes répétées (deux gid avec meme distance)
	SELECT  st_codecsp, min(distance) as distance FROM (
		SELECT st_codecsp, gid ,CAST(distance(r.the_geom, s.the_geom)as  decimal(15,1)) as distance
		FROM stationsp2 As s
		INNER JOIN  riversegments r ON ST_DWithin(r.the_geom, s.the_geom,1000)
		WHERE s.the_geom IS NOT NULL) AS c1 -- table correspondance
	GROUP BY (st_codecsp)) AS sub
	JOIN (SELECT st_codecsp, gid ,CAST(distance(r.the_geom, s.the_geom)as  decimal(15,1)) as distance
		FROM stationsp2 As s
		INNER JOIN  riversegments r ON ST_DWithin(r.the_geom, s.the_geom,1000)
		WHERE s.the_geom IS NOT NULL) AS c2
	 ON (sub.distance = c2.distance AND sub.st_codecsp=c2.st_codecsp) -- double jointure sur correspondance pour récupérer les gid
);~~

Here the right request. Yhe mistake is that I used group by (X,Y) and one has to use group by X,Y

DROP TABLE IF EXISTS correspondance;
CREATE TABLE correspondance as (
	SELECT distinct on (sub.st_codecsp) sub.st_codecsp , gid, sub.distance FROM (-- la clause distinct on car 27 /9299 lignes répétées (deux gid avec meme distance)
		SELECT st_codecsp, gid ,CAST(distance(r.the_geom, s.the_geom)as  decimal(15,1)) as distance
		FROM stationsp2 As s
		INNER JOIN  riversegments r ON ST_DWithin(r.the_geom, s.the_geom,1000)
		WHERE s.the_geom IS NOT NULL) AS c1 -- table correspondance
	GROUP BY st_codecsp, gid) 

);