wiki:CookBook join BDMAP_CCM

Version 98 (modified by cedric, 15 years ago) (diff)

--

back to first page ..
back to Recipes for EDA CookBook Eda

How to join the BDMAP and CCM

1. Load the BDMAP database, it will be saved by Laurent using a Talend work

2. Load the codier_csp table (it is also necessary for R treatments),the following code will create the table and copy it from the csv file (here located in base)

I had to make a utf8 file

  • attachment:ticket:20:BDMAP_CODIER.csv
  • ticket #20
    create schema csp;
    drop table if exists csp.codier;
    create table csp.codier ("cd_id" integer,
    
        "cd_fc_id" integer,
        "cd_code" varchar(30),
        "cd_libc" varchar(30),
        "cd_libl" text,
        "cd_chaine1" varchar(60),
        "cd_chaine2" varchar(30),
        "cd_num1" integer,
        "cd_num2" integer,
        "cd_date1" timestamp,
        "cd_date2" timestamp,
        "cd_dt_cre"timestamp,
        "cd_dt_maj" timestamp,
        "cd_qi_maj" varchar(20),
        CONSTRAINT c_pk_cd_id PRIMARY KEY (cd_id)
        );
    
    copy csp.codier from 'c:/base/BDMAP_CODIER.csv'
    WITH DELIMITER ';'
        CSV HEADER ;
    

3. Import the shape file with correctly located electrofishing stations

  • here we set the directory where shapes are stored
    cd C:\Documents and Settings\cedric\Mes documents\Migrateur\eda\BDMAP\station arcgis
    
  • we use shp2pgsql to create sql file, see also CookBook shptopostgres for more details and links on the procedure. Gdal does not work as easily so just stick to this recipe. notes
  • use SRID 27572 if you want to keep Posgis informed of the projection you are in, basically it creates and populates the geometry tables with the specified SRID.
    so if you want to use the srid 27572 put :
    C:\"Program Files"\PostgreSQL\8.4\bin\shp2pgsql -s 27572 bdmapv2_vf2 stationsp > stationsp.sql
    
  • We have to convert it into utf8 (as the database is posgis), in notepad, open the stationsp.sql file and convert it to utf8

If you don't have notepad, download it at http://sourceforge.net/projects/notepad-plus/files/notepad%2B%2B%20releases%20binary/

Encodage>convertir en utf8

And save the file as : stationsputf8

  • then we use psql to restore the file while connecting to BDMAP ... change the port if necessary (celine : 5432)
    C:\"Program Files"\PostgreSQL\8.4\bin\psql -d BDMAP -h localhost -U postgres -p 5433 -f stationsputf8.sql 
    
  • also I'm using 5433 port and you probably use 5432...
  • this was ticket : #21

Press F5 on BDMAP in pgAdminIII, the table stationsp has been created in BDMAP-Schémas-public-Tables (not in BDMAP-Schémas-bdmap-tables ? is it normal ?) > stationsp (20 Colonnes and 4 contraintes)

SELECT ST_SRID(the_geom) from stationsp;

returns 27572... We could also have used (instead of -s in shp2psql)

update station set ST_SetSRID(the_geom, 27252); REM 27572

If shp2psql does not populate the geometry column table, you have to do it manually, see CookBook Postgis Geometry column section for more details

INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, "type")
SELECT '', 'public', 'stationsp', 'the_geom', ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom)
FROM stationsp LIMIT 1;

4. Joining tables

  1. Now we will join the new station table (with wrong geom but new data) to the old table stationsp ( which has been properly projected by Hélène)
    • We first create a table using the AS SELECT command, it is populated with colums from station, and with columns from stationsp2, note the CASE WHEN syntax, which allow to use station.the_geom when stationsp.the_geom is not present in the table. So in practise we use all the old coordinates plus the new ones which have not been verified. Also noteworthy in the script is the CREATE TABLE ... WITH OIDS. The OIDS are mandatory if you want to display the table in Qgis.
      DROP TABLE IF EXISTS stationsp2;
      CREATE TABLE stationsp2 WITH OIDS AS(
        select 
        station.st_altitude,
        station.st_abcisse,
        station.st_codecsp,
        station.st_codesei,
        station.st_datearret,
        station.st_datecreation,
        station.st_distancesource,
        station.st_distancemer,
        station.st_finalite,
        station.st_imageign,
        station.st_imagedept,
        station.st_lieudit,
        station.st_limites,
        station.st_localisation,
        station.st_longueur,
        station.st_moduleia,
        station.st_cd_naturecourseau,
        station.st_ordonnee,
        station.st_penteign,
        station.st_pkaval,
        station.st_raisremp,
        station.st_sbv,
        station.st_t_janvier,
        station.st_t_juillet,
        station.st_cd_typecourseau,
        station.st_cd_tet,
        station.st_st_id,
        station.st_cm_id,
        station.st_cx_id,
        station.st_th_id,
        station.st_eh_id,
        station.st_uh_id,
        station.st_dt_cre,
        station.st_dt_maj,
        station.st_qi_maj,
        station.st_masseeau,
        stationsp.x,
        stationsp.y,
        stationsp.fnode_,
        stationsp.tnode_,
        stationsp.id_trhyd,
        stationsp.st_id,
        CASE WHEN stationsp.the_geom IS NULL THEN station.the_geom
             ELSE stationsp.the_geom
             END AS the_geom
        FROM stationsp right join bdmap.station on stationsp.st_codecsp=station.st_codecsp);
      COMMENT ON TABLE stationsp2 is 'table BDMAP extraite janvier 2010 et mise à jour avec les coordonnées reprojetées par Hélène'
      


A table has been created in BDMAP-Schémas-public-Tables > stationsp2 (43Colonnes)

  • ah, and now if you want to display the new stations using Qgis, adding a further column saying whether or not the column X is filled is usefull, the result is displayed there : [screenshot:4],
    ALTER TABLE stationsp2 ADD COLUMN newstation boolean;
    UPDATE stationsp2 set newstation=TRUE where x is null;
    UPDATE stationsp2 set newstation=FALSE where x is not null;
    
    • now some results ...
      select count(*) from stationsp2 --10203
      select count(*) from bdmap.station --10203 
      select count(*) from stationsp --8741
      select * from stationsp2 limit 1000 -- this is if you want to have a look at the data
      
  • those are the wrong stations [screenshot:5]
    select * from stationsp2 where st_codecsp IN ('03890162','062B0086','01620140','03270062','03270061','03270063','03270064','062160126','06420055','06070241')
    
    notes
  • this was ticket : #22
  • if you want to drop those stations
    delete from stationsp2 where st_codecsp IN ('03890162','062B0086','01620140','03270062','03270061','03270063','03270064','062160126','06420055','06070241')
    

5. Changing database

Now we end up with the right file but it is not possible (or say quite complicated) to run a query between databases. So we will save the table and restore it to the new database using dump and psql Save the file to stationsp2.sql

REM Cédric =======================
CD where\you\want\to\place\your\table
C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump  -U postgres -p 5433 -t stationsp2 BDMAP> stationsp2.sql
REM Céline =======================
D:
cd D:\Celine Jouanin\BDMAP
C:\"Program Files"\PostgreSQL\8.4\bin\pg_dump  -U postgres -p 5432 -t stationsp2 BDMAP> stationsp2.sql

Restore the file in the ccm21_eda database, suppress the stationsp2 table if it already exists

REM Cédric =======================
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d ccm21_eda -p 5433 -U postgres -f stationsp2.sql
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "create schema bd_map"
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5433 -U postgres -c "ALTER TABLE stationsp2 set schema bd_map"
REM Céline =======================
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d ccm21_eda -p 5432 -U postgres -f stationsp2.sql
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5432 -U postgres -c "create schema bd_map"
C:\"Program Files"\PostgreSQL\8.4\bin\psql -d eda2.0 -p 5432 -U postgres -c "ALTER TABLE stationsp2 set schema bd_map"

Don't forget to change the port into 5432 (celine)
At this point if you have followed the instruction, you will be in the wrong projection if you want to display both the station and stationsp2 layer, you can change this :

C:\"Program Files"\PostgreSQL\8.4\bin\psql -d ccm21_eda -p 5433 -U postgres -c "update bd_map.stationsp2 set the_geom=ST_Transform(the_geom, 3035);"

to transform the zeros into 3035

C:\"Program Files"\PostgreSQL\8.4\bin\psql -d ccm21_eda -p 5433 -U postgres -c "update bd_map.stationsp2 set the_geom=  ST_SetSRID(the_geom, 3035);"

You will also have to populate the geometry_column table manually

INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, "type")
SELECT '', 'bd_map', 'stationsp2', 'the_geom', ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom)
FROM bd_map.stationsp2 LIMIT 1;
alter table bd_map.stationsp2 add CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
alter table bd_map.stationsp2 add  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL);
alter table bd_map.stationsp2 add  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 3035);

6. Joining the riversegments layer with the stationsp2 layer

  • first create two index to speed up queries

riversegments comes from the CCM_EDA database, so put the sql code in the sql consol : CCM_EDA sur postgres@localhost

CREATE INDEX indexriversegments ON ccm21.riversegments
  USING GIST ( the_geom GIST_GEOMETRY_OPS );
CREATE INDEX indexStationsp2 ON bd_map.stationsp2
  USING GIST ( the_geom GIST_GEOMETRY_OPS );

Joining the riversegment and stationsp2, the_geom is that of stationsp2

-- creation de la table correspondance, je rajoute un the_geom a la requete pour pouvoir voir la table dans Qgis  
DROP TABLE IF EXISTS bd_map.correspondance;
CREATE TABLE bd_map.correspondance as (
        SELECT distinct on (st_codecsp) st_codecsp, gid, min(distance) as distance, the_geom FROM (
               SELECT st_codecsp, gid ,CAST(distance(r.the_geom, s.the_geom) as  decimal(15,1)) as distance,s.the_geom 
               FROM bd_map.stationsp2 As s
               INNER JOIN  ccm21.riversegments r ON ST_DWithin(r.the_geom, s.the_geom,300)
               WHERE s.the_geom IS NOT NULL
               ORDER BY st_codecsp) AS sub 
        GROUP BY st_codecsp, distance,gid, the_geom
);
alter table bd_map.correspondance add column id serial;
-- mise à jour de la table geometry_columns
INSERT INTO geometry_columns(f_table_catalog, f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, "type")
SELECT '', 'bd_map', 'correspondance', 'the_geom', ST_CoordDim(the_geom), ST_SRID(the_geom), GeometryType(the_geom)
FROM bd_map.correspondance LIMIT 1;

-- creation d'index, clé primaire, et constraintes qui vont bien
alter table bd_map.correspondance add CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
alter table bd_map.correspondance add  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL);
alter table bd_map.correspondance add  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 3035);
alter table bd_map.correspondance ADD CONSTRAINT pk_id PRIMARY KEY(id);
CREATE INDEX indexcorrespondance ON bd_map.correspondance
  USING GIST ( the_geom GIST_GEOMETRY_OPS );

select count(*) from bd_map.correspondance ;--9272 -- 8884 a 300m
select count(*) from bd_map.stationsp2 ;--10203
select 100*9272 /10203::float
select 100*8884 /10203::float--87.07
alter table bd_map.correspondance add column id serial;

Below in red old stations not in bd_map.correspondance, green new stations without correspondance, purple correspondance successfull, the map is centered on the Vilaine... source:data/Docs/trac/jointure_ccm_eda.jpg