GBKN NEN1878 to Oracle Loader

June 5, 2013    GBKN GEO GIS NEN1878 SDO_GEOMETRY

In this post I presented NEN1878Reader, a Javascript (browser, NodeJS compatible) to parse a NEN1878 file. A NEN1878 file contains GBKN data, which is topographical data. Topographical data is often in GIS-applications as a background layer to provide context to the user. For example, if a water-pipe runs through a street, you would like to (instantly) see on which side of the street the water-pipe is to be found.

The GBKN data comes in two formats: NEN1878 and Autocad DXF. Some programs can import and/or read Autocad DXF. Most other programs cannot import or read NEN1878 without custom modules. The NEN1878 format is a custom format which is - as far as I know - only found in The Netherlands. Thus, the data has to be converted to be useful for most applications.

The NEN1878 format is a very simple format:

  • The data contains several objects;
  • An object is either a point object, a line object, or a semantic object (text);
  • The format is in plain text;
  • There are several types of records:
    • 4 types of records are meta-data for the file;
    • 3 record types contain the data itself.
  • Each file contains records, where each record 64 bytes long (not including \r\n);
  • Some record types are split into sub-records;
  • An object can consist of several records.

See this NEN1878 format description document (dutch) for a description of the format.

In the past, I have written the gbkn_oracle_loader, written in Javascript/NodeJS. It uses the NEN1878Reader library to read GBKN data in NEN1878 format and write Oracle Spatial loader format. The GBKN data in Oracle Spatial loader format can be imported into Oracle using the sqlldr command. The sqlldr is efficient and achieves high import rates and is a great tool to import complete datasets into Oracle.

On a modern workstation, converting and loading 18GB of NEN1878 data takes just a few hours. I have seen other programs take several days with multiple computers to handle that amount of data.

To use it, make sure you have a working NodeJS. The NodeJS version used during development is 0.6 and 0.8. Your results may vary with NodeJS 0.10. Globally, these are the steps:

  1. Run generate_init_sql.js, this will create a script to create all the tables and sptial-meta data in Oracle.
  2. Run generate_post_sql.js, this will create a script to remove all invalid geometries and create indexes for Oracle Spatial.
  3. Run gbkn_oracle_loader.js with all the NEN1878 input files as arguments. For example: gbkn_oracle_loader.js WEESP_N__7001.NEN. This will create several sqlldr files, each file will become a table in Oracle.

After all the files are converted, perform the following steps:

  1. Run the output/init.sql file.
  2. Run the output/load_all.bat file. This will start sqlldr for all the sqlldr datafiles in the output/ directory.
  3. Run the output/post.sql file.

Now all your data should be read into Oracle in several tables. Review the mappings in the mappings/ directory for the mapping of the GBKN types to Oracle types and other GBKN properties.

The program in itself might not be useful to you directly. It does contain code to convert a geometry to an SDO_GEOMETRY, which might be useful to you. Otherwise, the sqlldr-approach might also be useful in several other situations..