Load CSV files into the database
Recently, I wanted to display the places of residence of the members of my club on a map for a personal gimmick (IGOC members). I knew the addresses of the club members. But not the coordinates of their places of residence.
So I went in search of the coordinates and found what I was looking for at the Federal Office of Topography (swisstopo).
The data is available there as a CSV file. Details here: Swiss town coordinates.
How do I load this data into a database?
Loading the data with MariaDB/MySQL
MariaDB and MySQL have the LOAD DATA INFILE command:
SQL> DROP TABLE IF EXISTS wgs84;
SQL> -- SET GLOBAL local_infile = ON; -- Only needed with MySQL
SQL> CREATE TABLE wgs84 (
ortschaftsname VARCHAR(32)
, plz4 SMALLINT
, zusatzziffer SMALLINT
, zip_id SMALLINT UNSIGNED
, gemeindename VARCHAR(32)
, bfs_nr SMALLINT
, kantonskuerzel CHAR(2)
, adressenanteil varchar(8)
, e DOUBLE
, n DOUBLE
, sprache VARCHAR(8)
, validity VARCHAR(12)
);
SQL> -- TRUNCATE TABLE wgs84;
SQL> LOAD DATA LOCAL INFILE '/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv'
INTO TABLE wgs84
FIELDS TERMINATED BY ';'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
;
Query OK, 5713 rows affected
Records: 5713 Deleted: 0 Skipped: 0 Warnings: 0
You can then query the data in the database:
SQL> SELECT * FROM wgs84 ORDER BY ortschaftsname LIMIT 5;
+----------------+------+--------------+--------+--------------+--------+----------------+----------------+-------------------+--------------------+---------+------------+
| ortschaftsname | plz4 | zusatzziffer | zip_id | gemeindename | bfs_nr | kantonskuerzel | adressenanteil | e | n | sprache | validity |
+----------------+------+--------------+--------+--------------+--------+----------------+----------------+-------------------+--------------------+---------+------------+
| Aadorf | 8355 | 0 | 4672 | Aadorf | 4551 | TG | 96.802 % | 8.903193007810433 | 47.491079014637265 | de | 2008-07-01 |
| Aadorf | 8355 | 0 | 4672 | Elgg | 294 | ZH | 3.198 % | 8.89206766645808 | 47.4933781685032 | de | 2008-07-01 |
| Aarau | 5000 | 0 | 2913 | Aarau | 4001 | AG | 99.713 % | 8.048148371736266 | 47.38973523857376 | de | 2008-07-01 |
| Aarau | 5000 | 0 | 2913 | Suhr | 4012 | AG | 0.287 % | 8.059410934099922 | 47.383298214804334 | de | 2008-07-01 |
| Aarau | 5004 | 0 | 2932 | Aarau | 4001 | AG | 100 % | 8.060698546432551 | 47.400587704180744 | de | 2008-07-01 |
+----------------+------+--------------+--------+--------------+--------+----------------+----------------+-------------------+--------------------+---------+------------+
5 rows in set
Or something more precise:
SQL> SELECT ortschaftsname AS city, plz4 AS city_code, e AS lon, n AS lat
FROM wgs84 WHERE plz4 IN (8280, 4663, 6043);
+-------------+-----------+-------------------+--------------------+
| city | city_code | lon | lat |
+-------------+-----------+-------------------+--------------------+
| Aarburg | 4663 | 7.904271716719409 | 47.321443418782955 |
| Aarburg | 4663 | 7.889249714098425 | 47.313536073562474 |
| Aarburg | 4663 | 7.880309179095798 | 47.31255194439023 |
| Adligenswil | 6043 | 8.364849060491428 | 47.07037816052481 |
| Kreuzlingen | 8280 | 9.173740257895282 | 47.64491046067056 |
| Kreuzlingen | 8280 | 9.159171428030783 | 47.654149879509134 |
| Kreuzlingen | 8280 | 9.204470741840725 | 47.639949130372145 |
+-------------+-----------+-------------------+--------------------+
7 rows in set (0.003 sec)
I will leave it to the reader to clean out the duplicates… :-)
So far so good, now to the finer points::
Differences between MariaDB and MySQL
The procedure described above works perfectly with MariaDB 11.4 and 11.8. There are small differences with MySQL 8.4:
The first error message that prevents loading is this one:
ERROR 3948 (42000): Loading local data is disabled; this must be enabled on both the client and server sides
It can be bypassed relatively easily with the command:
SQL> SET GLOBAL local_infile = ON;
The next attempt will fail as follows:
ERROR 2068 (HY000): LOAD DATA LOCAL INFILE file request rejected due to restrictions on access.
This problem can be solved by starting the MySQL client as follows:
$ mysql --local-infile=1 --user=root test
Sources
- MariaDB: LOAD DATA INFILE
- MySQL: LOAD DATA Statement
Loading the data with PostgreSQL
PostgreSQL has the command COPY ... FROM:
postgres=# DROP TABLE IF EXISTS wgs84;
postgres=# CREATE TABLE wgs84 (
ortschaftsname VARCHAR(32)
, plz4 SMALLINT
, zusatzziffer SMALLINT
, zip_id INT
, gemeindename VARCHAR(32)
, bfs_nr SMALLINT
, kantonskuerzel CHAR(2)
, adressenanteil varchar(8)
, e DOUBLE PRECISION
, n DOUBLE PRECISION
, sprache VARCHAR(8)
, validity VARCHAR(12)
);
postgres=# -- TRUNCATE TABLE wgs84;
postgres=# COPY wgs84
FROM '/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv'
DELIMITER ';'
CSV HEADER
;
COPY 5713
Here, too, we receive the result as expected in the usual PostgreSQL form:
postgres=# SELECT ortschaftsname AS city, plz4 AS city_code, e AS lon, n AS lat
FROM wgs84 WHERE plz4 IN (8280, 4663, 6043);
city | city_code | lon | lat
-------------+-----------+-------------------+--------------------
Aarburg | 4663 | 7.904271716719409 | 47.321443418782955
Aarburg | 4663 | 7.889249714098425 | 47.313536073562474
Aarburg | 4663 | 7.880309179095798 | 47.31255194439023
Adligenswil | 6043 | 8.36487538940682 | 47.07037794822416
Kreuzlingen | 8280 | 9.173740257895282 | 47.64491046067056
Kreuzlingen | 8280 | 9.159171428030783 | 47.654149879509134
Kreuzlingen | 8280 | 9.204470741840725 | 47.639949130372145
(7 rows)
Sources
- PostgreSQL: COPY
Small differences between MariaDB/MySQL and PostgreSQL
Basically, the load command is completely different in the two database worlds.
With MariaDB and PostgreSQL, the commands run “out-of-the-box”. MySQL has two additional security hurdles built in here.
PostgreSQL does not recognise UNSIGNED integer data types, so the next largest data type (INT) must be used, which is a little less space-saving than with MariaDB/MySQL.
Remarks
When we did the same test a few days ago, there was still a loading error. So it seems that the data source has also changed slightly…
I have not found out quickly whether there is an SQL standard for these load commands and if so, whether MariaDB/MySQL or PostgreSQL are standard-compliant here.
And of course there are other ways to get your CSV data into the database…
The tools mariadb-import/mysqlimport are used if you want to do this from the command line. The CSV Storage Engine can also be misused for this purpose (see here for details). An officially supported variant is the MariaDB CONNECT Storage Engine with the CSV type (see here):
SQL> INSTALL SONAME 'ha_connect';
SQL> CREATE TABLE wgs84_fdw
ENGINE = CONNECT
table_type = CSV
file_name='/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv'
header = 1
sep_char = ';'
quoted = 0;
SQL> INSERT INTO wgs84 SELECT * FROM wgs84_fdw;
Unfortunately, it looks like the CONNECT Storage Engine will no longer be supported by MariaDB! And the mydumper/myloader tool also seems to be able to handle CSV files.
And of course the whole thing can also be solved using applications…
With PostgreSQL there are the following options:
postgres=# \copy wgs84 FROM '/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv' DELIMITER ';' CSV HEADER
then from the shell:
$ psql --user=dba -c "\copy wgs84 FROM '/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv' DELIMITER ';' CSV HEADER"
And the variant via the Foreign Data Wrapper (FWD). But I have not tried this:
postgres=# CREATE EXTENSION postgres_fdw;
postgres=# CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (
datasource 'CSV:/tmp/AMTOVZ_CSV_WGS84/AMTOVZ_CSV_WGS84.csv',
format 'CSV'
)
;
postgres=# CREATE USER MAPPING FOR local_user
SERVER foreign_server
OPTIONS (user 'foreign_user', password 'password')
;
postgres=# CREATE FOREIGN TABLE foreign_table (
id integer NOT NULL,
data text
)
SERVER foreign_server
OPTIONS (schema_name 'some_schema', table_name 'some_table')
;
This page was translated using deepl.com.

