Data Migration

Many methods and applications can be used when moving data from PostgreSQL
to PostgreSQL, or from other databases to PostgreSQL. The most important criteria
when choosing one are the data size and the allowable downtime. A method or
application that is easy to implement but slow cannot be used in an environment with little downtime.
The options should be evaluated one by one with test migrations, and the total time
should be determined by recording how long each step takes.

pg_dump

pg_dump is a handy command-line tool that exports the data and DDL of database
objects in script/text and archive formats.
A script/text-format dump can be imported with psql, and an archive-format dump
can be imported with pg_restore.
It uses COPY to export the data, but with the --inserts parameter the data
can be written as INSERT statements instead. This is slow compared to the COPY
method, but it is useful if the dump will be imported into a database other than
PostgreSQL.

# Help output
pg_dump --help
# Detailed help output
man pg_dump
# Writes the CREATE statements and the data to the testdb.sql text file.
postgres@srv1:~$ pg_dump testdb > /backup/pgdump/testdb.sql
# It also compresses while dumping.
pg_dump testdb | gzip -c > /backup/pgdump/testdb.sql.gz

pg_dump can dump a single database. If you want to dump the entire cluster or global
objects, pg_dumpall should be used.
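
For example, the cluster-wide objects alone can be exported with pg_dumpall and replayed on the target before restoring individual pg_dump files; a minimal sketch:

# Export only global objects (roles and tablespaces).
pg_dumpall --globals-only > /backup/pgdump/globals.sql
# Replay them on the target cluster.
psql -f /backup/pgdump/globals.sql postgres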

pg_dump produces a snapshot of the database as of the moment it was started. If the dump is taken at 03:00
at night and you have a problem at 17:00 that forces you to restore from the dump, you
can only restore back to 03:00 at night.
So, there will be a 14-hour data loss. Although it is used for backups, pg_dump is not
a backup tool. Dumps are restored in recovery scenarios, but mostly the purpose is metadata
backup and archiving.

Depending on the IOPS capacity, performance problems may occur for databases of
500 GB and above.
The dump can also be taken from remote machines over the network.
With pg_dump, individual schemas and tables can be selected and exported.

Note: Pay attention to the size of the database being dumped and the capacity of the
target directory/disk (df -h). If the dump is written to the same directory/disk as the
database and takes up all the space on the disk, it will stop the database.
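
A quick pre-check along those lines, assuming the dump target is /backup/pgdump:

# Size of the database to be dumped.
psql -d testdb -c "SELECT pg_size_pretty(pg_database_size('testdb'));"
# Free space on the target disk.
df -h /backup/pgdump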

Compressed Archive Format dump

pg_dump -h 10.71.6.22 -p 5434 -U mehmed -F c -b -v -f /../testdb.dmp testdb

From port 5434 of machine 10.71.6.22,
dumps the testdb database
in compressed archive (custom) format (-F c),
in verbose mode (-v, detailed logging),
as user mehmed.

Create Database Syntax

# -C adds the CREATE DATABASE statement.
pg_dump -h 10.71.6.22 -p 5434 -U mehmed -C -F p -b -v -f /backup/pgdump/testdb.sql testdb

From port 5434 of machine 10.71.6.22,
dumps the testdb database including its CREATE DATABASE script (-C),
in text format (-F p),
in verbose mode,
as user mehmed.

pg_dumpall

# Dump
pg_dumpall > dumpfile_name.dmp
# Restore
psql -f dumpfile_name.dmp postgres

pg_dump can export only a single database; it cannot dump cluster-wide objects
such as roles and tablespaces.
pg_dumpall can export the entire cluster: all databases, roles, and tablespaces.
Note: If there are tablespaces other than the default located in different
directories, these directories must be created on the target beforehand and the
necessary privileges must be granted.

List Dump Content

pg_restore --list testdb.dmp
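
The list output can also drive a selective restore: save it to a file, comment out unwanted entries with a leading ;, and pass the edited list back with -L (--use-list). A sketch:

# Save the table of contents, edit it, then restore only the remaining entries.
pg_restore --list testdb.dmp > testdb.list
pg_restore -L testdb.list -d testdb testdb.dmp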

Schema dump

# Dumps erp_schema and ik_schema in compressed custom format (-F c).
pg_dump -h 10.71.6.22 -p 5434 -U mehmed -F c -b -v -n erp_schema -n ik_schema -f erp_ik.dmp testdb
# Dumps erp_schema and ik_schema as text (-F p).
pg_dump -h 10.71.6.22 -p 5434 -U mehmed -F p -b -v -n erp_schema -n ik_schema -f erp_ik.sql testdb

From port 5434 of machine 10.71.6.22,
dumps the erp_schema and ik_schema schemas from the testdb database.

Parallelism (-Fd -j x)

pg_dump -h 10.71.6.22 -p 5434 -U mehmed -Fd -b -v -j 4 -f testdb.dir testdb

Parallelism is only available in directory format. It creates a directory called testdb.dir
and writes the files into it.
The optimum parallelism level can be decided according to the number of CPU cores
and the disk IOPS capacity.
While dumping, the process can be monitored with Linux performance tools such as
top or htop.
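
The same -j parallelism also works on the restore side for directory and custom format dumps; a minimal sketch, assuming a target database named restoredb already exists:

# Restore the directory-format dump with 4 parallel jobs.
pg_restore -d restoredb -j 4 -v testdb.dir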

Custom Format (-Fc)

pg_dump -h 10.71.6.22 -p 5434 -U mehmed -Fc -b -v -f testdb.dmp testdb

The entire database can be dumped in custom format (-Fc). With pg_restore, only
the desired objects can then be imported into the target database.
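
A sketch of both uses, assuming an existing target database named targetdb and an illustrative table name sometable (neither is from the example above):

# Restore the entire custom-format dump...
pg_restore -d targetdb -v testdb.dmp
# ...or only a single table from it.
pg_restore -d targetdb -t sometable testdb.dmp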

Metadata Dump

pg_dump -Fp --section pre-data --section post-data -f predata__testdb.sql testdb
# or
pg_dump -Fp --schema-only -f schema-only_testdb_nodata.sql testdb
# Note: --schema and --schema-only are different parameters.
# With the CREATE DATABASE statement (-C)
pg_dump -Fp --schema-only -C -f schema-only_testdb_nodata.sql testdb

This exports only the CREATE scripts as text, without data.
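
The complementary operation, exporting only the data without any DDL, can be done with --data-only (similar to --section data); a sketch:

# Data only, no CREATE scripts.
pg_dump -Fp --data-only -f data-only_testdb.sql testdb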

Split

pg_dump database_name | split -b 30G --filter='gzip > $FILE.gz'

This splits and compresses the dump into 30 GB chunks. It can be particularly useful
for avoiding operating system file-size limits and for dividing files into parts when
sending them over long distances.
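
To restore, the compressed chunks can be concatenated back in order; a minimal sketch assuming split's default output names (xaa.gz, xab.gz, ...):

# gzip decompresses concatenated members in sequence, so the
# chunks can be streamed straight into psql.
cat x*.gz | gunzip | psql -d database_name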

pg_restore

pg_restore imports dump files taken with pg_dump in archive format.

Example 1: import/export in archive format

Scenario: testdb is dumped in archive format and imported into another database
named restorearsivtestdb.

Step 1

Testdb is dumped.

# The database named testdb is dumped in archive format.
cd /backup/pgdump/
pg_dump -Fc -b -v -f testdb.dmp testdb

Step 2

A new database named "restorearsivtestdb" is created to restore the dump
file.
To create the same database as the source:
- The "create script" can be obtained with pgAdmin.
- You can dump without data using pg_dump with the -C parameter
(pg_dump -Fp --schema-only -C -f schema-only_testdb_nodata.sql testdb).

# The CREATE DATABASE script is taken from schema-only_testdb_nodata.sql.
CREATE DATABASE testdb WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';
# The database name is changed.
CREATE DATABASE restorearsivtestdb WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';
postgres@srv1: psql
CREATE DATABASE restorearsivtestdb WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';

Step 3

The restore process is started.

pg_restore --dbname=restorearsivtestdb --jobs=4 --verbose testdb.dmp
postgres@srv1:/backup/pgdump$ psql -d restorearsivtestdb
restorearsivtestdb=# \dt
List of relations
Schema | Name | Type | Owner
--------+------------------+-------+----------
.....
public | staff | table | postgres
public | store | table | postgres


Example 2: export/import in script/text format

Scenario: testdb is dumped in script/text format and imported into another database
named restorescripttestdb.

Step 1

Testdb is dumped in script/text format.

# The database named testdb is dumped in script/text format.
cd /backup/pgdump/
pg_dump -F p -b -v -f testdb.sql testdb

Step 2

A new database named “restorescripttestdb” is created to restore the dump file.

CREATE DATABASE restorescripttestdb WITH TEMPLATE = template0
ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';
CREATE DATABASE

Step 3

The restore process is started.

postgres@srv1:/backup/pgdump$ psql -d restorescripttestdb
restorescripttestdb=# \dt
Did not find any relations.
restorescripttestdb=# \i testdb.sql
restorescripttestdb=# \dt
List of relations
Schema | Name | Type | Owner
--------+------------------+-------+----------
.....
public | staff | table | postgres
public | store | table | postgres
# If the list does not appear:
set search_path="$user", public ;

or

psql --set ON_ERROR_STOP=on restorescripttestdb < testdb.sql

ON_ERROR_STOP=on : stops the run when an error occurs (the pg_restore
equivalent is --exit-on-error).
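
Relatedly, psql's --single-transaction option wraps the whole script in one transaction, so together with ON_ERROR_STOP a failure rolls everything back and leaves the target database unchanged; a sketch:

psql --single-transaction --set ON_ERROR_STOP=on -d restorescripttestdb -f testdb.sql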

Example 3: schema export/import

Scenario: The ik_schema schema in the testdb database is copied to the ik_schematestdb database.

# ik_schematestdb database is created.
CREATE DATABASE ik_schematestdb WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';
# ik_schema in testdb is copied to database ik_schematestdb.
pg_dump -d testdb -n ik_schema | psql -d ik_schematestdb

Note: If the data is large, network and IOPS performance values should be monitored.
If it is executed on the same machine, there may be a performance bottleneck as it will
both read and write.

Another method

# Take dump of ik_schema in testdb.
pg_dump -F c -b -v -n ik_schema -f ik_schema.dmp testdb
# ik_schematestdb database is created.
CREATE DATABASE ik_schematestdb WITH TEMPLATE = template0 ENCODING = 'UTF8' LOCALE = 'en_US.UTF-8';
# The ik_schema is created in the ik_schematestdb database.
psql -d ik_schematestdb
CREATE SCHEMA IF NOT EXISTS ik_schema AUTHORIZATION ik_user;
# The ik_schema is imported into the ik_schema schema in the ik_schematestdb database.
pg_restore -d ik_schematestdb -n ik_schema ik_schema.dmp
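
The result can then be verified from psql:

# Check that the schema and its tables arrived.
psql -d ik_schematestdb -c '\dn'
psql -d ik_schematestdb -c '\dt ik_schema.*'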

Example 4: Export/import of a single table

Scenario: A table named birtablo is created in the testdb database.
It is then exported and imported into another database.

# Create a table in the testdb database and enter data.
postgres@srv1:/backup/pgdump$ psql -d testdb
testdb=# CREATE TABLE birtablo (id serial, name text);
INSERT INTO birtablo (name) SELECT 'Dataera' FROM generate_series(1, 5);

# We get the dump of a table from the testdb database.
pg_dump -d testdb -t birtablo --schema=public > birtablo.sql
# We import the table to the tdb database.
postgres@srv1:/backup/pgdump$ psql -d tdb
tdb=# \i birtablo.sql
tdb=# select count(*) from birtablo ;
count
-------
5
# If no query results are returned, it may be necessary to set the search_path.
tdb=# set search_path="$user", public ;

Another method

pg_dump -d testdb -v -Fc -t birtablo --schema=public > birtablo.dmp
pg_restore -d tdb -v < birtablo.dmp

Example 5: Importing a single table from a full dump file

Scenario: A full dump of the tdb database, which has 17 tables, is taken.
The table named actor is dropped.
Only this actor table is imported from the full dump.

# Take a dump of the entire database.
pg_dump -d tdb -v -Fc -f tdb.tar
# The actor table is dropped.
postgres@srv1:/backup/pgdump$ psql -d tdb
tdb=# drop table actor cascade ;
# Only the actor table is imported.
pg_restore -v -d tdb -t actor < tdb.tar
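
The restore can then be verified by counting the rows of the recovered table:

psql -d tdb -c "select count(*) from actor;"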