Storage Structure

We mentioned that the Postgresql architecture consists of 3 main components,
memory, process and storage (Database Cluster). We talked about memory and
processes. Now it’s time for storage.
While the term cluster is mostly used in high availability and disaster recovery solutions
such as active-active cluster, and active-passive cluster, it has a different usage in
Postgresql.

In Postgresql, a database Cluster means a single instance-managed database stack.
In terms of a file structure, it is the only folder where all data is stored. It is called data
directory or data area.

When the Postgresql database is installed from pre-existing packages with apt or yum:

Installing Postgresql software from source code

# Step 1 Install required packages
apt-get -y install build-essential gcc libreadline-dev zlib1g-dev

# Step 2 Create the user
useradd postgres

# Step 3 Create folders for the binary and grant the access
mkdir -p /pg/13.4
chown -R postgres:postgres /pg

# Step 4 Create directory for data
mkdir -p /pgdata/13.4
chown postgres:postgres -R /pgdata/13.4

# Step 5 Download source codes
wget https://ftp.postgresql.org/pub/source/v13.4/postgresql-13.4.tar.gz
tar xvfz postgresql-13.4.tar.gz

# Sent output to /tmp/config.txt to check the installation
options.
cd postgresql-13.4
./configure --help > /tmp/config.txt

# Step 6
./configure --prefix=/pg/13.4
make
make install

# Note: To receive traces from Dtrace and processes, configure
with the following parameters.
./configure --enable-debug --enable-dtrace --enable-cassert
prefix=/pgdata/13.4

# Step 7
cd contrib
make
make install
ls -l /pg/13.4
total 12
                             bin
                             include
                             lib
                             share

ls -l /pgdata/13.4
total 0

So far, only the software has been installed.
It can also be called “Software only” installation.
Binary and executable (executable) files are in /pg/13.4 directory.
The /pgdata/13.4 directory is currently empty.

Creating a database cluster with initdb

# Database cluster is created in the /pgdata/13.4 folder.
su - postgres

/pg/13.4/bin/initdb -D /pgdata/13.4 -E unicode

# Start the instance
/pg/13.4/bin/pg_ctl -D /pgdata/13.4 -l pg.log start

postgres@srv2:~$ sudo netstat -ntlp
tcp 0   0 0.0.0.0:5432   0.0.0.0:*   LISTEN   2450/postgres

You can create as many database clusters and run instances as you want with the
condition of running in different folders and different TCP ports. Here, hardware is the
limit.
The log file is specified with option “-l”. In the example, the logs will be written to
pg.log when the instance is starting.

postgres@srv2:~$ ls -l /pgdata/13.4/
base          pg_serial
global        pg_snapshots
log           pg_stat
pg_commit_ts  pg_stat_tmp
pg_dynshmem   pg_subtrans
pg_hba.conf   pg_tblspc
pg_ident.conf pg_twophase
pg.log        PG_VERSION
pg_logical    pg_wal
pg_multixact  pg_xact
pg_notify     postgresql.auto.conf

Installation was made from source code on the srv2 machine.
We decided on binary and database cluster indexes.
Postgresql binary ($PGHOME): /pg/13.4/
Database cluster : /pgdata/13.4/
Configuration files : /pgdata/13.4/

When the installation is done with apt, paths etc. It comes with default settings.
As an example, let’s look at the cluster directory on the depsql01 machine.

Database cluster : /var/lib/postgresql/13/main
Configuration files : /etc/postgresql/13/main


  1. In step 1, we see database object ids with oid2name.
  2. In step 2, we see the contents of the PGDATA directory.
  3. In step 3, we see the folders corresponding to the database object ids.

Cluster Contents


Files in the Cluster


Block Size

root@srv2:~# dumpe2fs /dev/sda1 | grep -i 'block size'

dumpe2fs 1.44.5 (15-Dec-2018)
Block size: 4096
root@srv2:~# echo A > test.txt
root@srv2:~# du -sh test.txt
4.0K test.txt

postgres=# show block_size;
block_size
------------
8192 -----> 8KB
8192 -----> 8KB

Operating system block size
Operating systems divide the disk into blocks in order to read and write data.
The block size is determined when formatting the disk. (1024 bytes (1K), 2K, 4K, 8K, 16K…)
As can be seen in the example, the block size of the operating system is 4KB.
Although one character (i.e., 1 byte) is entered in the test.txt file, it takes up 4KB of
space.

Database block size
It can be as much as or multiples of the operating system block size.
In the example, the block size of the operating system is 4KB, and the block size of the
database can be powers of 2, such as 4KB, 8KB, 16KB.
The default value of database block_size is 8KB.
It is possible to change it with the –with-blocksize parameter during compilation.
It is a performance-enhancing factor that the rows in the tables do not exceed the
database block size.
In general, small block size provides performance increase for OLTP systems, while large
block sizes in OLP, that is, reporting databases, have a positive effect on performance

CREATE TABLE ekip(isim varchar(30), sirket varchar(50), per_no serial);
INSERT INTO ekip(isim,sirket) VALUES('Mehmed','Dataera');

testdb=# select * from ekip ;

isim | sirket | per_no
--------+---------+--------
Mehmed | Dataera | 1
(1 row)

testdb=# select pg_relation_filepath('ekip') ;

pg_relation_filepath
----------------------
base/16384/16403 ---> file where the table ekip is
 written
(1 row)

testdb=# \! du -sh /pgdata/13.4/base/16384/16403

8.0K  /pgdata/13.4/base/16384/16403

We connect to the testdb database with the psql -d testdb command and
create a table named ‘ekip’.
We are entering 14 characters of data into this table.
We find the location of the ekip table in the operating system.
With the last command we ran, we see how much space this table occupies on the
operating system.
Although 14 characters or 14 bytes of data are entered, we see that it takes up 8KB or
8192 bytes.


The ekip table is written to the file named 16403. This file continues to grow as data is
entered.

POSTGRESQL CLUSTER


In the picture on the left, a database created by the users, together with the default
databases, is located in the default tablespace.
In the second picture, the databases created by the users are located in the tablespace
named erp_tbs created by the users.
If the tablespace is not specified when creating databases, it creates it in
pg_default, which is the default tablespace.
If the tablespace is not specified while creating table, index, etc. objects, it is created in
the tablespace where the database is located.
While logically the tablespace is selected, this tablespace physically writes to a disk that
we specify. The separation of the logical and physical structure provides administrative
flexibility. If for any reason we move the tablespace to another disk, the application or
data is not affected.

Kategori seçin...