Exploring PostgreSQL with Docker

This is the second part of a series looking at how easy Docker makes it to explore and experiment with open source software. Last time we looked at Redis, and that gave us the opportunity to see the docker run and docker exec commands in action.

Today we're going to look at PostgreSQL, which will give us an opportunity to see Docker volumes in action.

You can follow along with the commands in this tutorial if you have Docker installed. If you're running Docker for Windows, switch it to Linux containers mode. Another great option is Play with Docker, which lets us run all these commands in the browser.

Start a new container running PostgreSQL

We'll use docker run to start a new container from the official postgres image, giving it the name postgres1 and publishing port 5432 (the PostgreSQL default) to the host. We're running in detached mode (-d), so the container runs in the background.

But we're also going to mount a volume (with -v), which will be used to store the database we create. The volume name will be postgres-data, and Docker will automatically create it (just using storage on the Docker host's local disk) if a volume with this name doesn't already exist.

PostgreSQL stores its data in /var/lib/postgresql/data, so we're mounting our volume to that path.

docker run -d -p 5432:5432 -v postgres-data:/var/lib/postgresql/data --name postgres1 postgres

Once we've done this we can check it's running with

docker ps

And view the log output with

docker logs postgres1
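If the logs show "Database is uninitialized and superuser password is not specified", the image is refusing to initialize a new database without a password or an auth method (a reader reports exactly this in the comments below). As a sketch for a throwaway local experiment only, the function below removes the failed container and reruns the same command with POSTGRES_HOST_AUTH_METHOD=trust, the setting quoted in that error message. The function name restart_with_trust is my own invention for illustration.

```shell
# Hypothetical helper: remove the failed container, then rerun the original
# command with one extra -e flag. "trust" allows connections without a
# password, so this is only suitable for a local experiment.
# Usage: restart_with_trust postgres1 postgres-data
restart_with_trust() {
  docker rm -f "$1" &&
  docker run -d -p 5432:5432 \
    -e POSTGRES_HOST_AUTH_METHOD=trust \
    -v "$2:/var/lib/postgresql/data" \
    --name "$1" postgres
}
```

Calling restart_with_trust postgres1 postgres-data should leave us with a running postgres1 container that the rest of this walkthrough can use without password prompts.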

Create a database

Next we'll create a database. One easy way to do that is to use docker exec to launch an interactive shell inside our postgres1 container, which has the PostgreSQL CLI tools installed. This saves us from having to install any PostgreSQL client tools locally.

docker exec -it postgres1 sh

Inside that shell we can ask it to create a new database with the name mydb.

# createdb -U postgres mydb

Then let's launch psql, the PostgreSQL command-line client, connected to our mydb database:

# psql -U postgres mydb
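The intermediate shell is optional, because docker exec can run the client tools directly. Here is a minimal sketch of that idea; the function names create_db and open_psql are hypothetical wrappers of my own, not part of Docker or PostgreSQL.

```shell
# Hypothetical helpers that skip the intermediate sh session.

# Create a database inside a running container.
# Usage: create_db postgres1 mydb
create_db() {
  docker exec "$1" createdb -U postgres "$2"
}

# Open an interactive psql session (-it keeps the terminal attached).
# Usage: open_psql postgres1 mydb
open_psql() {
  docker exec -it "$1" psql -U postgres "$2"
}
```

With these, open_psql postgres1 mydb drops us straight at the mydb=# prompt, and \q returns us to the host shell.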

Explore the database

Now inside psql, let's run some basic commands. \l lists the databases. We'll also ask for the database version, and the current date:

mydb=# \l
mydb=# select version();
mydb=# select current_date;

Now let's do something a bit more interesting. We'll create a table:

mydb=# CREATE TABLE people (id int, name varchar(80));
CREATE TABLE

Then we'll insert a row into the table:

mydb=# INSERT INTO people (id,name) VALUES (1, 'Mark');
INSERT 0 1

And finally, check it's there

mydb=# SELECT * FROM people;
 id | name 
----+------
  1 | Mark
(1 row)

Now we can quit from psql with \q and exit from our shell

mydb=# \q 
# exit

Of course our postgres1 container is still running.
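Since the container is still running, we can also run a single SQL statement from the host without opening an interactive session, using psql's -c option. This is just a sketch, and run_sql is a hypothetical helper name.

```shell
# Hypothetical helper: run one SQL statement in a container and print the result.
# Usage: run_sql postgres1 mydb "SELECT * FROM people;"
run_sql() {
  # -d selects the database, -c runs the given command and then exits
  docker exec "$1" psql -U postgres -d "$2" -c "$3"
}
```

The SQL is passed as one quoted argument, so statements containing spaces or asterisks reach psql unchanged.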

Stop and restart the container

Let's prove that we don't lose the data if we stop and restart the container.

docker stop postgres1
docker start postgres1
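docker start returns once the container has been launched, which can be slightly before PostgreSQL is ready to accept connections. As a rough sketch, we could poll pg_isready (one of the client tools included in the image) until it succeeds. The function name wait_for_postgres and the limit of 30 attempts are arbitrary choices of mine.

```shell
# Hypothetical helper: poll until PostgreSQL in the named container accepts
# connections, giving up after 30 attempts (roughly 30 seconds).
# Usage: wait_for_postgres postgres1
wait_for_postgres() {
  container="$1"
  attempts=0
  # pg_isready exits with status 0 once the server is accepting connections
  until docker exec "$container" pg_isready -U postgres >/dev/null 2>&1; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 30 ]; then
      echo "gave up waiting for $container" >&2
      return 1
    fi
    sleep 1
  done
  echo "$container is ready"
}
```

Running wait_for_postgres postgres1 before connecting avoids the occasional "connection refused" straight after a restart.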

And rather than connect again to this container, let's test from another linked container, using the same technique for linking containers we saw in our Redis demo.

docker run -it --rm --link postgres1:pg --name client1 postgres sh

Launch psql, but this time use -h to connect to the other container, which we gave the alias pg in our link configuration:

# psql -U postgres -h pg mydb

Now from this client1 container we can access data in the database stored in the postgres1 container:

mydb=# SELECT * FROM people;
 id | name 
----+------
  1 | Mark
(1 row)

Now we can quit from psql and exit from our shell. This also removes the client1 container, since the --rm flag tells Docker to delete the container when the command it was running exits.

mydb=# \q 
# exit
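Docker's documentation describes --link as a legacy feature, and a user-defined network is the usual replacement. On such a network, containers can reach each other by container name, so no alias is needed. The sketch below assumes a network name of our choosing (pgnet in the usage line) that doesn't exist yet; psql_via_network is a hypothetical helper name.

```shell
# Hypothetical helper: create a user-defined network, attach the running
# database container to it, then start a throwaway psql client on the same
# network. The chain stops at the first failing step (for example if the
# network already exists).
# Usage: psql_via_network pgnet postgres1 mydb
psql_via_network() {
  docker network create "$1" &&
  docker network connect "$1" "$2" &&
  docker run -it --rm --network "$1" postgres psql -U postgres -h "$2" "$3"
}
```

Here the client connects with -h postgres1, the container's own name, rather than a link alias.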

Inspect the volume

We can find out information about the volume that we've created with docker volume inspect, including where on our local disk the data in that volume is being stored. Here's some typical output.

$ docker volume inspect postgres-data
[
    {
        "CreatedAt": "2018-09-03T19:50:23Z",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/postgres-data/_data",
        "Name": "postgres-data",
        "Options": null,
        "Scope": "local"
    }
]
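If we only want one field from that JSON, docker volume inspect accepts a --format option with a Go template. A small sketch, where volume_mountpoint is a hypothetical helper name:

```shell
# Hypothetical helper: print only the Mountpoint field of a volume.
# Usage: volume_mountpoint postgres-data
volume_mountpoint() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

For the volume above this would print /var/lib/docker/volumes/postgres-data/_data.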

And if we take a look inside the local folder on our Docker host, we can see all the data that has been stored in that volume.

$ ls /var/lib/docker/volumes/postgres-data/_data/
PG_VERSION            pg_multixact          pg_tblspc
base                  pg_notify             pg_twophase
global                pg_replslot           pg_wal
pg_commit_ts          pg_serial             pg_xact
pg_dynshmem           pg_snapshots          postgresql.auto.conf
pg_hba.conf           pg_stat               postgresql.conf
pg_ident.conf         pg_stat_tmp           postmaster.opts
pg_logical            pg_subtrans           postmaster.pid
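With Docker Desktop on Windows or Mac, that mount point lives inside Docker's Linux virtual machine, so the path may not be browsable from the host. One workaround, sketched here, is to mount the volume read-only into a throwaway container and list it from there. The use of the alpine image and the /data mount path are my own assumptions; any small image with ls would do.

```shell
# Hypothetical helper: list a volume's contents from a temporary container.
# :ro mounts the volume read-only so nothing can be modified by accident.
# Usage: list_volume_files postgres-data
list_volume_files() {
  docker run --rm -v "$1:/data:ro" alpine ls /data
}
```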

Of course, a Docker volume doesn't have to be stored on the Docker host's local disk. In a production environment like Azure, you'd most likely mount an Azure file share as a volume.

Discard the container but keep the data

Let's stop and remove the postgres1 container with a single command (-f forces Docker to remove a running container). Because the data is stored in a volume, it remains safe.

docker rm -f postgres1
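To convince ourselves the volume really did survive, we can rely on docker volume inspect returning a non-zero exit status when the named volume doesn't exist. A minimal sketch, with volume_exists as a hypothetical helper name:

```shell
# Hypothetical helper: succeed (exit status 0) only if the named volume exists.
# Usage: volume_exists postgres-data && echo "volume is still there"
volume_exists() {
  docker volume inspect "$1" >/dev/null 2>&1
}
```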

Attach an existing volume to a new container

Let's now start up a brand new container called postgres2 but attach the existing postgres-data volume that contains our database:

docker run -d -p 5432:5432 -v postgres-data:/var/lib/postgresql/data --name postgres2 postgres

Once it starts up, let's run a psql session inside it and check that the database, table and data are still all present and correct:

docker exec -it postgres2 sh
# psql -U postgres mydb
mydb=# SELECT * FROM people;
 id | name 
----+------
  1 | Mark
(1 row)

And exit out again:

mydb=# \q
# exit

Clean up everything

And now, let's really clean up. Not only will we remove the postgres2 container, but we'll also remove the postgres-data volume. This time the contents of the database are deleted as well.

docker rm -f postgres2
docker volume rm postgres-data

Summary

As you can see, not only is it easy to use Docker to explore PostgreSQL, we can also easily configure a volume allowing the lifetime of the data to be managed independently of the lifetime of the container. If we'd wanted to, we could also have connected directly to this PostgreSQL container on port 5432 and used it for some local development.
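For that local development scenario, a client on the host would point at localhost on the published port. As a sketch, the function below builds a PostgreSQL connection URI for a given database. It assumes the postgres superuser and port 5432 used throughout this post, and the name pg_url is hypothetical. A password would also be needed unless the server was started with trust authentication.

```shell
# Hypothetical helper: build a connection URI for a database on the
# container's published port (assumes user postgres, localhost:5432).
# Usage: psql "$(pg_url mydb)"   (requires psql installed on the host)
pg_url() {
  echo "postgresql://postgres@localhost:5432/$1"
}
```

The same URI format is accepted by psql and by most PostgreSQL client libraries.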

Next up, we'll explore running Elasticsearch in a container, which will give us an opportunity to see docker-compose in action.

Comments

February 25. 2019 23:18

Excellent Docker PostgreSQL article, it is complete and exactly what I was looking for. Thank you for sharing, I wish all articles were this well written. I can now ditch my local install and use docker for doing dev work!

HappyHacker

December 24. 2020 10:39

Indeed excellent article!
I have just tried and everything works as prescribed!
The only thing that failed was creating the postgres1 container: the command
$ docker run -d -p 5432:5432 -v postgres-data:/var/lib/postgresql/data --name postgres1 postgres
did not start the container right away.
The reason, shown by docker logs postgres1, was
Error: Database is uninitialized and superuser password is not specified. You must specify POSTGRES_PASSWORD to a non-empty value for the superuser. For example, "-e POSTGRES_PASSWORD=password" on "docker run". You may also use "POSTGRES_HOST_AUTH_METHOD=trust" to allow all connections without a password. This is *not* recommended. See PostgreSQL documentation about "trust": https://www.postgresql.org/docs/current/auth-trust.html
So the fix for this command was
docker run -e POSTGRES_HOST_AUTH_METHOD=trust -d -p 5432:5432 -v postgres-data:/var/lib/postgresql/data --name postgres1 postgres
and then everything worked flawlessly. Perhaps it depends on postgresql version
Thank you!

Oleksandr Kharchenko

December 24. 2020 10:47

thanks for sharing the solution to get this working

Mark Heath