Postgresql Docker Image with Seeded Data
This post was originally published on my previous blog jer-k.github.io
Published on 2019-03-11
Recently, I decided that one of my goals for 2019 was to familiarize myself more with Docker. I've been exposed to using Docker for the past couple of years, but I don't use it on a day-to-day basis. Every once in a while, I would need to update a Dockerfile or a script and I would realize I needed to brush up on mostly everything because it had been so long since the last time I looked at anything Docker related. I decided I would just dive in and read a book to familiarize myself with any concepts I had glossed over before so I started reading Learn Docker – Fundamentals of Docker 18.x. It was during a tutorial where some seeded data was needed in a Postgresql database that I was had a bit of an aha moment. I can build images that have data in them already?! I thought to myself; this could actually really help out on local development if I had a copy of a production database.
I thought I would put together a quick little tutorial on how you can create a Postgresql Docker image with seeded data that anyone could use.
To start off, I created a new Rails application and generated a migration that created 100 users. You can find the code
for that application here (if you want to follow along using that database, simply replace instances of
my_database_name
in this article with postgres_data_development
). Once the users are in the database, pg_dump
can be used to create the file needed to seed the database in our image.
The -O -x
flags tell pg_dump
to have no owner and no privileges so that the data can be imported into a new database
without worrying about user accounts. You can see the generated .sql
file from my example project here.
Generating a .sql
file will work, but imagine a database much larger than the 100 users I created. A good alternative
would be to use gzip to compress the file and reduce the Docker image size.
With the compressed database, it’s time to start building the Dockerfile.
That’s it! As of writing the latest version of postgres 10 is 10.6-alpine
. Simply COPY
the compressed database
into the docker-entrypoint-initdb.d
directory and then the Postgresql base image understands to unzip and initialize
the database with the dump file. The only other thing needed is to set the environment variables so that there is a user
to access the database with.
Build the image using the -t
flag to name it so that it can be referenced it when running a container.
Now run the image using the -d
flag to run it in detached mode. The last argument postgres
is the command to
start the database.
To ensure everything worked properly, there should be 100 users in the database when queried.
Success! We created a Docker image with seeded data that anyone could use. Also, don't forget to stop the container that was started earlier!