This post is a long time coming, and it may be too late, but I wanted to note what I did in case it's useful for others. It describes how I prepared my use of matrix-synapse for the deprecations, first of earlier versions of Python, and then of PostgreSQL 9.5 and 9.6.
Prologue (TL;DR)
The long and the short of the actions is:
- Create a temporary matrix-synapse docker container to generate a set of clean configuration files:

```shell
docker run -it --rm \
  -v <empty-directory-on-docker-host>:/data \
  -e SYNAPSE_SERVER_NAME=<server-name> \
  -e SYNAPSE_REPORT_STATS=no \
  matrixdotorg/synapse:v${SYNAPSE_VERSION} generate
```
- Merge those config files with the config files from your old matrix-synapse server and place them into an empty directory to be mounted as the `/data/` directory in the container.
- Copy the media files over from the old location:

```shell
time sudo rsync -avz <old-media-store> <new-media-store>
time sudo chown -Rv 991:991 <new-media-store>
```

  Keep doing this for as long as the old service is running to make sure `<new-media-store>` is fully sync'd.
- Stop the old matrix-synapse service, then `rsync` and `chown` one last time.
- Create your new matrix-synapse container:

```shell
docker run -d \
  -v <data-directory-on-docker-host>:/data \
  matrixdotorg/synapse:v${SYNAPSE_VERSION}
```
  Add mounts, container name and container network settings to taste. This container will still be pointing to the old PostgreSQL database.
- Create a temporary PostgreSQL container to generate a set of configuration files:

```shell
docker run -d -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} \
  -v <postgres-data-directory>:/var/lib/postgresql/data \
  --name <postgres-container-name> \
  postgres:${POSTGRES_VERSION}
```

- Copy the config files out of the container's `/var/lib/postgresql/data/` directory.
- Merge the settings from the old instance into these new files.
- Stop the matrix-synapse container:

```shell
docker stop matrix-synapse
```

- Dump the data from the old database:

```shell
sudo -u postgres pg_dumpall > synapse.dump
```

- Copy the dump file into the new container:

```shell
docker cp synapse.dump <postgres-container-name>:/tmp
```

- Load the data into the new database:

```shell
docker exec -it <postgres-container-name> bash -l
# then, at the container prompt:
psql -U postgres < /tmp/synapse.dump
```
- Stop your new PostgreSQL container:

```shell
docker stop <postgres-container-name>
```

- Copy the new configuration files into the data directory (tricky; check below if you're struggling with this one).
- Start the container afresh:

```shell
docker rm <postgres-container-name>
docker run -d -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} \
  -v <postgres-data-directory>:/var/lib/postgresql/data \
  --name <postgres-container-name> \
  postgres:${POSTGRES_VERSION}
```
- Repoint your matrix-synapse container's `homeserver.yaml` to the new PostgreSQL container.
- Start your matrix-synapse container.
The Full Story
I run a matrix service using the software matrix-synapse and available at matrix.gibiris.org. When I first set it up in early 2019, it was on a VM running Debian Stretch (Debian 9). I used the Debian packages made available by the matrix-synapse project, and the initial setup used the SQLite database option.
After some time, when the SQLite database grew to about 1GiB, I followed the published instructions to migrate the data to PostgreSQL. Following my normal policy, I installed the version of PostgreSQL that was standard with Debian Stretch, version 9.6.
Towards the end of 2019, we were informed of the new matrix-synapse platform dependency deprecation policy, which was: as the dependent versions fall out of support by their own projects (e.g. Debian, Python, PostgreSQL, etc.), matrix-synapse will stop support for them, too.
The first victims were to be Python 3.5 and, because it didn't come with a later version of Python, Debian Stretch itself. Not long after that, PostgreSQL 9.6 was going to be for the chop, too.
I considered my options. I could have configured unofficial repos to get the later versions of the software onto Debian Stretch, or created a new VM with a more recent OS, etc. For various reasons, I decided I would containerise the lot. I had already containerised a couple of other services (commafeed and gitea), so I was growing familiar with the process and the benefits. For me, the benefit of containerising matrix-synapse was that I would not have to worry about the OS or Python versions again, as these would be built into the container itself. I knew PostgreSQL was going to be a different matter, but the same logic applied to it, too: it was going to be much easier to manage once containerised.
Having made that decision, I had a new problem: there were no easily accessible instructions available for my specific use-cases: to migrate matrix-synapse from a packaged, OS installation to a docker container, and to upgrade and migrate PostgreSQL from a packaged, OS installation to a docker container.
The following aims to fill that gap.

Some important items to note:
- These are based on my experience, and cover only my specific use-cases. I hope this helps in your efforts, but there's a chance that a subtle or not-so-subtle difference between my setups (original and target) and yours will make things a little more challenging for you. I can help if I know the answer to a question you may have, but I can't guarantee I will know the answer.
- I use docker, but I don't (yet) use docker compose. There are benefits to doing so, and when I get the head-space to look into it, I will migrate over to that. Therefore, what I describe below uses plain old docker commands. It still works.
- Initially, I had told myself that the version of PostgreSQL I was using was 9.5, support for which was to be deprecated in late 2020/early 2021. I got myself into a panic trying to do it all at once; when I then realised I didn't need to worry about it until later in 2021, I rather left it to the last minute. As a result, there's nearly 12 months between the matrix-synapse move and the PostgreSQL move.
🔗 Migrating matrix-synapse to a docker container
The following assumes you're migrating your OS-package-installed matrix-synapse from `${SERVER_A}` to a docker container running on `${SERVER_B}` (i.e. docker commands such as `docker run ...` will be run on `${SERVER_B}`).
Pre-migration analysis
The first thing you need to do is find some information about your matrix-synapse setup on `${SERVER_A}`. For me, the following was the case:

- The matrix-synapse configuration files were located in `/etc/matrix-synapse/` on `${SERVER_A}`. The main configuration file was in this directory and had the standard name of `homeserver.yaml`. Other configuration files of interest to me were `log.yaml` and `homeserver.signing.key`.
- The media files were stored in the location specified in `homeserver.yaml` under the `media_store_path` setting. For me it was `/var/lib/matrix-synapse/media` on `${SERVER_A}`.
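Both of those facts can be pulled straight from the old server's config without hand-searching the file. This is a sketch using grep and sed rather than a YAML parser, and it assumes `media_store_path` sits on a single, uncommented top-level line, as it does in a stock config:

```shell
# Sketch: extract media_store_path from homeserver.yaml without a YAML parser.
# Assumes the setting is a single, uncommented top-level line, as in a stock
# config; the example path in the usage note is illustrative.
get_media_store_path() {
  grep -E '^media_store_path:' "$1" \
    | sed -e 's/^media_store_path:[[:space:]]*//' -e 's/"//g'
}
```

For example, `get_media_store_path /etc/matrix-synapse/homeserver.yaml` should print something like `/var/lib/matrix-synapse/media`.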
The next step was to make some decisions about the docker container.

- The first was which version of the docker container to use. I decided it was safest to create a matrix-synapse container of the same version as the one I was running on `${SERVER_A}`. I wanted to migrate to and from the same version so that I could eliminate version differences as the cause of any issues that might arise, and so that I could move back from the new to the old, if an issue arose, without worrying about the state of the database. For the sake of this report, we'll say it was `${SYNAPSE_VERSION}`.
- I decided to name the container `matrix-synapse`.
- Next, I decided not to have it running in the default docker (overlay) network, but in a specific one named for it. I decided to call this network `matrix-synapse` as well.
- Next, the files that will need to survive updates and so on. I decided that I would use bind mounts for the docker container, mounting just one volume to contain all the files. The volume was going to be called `/data/` in the container, and I was going to put all the files into `${HOME}/matrix-synapse/synapse-data/` on `${SERVER_B}` and mount it to `/data/`. `homeserver.yaml` was going to be put into that directory, with the following set within the file (some of these will be defaulted in the initial docker container config generation step below):
  - `pid_file: /data/homeserver.pid`
  - `log_config: "/data/matrix.gibiris.org.log.config"` (the config file on `${SERVER_A}` was named `log.yaml`; when I was experimenting with what to do, I decided that this would be the better name)
  - `media_store_path: "/data/media_store"`
  - `signing_key_path: "/data/matrix.gibiris.org.signing.key"` (again, on `${SERVER_A}` this was called `homeserver.signing.key`, but the change of name was in line with my decisions)
Migration actions
- The first thing I did was to create the matrix-synapse docker container to see what would happen. I essentially followed the instructions at https://hub.docker.com/r/matrixdotorg/synapse to fire it up on `${SERVER_B}`. At this point, I hadn't set anything up here, so it was going to be created as a brand new, essentially useless matrix-synapse service. Once I had had a proper look around, I removed that docker container …

```shell
docker stop <container-name>
docker rm <container-name>
```

  … and followed the instructions for generating a new set of config files:

```shell
docker run -it --rm \
  -v ${HOME}/matrix-synapse/synapse-data:/data \
  -e SYNAPSE_SERVER_NAME=matrix.gibiris.org \
  -e SYNAPSE_REPORT_STATS=no \
  matrixdotorg/synapse:v${SYNAPSE_VERSION} generate
```
- This gave me a set of configuration files in `${HOME}/matrix-synapse/synapse-data/`.
- I copied the `homeserver.signing.key` file from `${SERVER_A}` into `${HOME}/matrix-synapse/synapse-data/` on `${SERVER_B}`, renamed it to `matrix.gibiris.org.signing.key`, and removed the auto-generated version altogether. Set the ownership to `991:991` and the permissions to `600` to protect this file:

```shell
sudo chown -v 991:991 ${HOME}/matrix-synapse/synapse-data/matrix.gibiris.org.signing.key
sudo chmod -v 600 ${HOME}/matrix-synapse/synapse-data/matrix.gibiris.org.signing.key
```
- I copied the `log.yaml` file from `${SERVER_A}` into `${HOME}/matrix-synapse/synapse-data/` on `${SERVER_B}`, renamed it to `matrix.gibiris.org.log.config`, and removed the auto-generated version altogether. It was necessary to review the settings in this file to make sure the file paths were correct. I updated them to `/data/logs/`, which would send the log files to `${HOME}/matrix-synapse/synapse-data/logs/` on `${SERVER_B}`.
- I applied the settings from the old `homeserver.yaml` on `${SERVER_A}` to the new one on `${SERVER_B}`, taking care not to change the file except where I needed to. As I was not moving the database at this point, I had to change the host of the database server from `localhost` to the hostname of `${SERVER_A}`. Before proceeding, I used the `psql` command on `${SERVER_B}` to make sure I could connect to the database and perform some simple `SELECT` statements:

```shell
psql -h ${SERVER_A} -U <synapse-database-user> <synapse-database>
```
- The previous steps can be done whenever, but this next step should be repeated until you're ready to finish the migration. It copies the media files from `${SERVER_A}` into `${HOME}/matrix-synapse/synapse-data/media_store/` on `${SERVER_B}`:

```shell
time sudo rsync -avz <user>@${SERVER_A}:/var/lib/matrix-synapse/media ${HOME}/matrix-synapse/synapse-data/media_store
time sudo chown -Rv 991:991 ${HOME}/matrix-synapse/synapse-data/media_store
```
  I used the `time` command to get a report each time on how long the process took, for planning purposes. This needs to be done over and over until it's down to the shortest possible run of copying: you're doing this while your matrix-synapse is still running on `${SERVER_A}`, and each pass needs to catch the files created since the last one. This will then give you a very short downtime when you get to the actual migration.
- The actual migration. Shut down the matrix-synapse service on `${SERVER_A}`[1]:

```shell
sudo systemctl stop matrix-synapse
```
- Repeat the media `rsync` and `chown` step above one last time to get the last bits.
- Remove any remnants of the docker container, create a new docker overlay network, and start a new container on it:

```shell
docker rm matrix-synapse
docker network create matrix-synapse
docker run -d \
  -v ${HOME}/matrix-synapse/synapse-data:/data \
  --name matrix-synapse \
  --network matrix-synapse \
  matrixdotorg/synapse:v${SYNAPSE_VERSION}
```
- Use some of the following commands to see if things are OK:
  - `docker container ls -a`: you should see your container listed in the output. If it says the container is `starting`, run the command every few seconds until it changes to something else. If it says `unhealthy`, there's likely a problem that you will need to look into. If it says `healthy`, then you're sorted.
  - `docker logs matrix-synapse`: this will output the logs of the docker container. It may not be especially helpful.
  - `sudo tail -f ${HOME}/matrix-synapse/synapse-data/logs/<specified-log-name>`: this will be the log file name you specified in `matrix.gibiris.org.log.config`. You should see messages here about what might be wrong.
  Typical things to look out for:
  - Permissions on config files: they need to be readable by user `991`.
  - Permissions on the `logs/` and `media_store/` directories: they need to be `rwx` for user `991` or group `991`.
  - Database connectivity: check all the settings. There may be firewall settings getting in the way here, too, but I would not be able to guess them from here (and now).
- If all is OK, you're done, but you still need to clean up.
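To work through the permissions items on that list quickly, a small stat-based audit helps. This is a sketch using GNU `stat`, with the directory layout from this post assumed:

```shell
# Sketch: print "mode uid:gid path" for a directory and its immediate
# children, so anything not owned by 991:991 (the synapse user inside the
# container) stands out at a glance. Uses GNU stat and find.
audit_perms() {
  find "$1" -maxdepth 1 -exec stat -c '%a %u:%g %n' {} \;
}
```

Run it as `audit_perms ${HOME}/matrix-synapse/synapse-data` and eyeball the output for wrong owners or modes.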
Post-migration actions
If your old and new matrix-synapse services are behind a reverse proxy, you will need to repoint the proxy config to the new service. In doing so, you will need to first confirm that the docker container can be seen from the proxy server. The `ping` command should suffice here, if you can use it.
Although you shut down matrix-synapse on `${SERVER_A}`, it will start up again when you reboot that server, so you should disable it completely:

```shell
sudo systemctl disable matrix-synapse
```
Once you've confirmed you have a working matrix-synapse docker container, you should not attempt to revert to `${SERVER_A}`, so you're best off removing the package altogether:

```shell
sudo apt-get remove matrix-synapse
```
🔗 Migrating and upgrading PostgreSQL to a docker container
Pre-migration analysis
The first big question was what version of PostgreSQL to use. Applying the same logic as with matrix-synapse, I did not want to use the version I already had running, as support for it in matrix-synapse was on the verge of being withdrawn, but that left me with a problem: I could not find information on how to migrate a PostgreSQL database of one version, run from an OS-installed package, to a later version running in docker.
Eventually, I found a number of web sites that discussed how to migrate data from one database version to another, and the clear inferences for me were as follows:

- It's not possible or safe to just upgrade the database software in situ.
- Moving to another database, even one of a higher version, was just a case of taking a `dump` from one and using it as the input to the other.
The next issue was how to use the `dump` as the input to the new database. I couldn't understand the instructions published by the PostgreSQL community on how to initialise the DB on first run, so I had to plan to import the data into an already-initialised database.
I also needed to consider which version of PostgreSQL I was going to use. When I first looked into this, the latest version was 13.x, and I saw no reason to go with an earlier version[2]. Since then, 14.x has been released, but version 13.x is sufficiently recent for me not to be concerned. Let's stick with convention and call this `${POSTGRES_VERSION}`.
The rest was easy:
- I was going to maintain the configuration files in one directory on `${SERVER_B}`, `${HOME}/matrix-synapse/postgres-config/`, and the data files in another directory, `${HOME}/matrix-synapse/postgres-data/`.
- The container would be named `postgres-matrix-synapse` and it would be part of the same overlay network as the `matrix-synapse` container.
Migration actions
- I ran some test containers and poked around in them for a bit to see how they worked. I noticed that the container assumes the configuration files reside in the internal location `/var/lib/postgresql/data/`. This is also where the data are located. Upon first run, when it sees no data files, the container will initialise a new database. However, it will only attempt this if the `/var/lib/postgresql/data/` directory is completely empty. This means that if you pass your prepared config files as mounts to the `docker run` command, it will fail to initialise the new database, reporting an error. Therefore, in order to get the container to start up from scratch and initialise a new database, it was easiest not to pass any mounts to the initial invocation of the container except for the empty `/var/lib/postgresql/data/` directory. Then you have a database that is ready to consume a dump file.
- I then created a simple PostgreSQL container on `${SERVER_B}` and copied out the configuration files I needed. If you don't specify a set of configuration files at `docker run` time, the container will create them itself, and in this way you know that they are structured correctly for the version you want to use. (The container needs to know what the password of the `postgres` user is to be; the `docker run` command picks that up from the local environment.)

```shell
docker run -d -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} \
  --name postgres-test-container \
  postgres:${POSTGRES_VERSION}
```

  After the new database has been initialised (watch `docker logs postgres-test-container` for when it tells you the container is ready to take connections), you can proceed to take copies of the default configuration files:

```shell
for conffile in pg_hba.conf postgresql.conf
do
  docker exec postgres-test-container \
    cat /var/lib/postgresql/data/${conffile} > \
    ${HOME}/tmp/postgres-config/${conffile}
done
docker stop postgres-test-container
docker rm postgres-test-container
```
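Because the postgres image will only initialise a fresh database into a completely empty data directory, it's worth guarding against a half-populated directory before that first run. A sketch (the commented-out path is illustrative, and a missing directory counts as empty here):

```shell
# Sketch: succeed only if the given directory is empty (or absent), since the
# postgres image refuses to initialise into a non-empty data directory.
dir_is_empty() {
  [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# example gate before the first docker run (path illustrative):
# dir_is_empty "${HOME}/matrix-synapse/postgres-data/data" \
#   || { echo "data directory not empty; refusing to initialise" >&2; exit 1; }
```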
- I then took a copy of the configuration files from the instance on `${SERVER_A}` and merged the settings into the new configuration files, taking care not to change the files where it didn't matter and to translate the settings for the new location decisions I had made. I kept these new files aside so as not to interfere with the next step.
- I started my new PostgreSQL docker container, which would become the new matrix-synapse database:

```shell
docker run -d -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} \
  -v ${HOME}/matrix-synapse/postgres-data/data:/var/lib/postgresql/data \
  --name postgres-matrix-synapse \
  --network matrix-synapse \
  postgres:${POSTGRES_VERSION}
```
  … and again, when `docker logs postgres-matrix-synapse` told me the database was ready for connections, I moved on to the next step.
- Now to get a copy of the operating database. This requires the matrix-synapse service to be shut down on `${SERVER_B}`. If your service is behind a reverse proxy, you should shut that down too, or put it into "maintenance mode" to manage requests. Also, if you are running any matrix applications (bridges, etc.), shut them down, too, to protect them during this phase.

```shell
docker stop matrix-synapse
```
- On `${SERVER_A}`, you now need to take a copy of the PostgreSQL database:

```shell
time sudo -u postgres pg_dumpall > /tmp/synapse.dump
```
- Copy the file over from `${SERVER_A}`:

```shell
scp <user>@${SERVER_A}:/tmp/synapse.dump ${HOME}/tmp
```
- Now copy the file into the docker container and apply it to the new database:

```shell
docker cp ${HOME}/tmp/synapse.dump postgres-matrix-synapse:/tmp
docker exec -it postgres-matrix-synapse bash -l
```

  This brings up the container-internal bash prompt, which I used to run the next command:

```shell
psql -U postgres < /tmp/synapse.dump
```

  This will take some time, so go and make a coffee or put one of the kids to bed, depending on the time of day.
- Once the database has been imported, you can apply your new configuration files. First, stop the container:

```shell
docker stop postgres-matrix-synapse
```
- As mentioned, I decided to maintain the config files in a different directory on `${SERVER_B}` from the data files. This allows me to maintain these files separately in a revision control system like git or subversion[3]. The only way I could figure out to make these files available to the container was to mount them individually. The files I am maintaining separately are:
  - `postgresql.conf`: the main configuration file.
  - `pg_hba.conf`: the access control configuration.
  - `pg_ident.conf`: I don't know what this one is for, but if I need it, I'm keeping it here.
  - `ssl-cert.key` and `ssl-cert.pem`: needed for TLS connections to the PostgreSQL server.
- So the new docker container needed a more detailed command to run. Remove the container:

```shell
docker rm postgres-matrix-synapse
```

  … and create a new one of the same name:

```shell
docker run -d -e POSTGRES_PASSWORD=${POSTGRES_PASSWORD} \
  -v ${HOME}/matrix-synapse/postgres-data/data:/var/lib/postgresql/data \
  -v ${HOME}/matrix-synapse/postgres-config/pg_hba.conf:/var/lib/postgresql/data/pg_hba.conf \
  -v ${HOME}/matrix-synapse/postgres-config/pg_ident.conf:/var/lib/postgresql/data/pg_ident.conf \
  -v ${HOME}/matrix-synapse/postgres-config/postgresql.conf:/var/lib/postgresql/data/postgresql.conf \
  -v ${HOME}/matrix-synapse/postgres-config/ssl-cert.key:/var/lib/postgresql/data/ssl-cert.key \
  -v ${HOME}/matrix-synapse/postgres-config/ssl-cert.pem:/var/lib/postgresql/data/ssl-cert.pem \
  --name postgres-matrix-synapse \
  --network matrix-synapse \
  postgres:${POSTGRES_VERSION}
```
  … again, waiting for `docker logs postgres-matrix-synapse` to confirm all is well and the service is awaiting connections.
- Now you need to repoint your matrix-synapse to the new database service. If you have left everything else about the PostgreSQL database the same (the database name, the database user, the password: all of which will have been retained if you have followed the process as I lay it out up to now), then all you need to change is the database's hostname. In `homeserver.yaml`, locate the `database` setting, and change `args.host` to the name of the new container, `postgres-matrix-synapse`. Save and close.
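If you'd rather script that edit than open an editor, a sed one-liner can do it. This is a sketch, and the function name is mine: it blindly rewrites every `host:` line in the file, so only use it after checking that `database.args.host` is the only place a `host:` key appears in your `homeserver.yaml`:

```shell
# Sketch: repoint the database host in homeserver.yaml to the new container.
# WARNING: this rewrites every " host: " line in the given file; verify first
# that the database args block is the only place one appears. Keeps a .bak
# backup alongside the original.
repoint_db_host() {
  sed -i.bak 's/^\( *host: \).*/\1postgres-matrix-synapse/' "$1"
}
```

Usage: `repoint_db_host /path/to/homeserver.yaml`.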
- Time to suck it and see. Restart your matrix-synapse service:

```shell
docker start matrix-synapse
```

  Consult the logs to check that matrix-synapse connected to the database and all is well. I found `docker logs matrix-synapse` to be useless here, whereas the matrix-synapse logs (specified in your log configuration file; `${HOME}/matrix-synapse/synapse-data/logs/<specified-log-name>` for me) showed what was actually happening.
- Assuming all is OK, now is when you restart your applications (bridges, etc.) and your reverse proxy.
Post-migration actions
Lastly, as you don't want your old database to interfere with anything, it's worth shutting it down and disabling it on `${SERVER_A}`:

```shell
sudo systemctl stop postgresql
sudo systemctl disable postgresql
```

You could also remove PostgreSQL from `${SERVER_A}` altogether if you really have no further use for it.
Epilogue
I containerised matrix-synapse in early 2021 and PostgreSQL in early 2022. I don't yet know what "repeatable" process I will use to manage upgrades to PostgreSQL, but for matrix-synapse I do the following:
- I generate a new set of config files with the new release:

```shell
sudo rm -vrf ${HOME}/tmp/tmp-synapse-data
docker run -it --rm \
  -v ${HOME}/tmp/tmp-synapse-data:/data \
  -e SYNAPSE_SERVER_NAME=matrix.gibiris.org \
  -e SYNAPSE_REPORT_STATS=no \
  matrixdotorg/synapse:v${NEW_VERSION} generate
```
- I merge into my existing config files any changes that come with the new release, and then I stop, `rm`, and re-create the `matrix-synapse` container with the new version of the docker image.
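Before re-creating the container, it helps to be certain the target release really is newer than what's running. A small comparison helper, sketched here using GNU `sort -V` (the version strings in the usage example are illustrative):

```shell
# Sketch: return success if version $1 is strictly older than version $2,
# using GNU sort -V for the ordering. Handy as a guard before pulling a new
# image tag.
version_lt() {
  [ "$1" != "$2" ] \
    && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}
```

Usage: `version_lt "${SYNAPSE_VERSION}" "${NEW_VERSION}" && echo "upgrade goes forwards"`.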
I have also implemented an instance of mautrix-whatsapp in that time, so I now use the `--ip www.xxx.yyy.zzz` flag in the docker container creation commands for matrix-synapse, mautrix-whatsapp and PostgreSQL, to ensure that each of them keeps the same IP address within the `matrix-synapse` overlay network whenever I need to do admin on them.
I am also planning to implement Cactus Comments for this web site (which means, if you can see an area on this page below where you can enter comments, I have succeeded!).
Footnotes:

1. If memory serves me correctly, the package name and service name on Debian Stretch was `matrix-synapse-py3`, but I'm leaving that out of this report. You should research and use the correct names, if in doubt.

2. Some time ago I moved my nextcloud instance's mariadb database to a docker container running version 10.6, but that wouldn't work because nextcloud uses a construct that is no longer supported in 10.6, so I have to stick with 10.5 until that's fixed in nextcloud.

3. Yes, subversion is real, and carries some features that git doesn't, so, as they say on twitter, don't '@' me!
You can comment on this post below, or on the matrix room here. If you want, you can "Log in" using your [matrix] ID.
All comments are subject to this site's comment policy.