Wire-Server 5.24.0 release
Reference based on the build.json charts and the release changelog.
Artifact: wire-server-deploy-static-f88a2db81e763f7376fc0f7ecc40166a3bc37ee8.tgz
Heads up
5.24 is one of the heavier upgrades in this series. Two big things going on:
- databases-ephemeral is replaced by redis-ephemeral. The hostname gundeck uses for Redis changes.
- Conversation data moves from Cassandra to PostgreSQL. The original docs called this optional. It must be treated as mandatory. Skipping it here and then upgrading to 5.25 causes conversations to silently disappear from reads, because of a bad chart default at 5.25. See the post-upgrade migration section below.
This page assumes the source version is 5.23.x.
Known bugs
brig won't deploy in non-federated environments
The 5.24 brig chart has a bug that breaks deployment when federation is off. Workaround, in values/wire-server/values.yaml:
This is fixed at 5.25, so once 5.25 is reached the workaround can be taken back out.
What must change
Listed in the order things should be done.
1. Deploy redis-ephemeral (replaces databases-ephemeral)
The upstream chart for the in-cluster Redis was swapped. The new chart is redis-ephemeral and ships Redis 7.4.6 (the old chart was based on Bitnami). It only supports standalone deployments.
Deploy it before the wire-server upgrade. If wire-server is upgraded first with gundeck.config.redis.host already pointing at redis-ephemeral, gundeck won't be able to connect.
Defaults are fine. Nothing should be carried over from databases-ephemeral; it's a different chart.
The old databases-ephemeral chart isn't auto-removed. Don't uninstall it yet either. Wait until everything else is verified. See "Cleanup" at the bottom.
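A minimal sketch of deploying the new chart ahead of the wire-server upgrade. The release name redis-ephemeral is the standard one used on this page. The chart directory is an assumption about how the bundle is unpacked, so pass whatever path applies locally. The --wait and --timeout flags are optional conveniences, not something this release requires.

```shell
# Deploy the standalone Redis chart before touching wire-server.
# Assumption: the caller passes the path to the redis-ephemeral chart from the
# unpacked bundle; this page does not state where it lives.
deploy_redis_ephemeral() {
  chart_dir="$1"
  if [ -z "$chart_dir" ]; then
    echo "usage: deploy_redis_ephemeral <chart-dir>" >&2
    return 2
  fi
  # --wait blocks until the release's resources are ready or the timeout hits.
  helm upgrade --install redis-ephemeral "$chart_dir" --wait --timeout 10m
}

# Example (hypothetical path):
#   deploy_redis_ephemeral ./charts/redis-ephemeral
```

No values file is passed because the chart defaults are expected to be fine.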
2. Edit brig.config.rabbitmq in the wire-server values
In values/wire-server/values.yaml:
In values/wire-server/secrets.yaml:
port defaults to 5672. Set it only if the RabbitMQ instance listens on a different port.
Why: for those using our Ansible infrastructure package, RabbitMQ has been deployed as an external service instead of in-cluster since 5.23. brig can't rely on the in-cluster default service name anymore, so the hostname has to be set explicitly.
3. Add the new background-worker config
background-worker needs a few new fields. In values/wire-server/values.yaml:
In values/wire-server/secrets.yaml:
To grab the existing PostgreSQL password from a typical deploy:
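One generic way to read a value out of a Kubernetes Secret is sketched here. The secret name and key are not stated on this page, so both are left as arguments; the names in the example comment are hypothetical placeholders.

```shell
# Print one decoded value from a Kubernetes Secret.
# Assumption: secret name and key are supplied by the caller; this page does
# not name them. Keys containing dots need escaping in the jsonpath expression.
secret_value() {
  namespace="$1" secret="$2" key="$3"
  kubectl get secret "$secret" -n "$namespace" -o "jsonpath={.data.$key}" |
    base64 --decode
}

# Example (hypothetical names):
#   secret_value default my-postgresql-secret password
```

The decoded value is printed to stdout, so avoid running this where terminal output is being logged.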
Why: background-worker now runs jobs that need PostgreSQL access and that talk directly to the Cassandra keyspaces of brig and galley. The federation domain is needed for federation-related background tasks.
Warning about the postgresMigration.conversation default. At this release the background-worker chart defaults postgresMigration.conversation to postgresql. That default must not be left in place when conversations haven't been migrated yet: the data is still in Cassandra, so a postgresql setting points the worker at an empty table. See the migration section below. This is the bug that may have caused conversations to disappear at 5.25 for installs that didn't migrate.
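To see which value is actually in effect, a small filter over rendered values can help. This is a sketch that assumes the setting appears as a conversation key nested under a postgresMigration key somewhere in the YAML; it does not assume the exact parent path, and it only handles plain space-indented block YAML.

```shell
# Print every value found at postgresMigration.conversation in YAML on stdin.
conversation_store_values() {
  awk '
    {
      # Measure the indentation of the current line.
      stripped = $0
      sub(/^ +/, "", stripped)
      indent = length($0) - length(stripped)
    }
    /^ *postgresMigration: *$/ { block_indent = indent; in_block = 1; next }
    # A non-empty line at the same or lower indentation ends the block.
    in_block && stripped != "" && indent <= block_indent { in_block = 0 }
    in_block && /^ *conversation:/ {
      sub(/^conversation: */, "", stripped)
      print stripped
    }
  '
}

# Example: include chart defaults with --all so an unset override still shows up.
#   helm get values wire-server --all -o yaml | conversation_store_values
```

If this prints postgresql before the migration has run, that is the situation the warning describes.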
4. Update gundeck.config.redis.host
For those using the in-cluster Redis service shipped with our bundle, the Redis service hostname has changed from {{ .Release.Name }}-master to {{ .Release.Name }}. With the standard redis-ephemeral release name, the in-cluster service is just redis-ephemeral now (it used to be databases-ephemeral-redis-ephemeral-master).
Check the cluster:
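A minimal sketch for listing the Redis-related Services. The namespace defaulting to default is an assumption; pass the namespace wire-server actually runs in.

```shell
# List Services whose name mentions redis in the given namespace.
redis_services() {
  namespace="${1:-default}"
  kubectl get svc -n "$namespace" --no-headers | awk '$1 ~ /redis/ { print $1 }'
}

# Example:
#   redis_services default
```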
The output looks something like this (the old databases-ephemeral-* services will still be there until the old chart is uninstalled, see "Cleanup"):
The one to use is the plain redis-ephemeral. In values/wire-server/values.yaml:
Bug in wire-server-deploy 5.24: the bundled values/wire-server/prod-values.example.yaml ships gundeck.config.redis.host: databases-ephemeral-redis-ephemeral (the old service). If that file was used as a starting point, override it to redis-ephemeral in the local values.yaml.
5. Run the wire-server helm upgrade
Once all the values are in place:
Watch in another terminal:
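A sketch of the upgrade and the watch command. The values and secrets paths are the ones used throughout this page. The chart directory and the 15 minute timeout are assumptions to adapt.

```shell
# Upgrade wire-server with the edited values and secrets files.
# Assumption: the caller passes the path to the wire-server chart.
upgrade_wire_server() {
  chart_dir="$1"
  helm upgrade --install wire-server "$chart_dir" \
    --values values/wire-server/values.yaml \
    --values values/wire-server/secrets.yaml \
    --timeout 15m
}

# Example (hypothetical path):
#   upgrade_wire_server ./charts/wire-server
# In a second terminal, watch pods change state as the rollout proceeds:
#   kubectl get pods --watch
```

The same function can be reused for each helm upgrade in the migration steps, since only the values change between them.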
Post-upgrade: migrate conversation data to PostgreSQL
Back up before starting. Take a backup of the Cassandra galley keyspace and of the target PostgreSQL database before running any of the steps below. The migration is destructive in the sense that data starts being written to PostgreSQL from step 1 onwards, and rolling back without a backup is not straightforward.
The migration can only be run after the 5.24 wire-server helm upgrade has succeeded. Some required services don't exist yet on 5.23, so trying to migrate before the upgrade just fails.
The migration runs in three steps. Each step is a values change followed by a helm upgrade --install wire-server ....
Step 1: prepare wire-server for migration
In values/wire-server/values.yaml:
Then run:
Once it's set to migration-to-postgresql, do not switch back to cassandra. New conversations from this point on are written to PostgreSQL, while reads still come from Cassandra.
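The ordering rule can be encoded as a small guard, for example in a deploy script that compares the currently deployed setting with the one about to be applied. This helper is hypothetical and not part of wire-server. It simply refuses any backwards move or any jump that skips the migration-to-postgresql stage.

```shell
# Exit 0 if moving the conversation store setting from $1 to $2 follows the
# order on this page: cassandra -> migration-to-postgresql -> postgresql.
conversation_transition_allowed() {
  current="$1" target="$2"
  case "$current:$target" in
    cassandra:cassandra | cassandra:migration-to-postgresql) return 0 ;;
    migration-to-postgresql:migration-to-postgresql) return 0 ;;
    migration-to-postgresql:postgresql) return 0 ;;
    postgresql:postgresql) return 0 ;;
    *)
      echo "refusing transition: $current -> $target" >&2
      return 1
      ;;
  esac
}

# Example:
#   conversation_transition_allowed migration-to-postgresql cassandra || exit 1
```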
Step 2: run the actual migration
In values/wire-server/values.yaml:
Then run:
The background-worker pods restart and start moving data. This can take a long time on a database with a lot of conversations.
Watch the logs (look for finished migration):
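A sketch for following the logs and showing only the completion messages. The workload name in the example is an assumption. Note that kubectl logs on a deployment follows a single pod, so with several background-worker pods it may be necessary to follow each pod by name.

```shell
# Follow the logs of the given target and print lines mentioning
# "finished migration". Runs until interrupted.
follow_migration_log() {
  target="$1" namespace="${2:-default}"
  kubectl logs --follow "$target" -n "$namespace" | grep "finished migration"
}

# Example (hypothetical workload name):
#   follow_migration_log deployment/background-worker
```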
Or watch the metrics, both of these should hit 1.0:
- wire_local_convs_migration_finished
- wire_user_remote_convs_migration_finished
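If the metrics are scraped by hand, the check can be scripted. This sketch reads Prometheus text-format output on stdin and succeeds only when both metrics report 1. How the metrics are fetched (host, port, path) is not covered on this page, so the curl line in the comment is a placeholder. If a metric appears in several series, the last one read wins.

```shell
# Exit 0 only if both migration metrics on stdin have the value 1.
migration_finished() {
  awk '
    function metric_value(line,   parts) {
      sub(/^[A-Za-z_:][A-Za-z0-9_:]*/, "", line)  # drop the metric name
      sub(/^[{][^}]*[}]/, "", line)               # drop an optional label set
      split(line, parts, " ")
      return parts[1] + 0
    }
    /^wire_local_convs_migration_finished[{ ]/ {
      local_done = (metric_value($0) == 1)
    }
    /^wire_user_remote_convs_migration_finished[{ ]/ {
      remote_done = (metric_value($0) == 1)
    }
    END { exit !(local_done && remote_done) }
  '
}

# Example (hypothetical endpoint):
#   curl -s http://<background-worker-metrics-endpoint> | migration_finished && echo done
```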
Step 3: switch reads over to PostgreSQL
Once the metrics are at 1.0, in values/wire-server/values.yaml:
Then run:
From this point forward reads and writes both go to PostgreSQL. This configuration must be kept on every subsequent upgrade.
Optional changes
background-worker PostgreSQL tunables
These are chart defaults. Set them only when an actual change is needed:
gundeck Redis tunables
These are also chart defaults. Don't touch them unless necessary:
Verification
Run these checks after the helm upgrade is done.
brig should connect to RabbitMQ. The log lines look something like this:
background-worker does the same and also opens a Cassandra control connection:
gundeck pods should all be Running:
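A sketch that lists any gundeck pod that is not in the Running state. It matches on the pod name prefix rather than on a label, since the chart's labels are not given on this page. Empty output means every matching pod reports Running. It does not check the READY column.

```shell
# Print name and status of pods whose name starts with the prefix and whose
# STATUS column is not "Running".
pods_not_running() {
  prefix="$1" namespace="${2:-default}"
  kubectl get pods -n "$namespace" --no-headers |
    awk -v prefix="$prefix" 'index($1, prefix) == 1 && $3 != "Running" { print $1, $3 }'
}

# Example:
#   pods_not_running gundeck
```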
If the conversation migration is already done, the row count in PostgreSQL should be non-zero. SSH into one of the postgres nodes:
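A generic row-count sketch to run on the postgres node. The database and table names are not given on this page, so both are arguments. Depending on the setup, psql may need to be run as the postgres system user.

```shell
# Print the row count of one table as a bare number.
# Assumption: database and table names are supplied by the caller.
table_row_count() {
  database="$1" table="$2"
  psql --dbname "$database" --no-align --tuples-only \
    --command "SELECT count(*) FROM \"$table\";"
}

# Example (hypothetical names):
#   table_row_count <database> <conversation-table>
```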
Before calling it done, log in on the webapp and on a mobile client, send a message in an existing conversation, and confirm that conversations are still there. The whole point of the migration is to keep that data, so it must be verified for real.
Cleanup
Once redis-ephemeral looks healthy, the old databases-ephemeral chart can go. Nothing uninstalls it automatically.
First confirm nothing still references it:
The grep on values/wire-server/values.yaml should come back empty. If it doesn't, fix the override first, then come back.
Then:
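Both cleanup steps can be combined into one guarded sketch that refuses to uninstall while the values file still mentions the old chart. The release name databases-ephemeral is an assumption inferred from the old service name; check helm list before running it.

```shell
# Uninstall the old release only if the values file no longer mentions it.
remove_old_redis_release() {
  values_file="${1:-values/wire-server/values.yaml}"
  if [ ! -f "$values_file" ]; then
    echo "values file not found: $values_file" >&2
    return 2
  fi
  # grep prints any offending lines with their line numbers.
  if grep -n "databases-ephemeral" "$values_file"; then
    echo "stale databases-ephemeral reference in $values_file; fix it first" >&2
    return 1
  fi
  # Assumption: the old chart was installed under this release name.
  helm uninstall databases-ephemeral
}

# Example:
#   remove_old_redis_release values/wire-server/values.yaml
```

The check is deliberately conservative. A commented-out reference also blocks the uninstall.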
Disk space note
For those using our Ansible-based deployment package, each upgrade in this series re-runs setup-offline-sources, which copies the new release's binaries, container images, and debs into /opt/assets on the assethost. After a few versions, the assethost runs out of space and the playbook fails with no space left on device.
When that happens, SSH into the assethost (not the adminhost) and clear it:
Then re-run setup-offline-sources from the adminhost.
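To catch the problem before the playbook fails, free space on the assethost can be checked first. This is a sketch; the 20 GB threshold in the example is an arbitrary assumption, not a documented requirement.

```shell
# Print the free space, in kilobytes, on the filesystem holding the path.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Exit 0 if the filesystem holding the path has at least min_kb free.
has_free_space() {
  path="$1" min_kb="$2"
  [ "$(free_kb "$path")" -ge "$min_kb" ]
}

# Example, run on the assethost (threshold is an assumption):
#   has_free_space /opt/assets 20000000 || echo "clean up /opt/assets first"
```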