Migrate Galley Data from Cassandra to PostgreSQL¶
Use this procedure to migrate Galley-managed data from Cassandra to PostgreSQL. Currently, this migration is only required if you need channel search and channel management from Team Settings on releases that support PostgreSQL-backed conversation data.
The PostgreSQL tables used by these migrations, including collaborators, schema_migrations, user_group, and user_group_member, are defined in postgres-schema.sql and created during installation. They are empty by default until the matching migration is enabled and backfilled.
Feature Availability¶
| Feature | Available from |
|---|---|
| conversation migration | 5.24.0 |
| conversationCodes migration | 5.26.0 |
| teamFeatures migration | 5.27.0 |
This guide covers these data categories:
- Conversations
- Conversation codes
- Team features
After the migration is complete, PostgreSQL becomes the authoritative store for the migrated domains.
Before You Start¶
Make sure all of the following are true before changing any migration settings:
- You are running a wire-server release that supports the domain you want to migrate. See Feature Availability.
- PostgreSQL is deployed and reachable from the cluster. If you still need to set it up on your on-prem environment with our custom postgresql cluster, see PostgreSQL High Availability Cluster - Quick Setup.
- galley and background-worker both have PostgreSQL host, database, user, and password configured.
- The cassandra-migrations job for your Wire upgrade has already completed successfully.
- You have enough PostgreSQL connections available for the temporary migration workload.
The cassandra-migrations job only prepares schema and metadata. It does not copy conversation data from Cassandra into PostgreSQL. The data copy is performed by background-worker.
PostgreSQL Connection Budget¶
Before starting the migration, make sure you have enough PostgreSQL connections available. Budget planning is covered in a dedicated guide.
See PostgreSQL Connection Budget for how to calculate your connection budget, default and low-traffic starting points, and how to tune from observed traffic.
Recommended Domain Order¶
Migrate domains in this order:
- Conversations
- Conversation codes
- Team features
This keeps the largest and most operationally sensitive migration first, when your rollback options are still best for the remaining domains.
Migration States¶
Each domain is controlled with postgresMigration.<domain> and can be in one of these states:
- cassandra: reads and writes stay on Cassandra
- migration-to-postgresql: new writes go to both Cassandra and PostgreSQL
- postgresql: reads and writes use PostgreSQL only
For each domain, the migration always follows the same sequence:
- Enable dual-write by setting postgresMigration.<domain>: migration-to-postgresql.
- Start the backfill by setting the matching migrate* flag on background-worker.
- Cut over by setting postgresMigration.<domain>: postgresql and turning the migrate* flag off again.
Once a domain is moved to migration-to-postgresql, do not set it back to cassandra.
Important Rules¶
Keep galley and background-worker aligned¶
background-worker.config.postgresMigration.<domain> must always match galley.config.postgresMigration.<domain>.
Plan extra PostgreSQL capacity for the migration window¶
The steady-state pool size is often too small for the backfill step. If you see connection acquisition timeouts during migration, increase background-worker.config.postgresqlPool.size and acquisitionTimeout before retrying.
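A minimal sketch of a temporarily enlarged pool for the backfill window. The numbers are placeholders rather than recommendations, and the duration format of acquisitionTimeout is an assumption; derive real values from your connection budget.

```yaml
background-worker:
  config:
    postgresqlPool:
      # Placeholder: raise above the steady-state size for the backfill only.
      size: 20
      # Placeholder: the value format is assumed, check what your chart expects.
      acquisitionTimeout: 30s
```

Return both settings to their steady-state values once the domain has been cut over.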
Migrate one domain at a time¶
Do not migrate conversations, conversation codes, and team features in the same deployment. Finish one domain completely before starting the next one.
Base Configuration¶
Start from a safe baseline where PostgreSQL is configured but Cassandra is still authoritative.
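A minimal sketch of such a baseline in Helm values. The domain key names (conversation, conversationCodes, teamFeatures) are assumed from the Feature Availability table, and placing the migrate* flags directly under background-worker.config is also an assumption. PostgreSQL connection settings are omitted here.

```yaml
# Baseline sketch: Cassandra stays authoritative for every domain.
galley:
  config:
    postgresMigration:
      conversation: cassandra
      conversationCodes: cassandra
      teamFeatures: cassandra

background-worker:
  config:
    # Must mirror the galley values exactly.
    postgresMigration:
      conversation: cassandra
      conversationCodes: cassandra
      teamFeatures: cassandra
    # No backfill is running yet.
    migrateConversations: false
    migrateConversationCodes: false
    migrateTeamFeatures: false
```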
Deploy this first and verify both services are healthy.
Migration Procedure¶
Apply the following procedure to one domain at a time.
Step 1: Enable dual-write¶
Set the selected domain to migration-to-postgresql in both galley and background-worker.
Example for conversations:
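A minimal sketch of the values for this step, assuming the domain key is named conversation (as in the Feature Availability table). Only the selected domain changes; the other domains keep their current state.

```yaml
# Step 1 sketch: dual-write for conversations, no backfill yet.
galley:
  config:
    postgresMigration:
      conversation: migration-to-postgresql

background-worker:
  config:
    postgresMigration:
      # Must match the galley value.
      conversation: migration-to-postgresql
    # Assumed flag location; stays off until Step 2.
    migrateConversations: false
```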
After the rollout:
- galley should restart cleanly.
- New writes for that domain should be written to both Cassandra and PostgreSQL.
- No backfill should run yet.
Step 2: Start the backfill¶
Backfill means copying the existing data for the domain from Cassandra into PostgreSQL while dual-write mode is already enabled.
Enable the matching migration flag on background-worker.
Flags by domain:
- Conversations: migrateConversations: true
- Conversation codes: migrateConversationCodes: true
- Team features: migrateTeamFeatures: true
Example for conversations:
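A minimal sketch of the background-worker values for the backfill, assuming the flag sits directly under background-worker.config and the domain key is named conversation. The fields of migrateConversationsOptions are not shown because they are not specified in this guide.

```yaml
# Step 2 sketch: dual-write stays on, backfill is switched on.
background-worker:
  config:
    postgresMigration:
      # Unchanged from Step 1 and still equal to the galley value.
      conversation: migration-to-postgresql
    # Starts copying existing conversation data from Cassandra.
    migrateConversations: true
```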
migrateConversationsOptions is only used for conversation migration. Conversation codes and team features do not use this block.
Step 3: Monitor the migration¶
Use logs and Prometheus metrics to confirm progress.
Check the background-worker logs, for example with kubectl logs.
Useful log patterns:
- finished migration
- error occurred
- estimatedRows
Useful Prometheus metrics:
- If you do not have Prometheus set up yet, use kubectl logs, or check background-worker database queries directly in PostgreSQL with pg_stat_activity and the relevant counter tables to confirm migration progress.
- From a background-worker pod, you can query the wire-server database directly, as wire-utility-tool does.
- If the wire-utility-tool helper script is available on your admin host, you can use it instead.
- For more PostgreSQL troubleshooting and pg_stat_activity examples, see Wire utility tool – PostgreSQL inspection.
- You can also scrape metrics once from an individual service through /i/metrics (documented in administrate/users.md), for example by port-forwarding to the service pod.
- Interpreting pg_stat_activity for this migration path:
  - datname='wire-server' and usename='wire-server' are normal.
  - application_name should be background-worker or galley for migration progress; if it is empty and query is app-domain SQL (e.g., SELECT ... FROM apps WHERE ...), that is normal application traffic.
  - state='active' with a long now()-query_start means a query is currently running; state='idle' means the session is waiting on the client.
  - wait_event_type/wait_event, if non-empty, show what the session is waiting on, for example a lock.
  - Query text like SELECT * FROM pg_stat_activity ... is your own monitoring query; ignore it for migration status.
- A migration-focused query filters pg_stat_activity by datname and application_name so that only background-worker and galley sessions are shown.
| Metric | Meaning |
|---|---|
| wire_local_convs_migration_finished | Local conversation migration is complete when the value is 1 |
| wire_user_remote_convs_migration_finished | Remote conversation index migration is complete when the value is 1 |
| wire_team_features_migration_finished | Team features migration is complete when the value is 1 |
| wire_hasql_pool_ready_for_use | PostgreSQL pool is healthy when each pod reports ready connections |
| wire_hasql_pool_session_failure_count | Should remain 0 |
There is no dedicated Prometheus completion metric for conversation codes. Validate that migration through logs.
Step 4: Cut over to PostgreSQL¶
When the migration has finished, set the selected domain to postgresql in both services and disable the matching migration flag.
Example for conversations:
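A minimal sketch of the cutover values, again assuming the conversation domain key and a migrate* flag directly under background-worker.config. Both changes go out in the same rollout.

```yaml
# Step 4 sketch: PostgreSQL becomes authoritative for conversations.
galley:
  config:
    postgresMigration:
      conversation: postgresql

background-worker:
  config:
    postgresMigration:
      # Must match the galley value.
      conversation: postgresql
    # Backfill has finished, so the flag is turned off again.
    migrateConversations: false
```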
After this rollout, the selected domain reads from PostgreSQL only.
Final Configuration¶
When all domains have been migrated, both services should point all supported Galley data to PostgreSQL.
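A minimal sketch of that end state. As before, the domain key names are assumed from the Feature Availability table and the flag location under background-worker.config is an assumption.

```yaml
# Final sketch: every supported domain reads and writes PostgreSQL only.
galley:
  config:
    postgresMigration:
      conversation: postgresql
      conversationCodes: postgresql
      teamFeatures: postgresql

background-worker:
  config:
    postgresMigration:
      conversation: postgresql
      conversationCodes: postgresql
      teamFeatures: postgresql
    # No migration flag remains enabled.
    migrateConversations: false
    migrateConversationCodes: false
    migrateTeamFeatures: false
```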
Post-Migration Checks¶
After the last cutover:
- Confirm galley and background-worker pods are healthy.
- Confirm wire_hasql_pool_session_failure_count stays at 0.
- Confirm channel search and Team Settings channel management work as expected.
- Confirm no migration flags remain set to true.
Troubleshooting¶
Migration does not start¶
Check the migration flag names carefully. For example, migrateConversations is correct, while migrateConversation is ignored.
Pods fail to start with a storage-location parse error¶
This usually means a postgresMigration value was written as a boolean instead of a string. Use only:
- cassandra
- migration-to-postgresql
- postgresql
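A short sketch of the difference, assuming the conversation domain key. YAML parses an unquoted true as a boolean, which is not one of the three accepted state strings.

```yaml
galley:
  config:
    postgresMigration:
      # Invalid: this would be read as a boolean, not a storage location.
      # conversation: true
      # Valid: one of the three state strings.
      conversation: migration-to-postgresql
```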
PostgreSQL acquisition timeouts appear during migration¶
Increase background-worker.config.postgresqlPool.size and acquisitionTimeout, then redeploy background-worker.
No PostgreSQL pool metrics appear for background-worker¶
background-worker may not emit wire_hasql_pool_* metrics until it has attempted to use PostgreSQL. This is expected before the migration flag is enabled.