Moritz Kraus
1 year
Stefan Xenopol
1 year
Andreas Vöst
1 year

FAQ for When PostgreSQL Indexes Are Corrupted After Locale Changes

Posted . Visible to the public.

When changing glibc versions, it's possible to end up with corrupt indexes in PostgreSQL.

Are My Indexes Affected?

If your database has been created on an operating system with glibc version < 2.28 and later upgraded to an operating system with glibc >= 2.28, you're most likely affected. To find out which indexes in our database are affected, run the following slightly modified query from the PostgreSQL documentation Show archive.org snapshot :

CREATE extension amcheck;
SELECT bt_index_check(index => c.oid, heapallindexed => i.indisunique),
               c.relname,
               c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;

This requires superuser permissions to install the amcheck extension to the server. If you're a makandra customer, we've already added this extension for you.

Note

The query will fail on the first corrupted index it finds. There will most likely be more.

How Do I Fix My Indexes?

You'll have to rebuild them. Be carefule with rebuilding since just running REINDEX INDEX indexname will write-lock the index' parent table. The index itself will, of course, be locked exlusively, prohibiting even reads.

Consider REINDEX CONCURRENTLY if you're running PostgreSQL >=12. If you're a makandra customer and unsure which version we're running for you, please ask ops@makandra.de.

If your PostgreSQL version is below 12, you'll have to drop the old index and create it anew, e.g. using CREATE INDEX CONCURRENTLY and then DROP INDEX the old one c.f. postgres docs Show archive.org snapshot .

Info

Keep in mind that re-creating the database index is an expensive operation and try to schedule it to low-traffic hours if at all possible to avoid application performance impacts.

Warning

If you index or reindex CONCURRENTLY, make sure to check that the operations succceeded. To do this, you can use \d my_table_name in the database console, it will show the index as INVALID if it could not be rebuilt.

Unique indexes

There is a chance that reindex fails on unique indexes if there a conflicting entries. In this case, you need to clean up the data manually. In this case you will get an error like

ERROR:  could not create unique index "index_sites_on_slug"
DETAIL:  Key (slug)=(foobar) is duplicated.

Finding the duplicated rows is a bit tricky (since the index is broken!), but you can force a full-table scan easily with something like

SELECT * FROM table WHERE slug LIKE '%foobar%';

Can Re-Creating the Index Cause Problems?

In most cases, reparing or re-creating the index will not cause problems. However, in some cases it's possible that the data returned by query that has been backed by a corrupted index would differ from a query that has been backed by a correct index.

It's possible that queries will return different results or the same results with different sorting after the index has been re-created. If the sorted sequence of query results is of importance in your application, it is very important to double-check results to be avoid unexpected consequences.

Can I Test Things Locally With a Database Dump?

Unfortunately not. Restoring from a logical database backup, i.e. an SQL dump, will not reproduce the error and is not suitable for testing changes, since importing dumps will re-create indexes in the first place.

How Do I Test All Indexes In All Databases?

If you have admin permissions on your database servers, you can use this script to check all databases on your server:

#!/usr/bin/env bash

# this script gets all databases of a database server
# for each database it creates the amcheck extension
# then it collects all oids for any btree index in that database
# then it will run a bt_index_check on any index oid
# then it will drop the amcheck extension again

list_index_sql="
SELECT c.oid
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree'
AND c.relpersistence != 't'
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
"

databases=$(psql -tAc "SELECT datname FROM pg_database WHERE datistemplate = false;")

for database in $databases; do
  echo "Database: ${database}"

  psql -tA "${database}" -c "CREATE EXTENSION IF NOT EXISTS amcheck;"

  oids=$(psql -tA "${database}" -c "${list_index_sql}")

  for oid in $oids; do

    echo -n "."
    psql -tA "${database}" -c "SELECT bt_index_check(index => ${oid}, heapallindexed => true);" > /dev/null
  done
  echo
  psql -tA "${database}" -c "DROP EXTENSION IF EXISTS amcheck;"
done

For more details and more test possibilities, consult the amcheck documentation Show archive.org snapshot .

Florian Heinle
Last edit
Tobias Kraze
License
Source code in this card is licensed under the MIT License.