Back to blog
FILE 0x6E·AUDITING 22 REFERENCES TO GSIS THAT DON'T EXIST

Auditing 22 references to GSIs that don't exist

April 24, 2026 · aws, dynamodb, tooling, debugging

A pre-release audit pass on a backend I run turned up 22 places where the code passed IndexName: foo-index to a DynamoDB query on a table where foo-index didn't exist. Most of them had been silently dormant because the offending code paths weren't called in normal usage. One of them was crashing a cron job that hadn't been invoked since being deployed.

What was happening

The pattern was always the same:

$result = $db->query('checkonmine_pings', [
    'IndexName' => 'user_id-index',
    'KeyConditionExpression' => 'user_id = :u',
    'ExpressionAttributeValues' => [':u' => ['S' => $userId]],
]);

The trap: many of those tables already had user_id as the primary HASH key. The query would have worked perfectly without the IndexName line — but with it, DynamoDB returns a ResourceNotFoundException because the named GSI doesn't exist.

Most of these were copy-pasted from one query that needed a real GSI on a different table. Nobody noticed because nobody ever exercised the code paths.

What I found

I wrote a tiny audit script that lists every IndexName reference in the PHP source and compares against the live GSIs on each table:

python audit_indexes.py
# rough sketch
import boto3, re, os

ddb = boto3.client('dynamodb', region_name='us-east-1')
live = {}
for t in ddb.list_tables()['TableNames']:
    desc = ddb.describe_table(TableName=t)['Table']
    live[t] = {g['IndexName'] for g in desc.get('GlobalSecondaryIndexes', [])}

usages = []  # (file, line, table, index_name)
for root, _, files in os.walk('public'):
    for f in files:
        if not f.endswith('.php'): continue
        # naive: walk source, find IndexName => '...' near nearest 'TableName' or table arg
        # in practice: regex per query() call site, follow the table arg

for file, line, table, idx in usages:
    if idx not in live.get(table, set()):
        print(f'{file}:{line} {table} -> {idx} MISSING')

Results:

Plus two missing tables that the code happily referenced as if they existed: created them via the CLI with the schemas the existing service module already expected.

The fix

Mostly mechanical. For each "GSI doesn't exist but table has the right HASH key" case, just delete the IndexName line:

// before
$result = $db->query('checkonmine_pings', [
    'IndexName' => 'user_id-index',
    'KeyConditionExpression' => 'user_id = :u',
    ...
]);

// after
$result = $db->query('checkonmine_pings', [
    'KeyConditionExpression' => 'user_id = :u',
    ...
]);

For the "GSI doesn't exist and access pattern is rare" cases, swap Query for Scan + FilterExpression. Not the right answer on a hot table, but for an admin error-log lookup that runs maybe once a week it's fine.

For the two missing tables, aws dynamodb create-table with BillingMode=PAY_PER_REQUEST and SSE on. No CloudFormation — this codebase doesn't manage tables that way and I wasn't going to introduce it for two tables.

What I'd do differently

The script that found these is now part of the repo as audit_indexes.py. Re-running it after any major code change takes about three seconds and catches the next bug like this before deploy.

The pattern of "I copy-pasted a working query and didn't question the IndexName line" is going to happen forever, on every team. The defense is automation, not vigilance. Anything you've ever found by reading the code, you should be able to re-find by running a script.