Angga Putra

Software Engineer

A broad NOT IN filter

I had some free time in the sprint since all my planned tasks were already done, so I opened Grafana and started looking around.

One chart caught my attention. A few APIs were sitting near a 5 second p95. It was slow enough to be real, but still quiet enough that nobody would rush to it unless they were already looking.

What narrowed it down

The traces were straightforward. Most of the request time was just waiting on one database call.

For the slower requests, that query was taking close to 5 seconds by itself. Once I saw that, I stopped thinking about the service layer and started looking at the SQL.

This was the predicate behind it.

SELECT id, payment_status, created_at
FROM items
WHERE tenant_id = 42
  AND payment_status NOT IN ('settled', 'voided');

It seems looked fine. The table already had a composite index on (tenant_id, payment_status), and the same query in lower envs was not causing complaint.

Looking at the plan

That was the part that made production hits difference than lower envs.

The shape of the data was different enough that the bad path only really showed up under production-sized rows. Once I ran EXPLAIN ANALYZE, the reason got easier to defend.

The issue was not the syntax by itself. It was selectivity.

For the tenant I was checking, removing settled and voided still left a big part of the rows in play. MySQL could use the leading key on tenant_id, but the remaining filter was still broad enough that it had to walk far more index entries than I wanted.

-> Filter: (payment_status not in (...))
   -> Index lookup on items using idx_tenant_status (...)

That shape was the useful clue. The engine could find the tenant quickly, but the status predicate was not cutting down enough work after that.

Changing it

The statuses in this flow were controlled by the application, not typed freely by externals. That gave me room to rewrite the filter from the other direction.

SELECT id, payment_status, created_at
FROM items
WHERE tenant_id = 42
  AND payment_status IN ('pending', 'authorized', 'captured');

Instead of excluding values I did not want, it listed the states the endpoint actually cared about. That version was just narrower.

After that rewrite, EXPLAIN ANALYZE changed in a more useful way.

-> Index range scan on items using idx_tenant_status (...)
   -> Filter: (payment_status in (...))

The rows examined dropped hard. The query that had been taking around 5 seconds moved closer to 50 milliseconds.

The difference

Nothing about the feature changed. The API response stayed the same. The useful difference was only that the filter finally matched the shape of the data and the index more closely.

This was not really about some blanket rule that IN is better than NOT IN. It was just one case where a negative predicate still left too many rows in play, and MySQL had to pay for that. Once the filter became tighter, the query got out of the way again.