DELETE in batches
Microsoft SQL Server supports a DELETE TOP (x) FROM table construct, which deletes at most x rows from a table. This is useful for deleting rows in batches, to keep lock times low and the transaction log small.
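For example, a cleanup that removes old rows batch by batch could look like this in T-SQL (a sketch; the events table and date filter are made up for illustration):

while 1 = 1
begin
    delete top (1000) from events
    where created_at < '2020-01-01';

    -- @@rowcount is the number of rows affected by the last statement
    if @@rowcount = 0 break;
end

Postgres does not support this construct, but we can work around it: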
delete from table
where id in (
select id
from table
where condition
order by id
limit 1000
)
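To clear out all matching rows, you run this query repeatedly until it deletes nothing, either from application code or, for example, in a PL/pgSQL block. A sketch with made-up table and column names (the commit between batches requires Postgres 11+ and a do block that is not inside an explicit transaction):

do $$
declare
    deleted integer;
begin
    loop
        delete from events
        where id in (
            select id
            from events
            where created_at < '2020-01-01'
            order by id
            limit 1000
        );

        -- row_count is the number of rows affected by the delete
        get diagnostics deleted = row_count;
        exit when deleted = 0;

        -- give each batch its own transaction to keep lock times short
        commit;
    end loop;
end $$;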
The nice part about this approach is that you can use a similar query for SQL Server: replace limit 1000 with top 1000, and put it after the select:
delete from table
where id in (
select top 1000 id
from table
where condition
order by id
)
Executing this query manually in SSMS gave the following results:
- (1000 rows affected)
- (1000 rows affected)
- (1000 rows affected)
- this goes on for a while...
- (0 rows affected)
- (1000 rows affected)
- (1000 rows affected)
There were no inserts happening at that time. So how did that happen?
Most examples of transaction phenomena use multiple statements per transaction. There is of course no reason to assume that transaction phenomena require multiple statements to show up. The DBMS can choose to execute the select and delete parts of the query sequentially: first select all matching rows, then delete them.
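Conceptually, the single statement can then behave like this two-step sequence (an illustration of the interleaving, not an actual execution plan):

-- step 1: collect the candidate ids
select id
from table
where condition
order by id
limit 1000;

-- step 2: delete the collected ids; rows that another session deleted
-- in the meantime no longer match, so fewer (or zero) rows are affected
delete from table
where id in (/* ids from step 1 */);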
What actually happened: A background job was executing the exact same query. The background job and my SSMS session selected the same 1000 ids, but only the background job succeeded in deleting them. My session couldn't (after all, they're already deleted!).
In this particular case no fix was necessary: the deletion was a pure cleanup job, and no harm comes from failing to delete all rows. The background job runs on a regular basis, and it would finish the deletion the next time it runs.
If we had to fix it, we would put the deletion in a transaction with the serializable isolation level.
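A sketch of what that could look like in Postgres. Under serializable isolation, when two sessions run the same delete concurrently, one of them aborts with a serialization failure (SQLSTATE 40001) and has to retry, instead of silently deleting zero rows:

begin isolation level serializable;

delete from table
where id in (
    select id
    from table
    where condition
    order by id
    limit 1000
);

commit;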