READPAST your problems in SQL
Interesting SQL table hint I found today: READPAST. It instructs SQL queries to ignore locked rows. This comes with advantages and disadvantages. For one it avoids deadlocks when trying to read or write an already locked row, but it also provides the wrong results. Just as NOLOCK, it works around the transaction mechanism, and while NOLOCK will allow dirty reads of information partially changed in transactions that have not been committed, READPAST ignores its existence completely.
There is one scenario where I think this works best: batched DELETE operations. You want to delete a lot of rows from a table, but without locking it. If you just do a delete for the entire table with some condition you will get these issues:
- the operation will be slow, especially if you are deleting on a clustered index which moves data around in the table
- if the number of deleted rows is too large (usually 5000 or more) then the operation will lock the entire table, not just the deleted rows
- if there are many rows to be deleted, the operation will take a long while, increasing the possibility of deadlocks
While there are several solutions for this, like partitioning the table and then truncating the partitions or soft deletes or designing your database to separate read and write operations, one type of implementation change that is small in scope and large is result is batched deletes. Basically, you run a flow like this:
- SELECT a small number of rows to be deleted (again, mind the 5000 limit that causes table locks, perhaps even use ROWLOCK hint)
- DELETE rows selected and their dependencies (DELETE TOP x should work as well for steps 1 and 2, but I understand in some cases this syntax automatically causes a table lock and maybe also use ROWLOCK hint)
- if the number of selected rows is larger than 0, go back to step 1
This allows SQL to lock individual rows and, if your business logic is sound, no rows should be deleted while something is trying to read or write them. However, this is not always the case, especially in high stress cases with many concurrent reads and writes. But here, if you use READPAST, then locked rows will be ignored and the next loops will have the chance to delete them.
But there is a catch. Let's take an example:
- Table has 2 rows: A and B
- Transaction 1 locks row A
- In a batched delete scenario, Transaction 2 gets the rows with READPAST and so only gets B
- Transaction 2 deletes row B and commits, and continues the loop
- Transaction 3 gets the rows with READPAST and gets no rows (A is still locked)
- Transaction 3 deletes nothing and exists the loop
- Transaction 1 unlocks row A
- Table now has 1 row: A, which should have been deleted, but it's not
There is a way to solve this: SELECT with NOLOCK and DELETE with READPAST
- this will allow to always select even locked and uncommitted rows
- this will only delete rows that are not locked
- this will never deadlock, but will loop forever as long as some rows remain locked
One more gotcha is that READPAST allows for a NOWAIT syntax, which says to immediately ignore locked rows, without waiting for a number of seconds (specified by LOCK_TIMEOUT) to see if it unlocks. Since you are doing a loop, it would be wise to wait, so that it doesn't go into a rapid loop while some rows are locked. Barring that, you might want to use READPAST NOWAIT and then add a WAITFOR DELAY '00:00:00.010' at the end of the loop to add 10 millisecond delay, but if you have a lot of rows to delete, it might make this too slow.
Enough of this, lets see some code example:
DECLARE @batchSize INT = 1000
DECLARE @nrRows INT = 1
CREATE TABLE #temp (Id INT PRIMARY KEY)
WHILE (@nrRows>0)
BEGIN
BEGIN TRAN
INSERT INTO #temp
SELECT TOP (@batchSize) Id
FROM MyTable WITH (NOLOCK)
WHERE Condition=1
SET @nrRows = @@ROWCOUNT
DELETE FROM mt
FROM MyTable mt WITH (READPAST NOWAIT)
INNER JOIN #temp t
ON mt.Id=t.Id
WAITFOR DELAY '00:00:00.010'
COMMIT TRAN
END
DROP TABLE #temp
Now the scenario goes like this:
- Table has 2 rows: A and B
- Transaction 1 locks row A
- Transaction 2 gets the rows with NOLOCK and so only gets A and B
- Transaction 2 deletes rows A and B with READPAST, but only B is deleted
- loop continues (2 rows selected)
- Transaction 3 gets the rows with NOLOCK and gets one row
- Transaction 3 deletes with READPAST with no effect (A is still locked)
- loop continues (1 rows selected)
- Transaction 1 unlocks row A
- Transaction 4 gets the rows with NOLOCK and gets row A (not locked)
- Transaction 4 deleted with READPAST and deletes row A
- loop continues (1 rows selected), but next transaction selects nothing, so loop ends (0 rows selected)
- Table now has no rows and no deadlock occurred
Hope this helps.
Comments
Be the first to post a comment