Imagine a database that tracks payments between customers.
Say I've got a Customer table:
Customer
----------
CustomerID
Name
and a Transaction table
Transaction
------------
BuyerID
SellerID
Amount
Where BuyerID and SellerID are both Foreign Key references to the CustomerID column of the Customer table.
In my current (analogous) situation, the Transaction table is large (500 Million Rows), even though the Customer table is small (2000 rows). Deleting a row from the Customer table, however, takes a very long time, because the database has to scan the Transaction table to see whether the Customer has any referencing Transactions (In fact, it has to do it twice - one to check for BuyerID and one for SellerID). The Transaction table is not Indexed on either BuyerID or SellerID (The real table is indexed on a combination of Buyer, Seller, and a few other columns)
I know I can drop all the foreign key constraints, delete the rows, and then re-add the constraints. Will that be any faster than just doing the DELETE FROM
with the foreign keys enabled? Are there any other ways to speed up the delete operation that I'm missing.
You should index BuyerID
and SellerID
in your Transaction
table...
For a further explanation as to why you should be indexing your foreign keys, read Kimberly Tripp's excellent article on the subject.
go ahead and add the extra two indexes. leave the constraints alone.
Generally if I have sales transactions for a customer, I don't want to ever delete them or the customer. This is exactly why you havea foreign key constraint, so you don't delete the customer. That is just messing with your financial reporting (Why did sales for 2011 suddenly go down 20%, oh we deleted some records from the database, not a good converasation to have.). What you want is to make a customer inactive not to delete them usually in this case.
Given that Ryan says this is a database in development and not in production where otehrs are likely to be entering data to it at the same time, I would drop the FKs, put the customer ids I intend to delete into a work table and then do the deletes to the parent and all child tables. This way you can delete from the 500 million row table in batches if you need to. Then when you are done put the FKs back on.
If I am working with 500 millions rows, first thing I will do is - remove foreign key reference and have centralize place like stored procedure to control delete functionality.