PostgreSQL: performance impact of extra columns


5

Given a large table (10-100 million rows), what's the best way to add some extra (unindexed) columns to it?

  1. Just add the columns.
  2. Create a separate table for each extra column, and use joins when you want to access the extra values.

Does the answer change depending on whether the extra columns are dense (mostly not null) or sparse (mostly null)?

2012-04-04 23:14
by Daniel Winterstein


16

In most cases, a column holding a NULL value can be added to a row without any changes to the rest of the data page: only one bit has to be set in the row's NULL bitmask. So, yes, a sparse column is much cheaper to add.
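To make the bitmask point concrete, here is a rough Python sketch of how the tuple header size depends on the number of columns. It assumes a typical 64-bit build: a 23-byte fixed header, one bitmask bit per column (present only if the row contains at least one NULL), and the total padded to a multiple of 8 bytes. The function name and constants are mine, for illustration only.

```python
FIXED_HEADER = 23   # assumed fixed part of the heap tuple header, in bytes
MAXALIGN = 8        # assumed alignment on a typical 64-bit platform


def header_size(n_columns: int, has_null: bool) -> int:
    """Estimate the heap tuple header size in bytes (hypothetical helper)."""
    size = FIXED_HEADER
    if has_null:
        # One bit per column, rounded up to whole bytes.
        size += (n_columns + 7) // 8
    # Pad up to the next multiple of MAXALIGN.
    return (size + MAXALIGN - 1) // MAXALIGN * MAXALIGN


for n in (3, 8, 9, 72):
    print(n, header_size(n, has_null=False), header_size(n, has_null=True))
```

Under these assumptions, the bitmask for up to 8 columns fits into the padding byte of the 24-byte header, so a NULL costs no extra space there. With 9 or more columns, the header of a row containing a NULL grows to 32 bytes.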

Whether a separate 1:1 table for the additional columns is a good idea depends very much on the use case. It is generally more expensive. For starters, there is an overhead of 28 bytes per row (24-byte heap tuple header plus 4-byte item pointer) and some additional overhead per table. Joining rows in a query is also much more expensive than reading them in one piece. And you need to add a primary / foreign key column plus an index on it. Splitting may be a good idea if most of your queries don't need the additional columns. Mostly, though, it is a bad idea.
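As a back-of-the-envelope sketch of that overhead, the following Python snippet compares the extra space for one dense 4-byte integer column stored inline against the same column in a separate 1:1 table. It only uses the 28-byte figure from above plus a 4-byte key column. It ignores the index on the key, per-table overhead, and page headers, and it assumes the inline column needs no extra alignment padding. All names are illustrative.

```python
ROW_OVERHEAD = 28   # heap tuple header plus item pointer, per the figure above
KEY_BYTES = 4       # assumed integer key column in the side table
VALUE_BYTES = 4     # assumed integer payload column


def extra_bytes_inline(rows: int) -> int:
    """Extra space when the column is simply added to the big table."""
    return rows * VALUE_BYTES


def extra_bytes_side_table(rows: int) -> int:
    """Extra space when every row gets a companion row in a 1:1 table."""
    return rows * (ROW_OVERHEAD + KEY_BYTES + VALUE_BYTES)


for rows in (10_000_000, 100_000_000):
    inline = extra_bytes_inline(rows)
    side = extra_bytes_side_table(rows)
    print(f"{rows:>11,} rows: inline {inline / 1e6:,.0f} MB, "
          f"side table {side / 1e6:,.0f} MB")
```

With these numbers the side table needs roughly nine times the space of the inline column, before counting its index. For a sparse column the side table only needs rows where a value exists, but the inline NULL is nearly free as well.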

Adding a column is fast in PostgreSQL. Updating the values in the column is what may be expensive, because every UPDATE writes a new row version (due to the MVCC model). Therefore, it is a good idea to update multiple columns in a single UPDATE rather than one column at a time.

See "Database Page Layout" in the manual.

How to calculate row sizes:
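
A rough sketch of such a calculation in Python, limited to fixed-width columns without NULLs. It assumes a typical 64-bit build (header padded to 24 bytes, tuples padded to multiples of 8 bytes, a 4-byte item pointer per row) and that each column is aligned to its own width, which holds for 2-, 4- and 8-byte integer types. The helper names are hypothetical.

```python
MAXALIGN = 8            # assumed alignment on a typical 64-bit platform
HEAP_TUPLE_HEADER = 23  # assumed fixed header size before padding
ITEM_POINTER = 4        # per-row pointer stored in the page


def align(offset: int, boundary: int) -> int:
    """Round offset up to the next multiple of boundary."""
    return -(-offset // boundary) * boundary


def row_size(column_widths: list[int]) -> int:
    """Estimate on-page bytes per row for fixed-width, non-NULL columns."""
    offset = align(HEAP_TUPLE_HEADER, MAXALIGN)   # header padded to 24
    for width in column_widths:
        offset = align(offset, width)             # pad to the column's alignment
        offset += width
    tuple_bytes = align(offset, MAXALIGN)         # tuple padded to 8-byte multiple
    return tuple_bytes + ITEM_POINTER


print(row_size([4, 4, 4]))
```

For three 4-byte integers this gives 24 + 12 = 36 bytes, padded to 40, plus the 4-byte item pointer: 44 bytes per row.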

2012-04-04 23:29
by Erwin Brandstetter
"there is an overhead of 28 bytes (heaptuple header plus item pointer) per row and some additional overhead per table" Just to confirm, does this mean that a through-table with three 4-byte integers (primary key + 2 foreign keys) would take 28+12 bytes per row? - dvtan 2016-11-27 16:39
@DavidTan: Actually, a total of 44 bytes per row: 24 + 4 + 3*4 + 4 bytes of alignment padding. I added links to a more detailed explanation above. - Erwin Brandstetter 2016-11-27 21:01