PostgreSQL/Partitioning
< PostgreSQLIf you have a table with a very huge amount of data, it may be helpfull to scatter the data to different physical tables with a common data structure. In use cases, where INSERTs and SELECTs concern only one of those tables and DELETEs concern really all rows of another table, you can get great performance benefits from partitioning. Typically this is the case, if there is any timeline for the rows.
Partitioning uses the INHERIT feature. First, you define a master table.
CREATE TABLE log ( id int not null, logdate date not null, message varchar(500) );
Next, you create the partitions with the same structure as the master and ensure, that only rows within the expected data range can be stored there.
CREATE TABLE log_2015_01 (CHECK (logdate >= DATE '2015-01-01' AND logdate < DATE '2015-02-01')) INHERITS (log); CREATE TABLE log_2015_02 (CHECK (logdate >= DATE '2015-02-01' AND logdate < DATE '2015-03-01')) INHERITS (log); ... CREATE TABLE log_2015_12 (CHECK (logdate >= DATE '2015-12-01' AND logdate < DATE '2016-01-01')) INHERITS (log); CREATE TABLE log_2016_01 (CHECK (logdate >= DATE '2016-01-01' AND logdate < DATE '2016-02-01')) INHERITS (log); ...
It's a good idea to create an index.
CREATE INDEX log_2015_01_idx ON log_2015_01 (logdate); CREATE INDEX log_2015_02_idx ON log_2015_02 (logdate); ...
We need a function, which transfers rows into the appropriate partition.
CREATE OR REPLACE FUNCTION log_ins_function() RETURNS TRIGGER AS $$ BEGIN IF (NEW.logdate >= DATE '2015-01-01' AND NEW.logdate < DATE '2015-02-01' ) THEN INSERT INTO log_2015_01 VALUES (NEW.*); ELSIF (NEW.logdate >= DATE '2015-02-01' AND NEW.logdate < DATE '2015-03-01' ) THEN INSERT INTO log_2015_02 VALUES (NEW.*); ELSIF ... ... END IF; RETURN NULL; END; $$ LANGUAGE plpgsql;
The function is called by a trigger.
CREATE TRIGGER log_ins_trigger BEFORE INSERT ON log FOR EACH ROW EXECUTE PROCEDURE log_ins_function();
When this is done, new rows mostly will go to the newest partition. And after some years you can drop old partitions as a whole. This dropping shall be done with the command DROP TABLE - not with a DELETE ... command. The DROP command is much faster than the DELETE command as it removes everything in one single step instead of touching every single row. For SELECT commands the query optimizer has the chance to avoid scanning unnecessary tables.