Advanced Table Techniques

PostgreSQL provides several ways to constrain values inserted and updated within tables. One of these is the availability of table and column constraints.

PostgreSQL also supports an advanced object-relational table concept called inheritance. Inheritance allows separate tables to have an explicit parent-child relationship and, through this relationship, share columns in a hierarchical fashion.

The following sections document both types of SQL constraints, as well as the creation and application of inherited tables.

Using Constraints

Constraints are table attributes used to define rules on the type of data values allowed within specified columns. By enforcing these rules within the database, you can effectively safeguard against incorrect or inappropriate data being added to the database.

When you create a table, you can create a constraint using the CREATE TABLE command's CONSTRAINT clause. There are two types of constraints: column constraints and table constraints.

Column constraints apply only to a single column, while table constraints may apply to one or more columns. Within the CREATE TABLE command, the syntax for a column constraint follows immediately after a column definition, whereas the syntax for a table constraint exists in its own block, separated by a comma from any existing column definitions. A table constraint relies on its definition, rather than placement in the syntax, to indicate the columns affected by its restrictions.

The following sections discuss the different rules a constraint can enforce.

Column constraints

Performing the \h CREATE TABLE slash command within psql displays several detailed syntax diagrams for the constraints that may be placed on a table. Here is the syntax for a column constraint:

  [ CONSTRAINT constraint_name ]
  { NOT NULL | UNIQUE | PRIMARY KEY | DEFAULT value | CHECK ( condition ) |
    REFERENCES table [ ( column ) ]
               [ MATCH FULL | MATCH PARTIAL ]
               [ ON DELETE action ]
               [ ON UPDATE action ]
               [ DEFERRABLE | NOT DEFERRABLE ]
               [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
  }

This syntax immediately follows the data type of the column to be constrained (and precedes any commas separating it from other columns) in the CREATE TABLE statement. It may be used with as many columns as is necessary. Notice that the CONSTRAINT keyword and constraint_name identifier are optional, and may be omitted.

There are six sets of column constraint keywords that may be applied. Some of the effects of these constraints are implicitly defined by others. The constraint keywords are as follows:

NOT NULL

Specifies that the column is not allowed to contain NULL values. Using the constraint CHECK (column NOT NULL) is equivalent to using the NOT NULL constraint.

UNIQUE

Specifies that the same value may not be inserted in this column twice. Note that the UNIQUE constraint allows more than one NULL value to be in a column, as NULL values technically never match another value.

PRIMARY KEY

Implies both UNIQUE and NOT NULL constraints, and causes an index to be created on the column. A table is restricted to having only one primary key constraint.

DEFAULT value

Causes unspecified input values to be replaced with a default value of value. This value must be of the same data type as the column it applies to. PostgreSQL 7.1.x does not support subselects as default values.

CHECK condition

Defines a condition that the value must satisfy for an INSERT or UPDATE operation to succeed on that row. The condition is an expression that returns a Boolean result. Used as a column constraint, only the one column being defined can be referenced by the CHECK clause.

The sixth column constraint, REFERENCES, contains the following clauses:

REFERENCES table [ ( column ) ]

Input values to the constrained column are checked against the values of the column column within the table table. If a matching value on this column is not found in the column that it references, the INSERT or UPDATE will fail. If column is omitted, the primary key on table is used, if one exists.

This column constraint is similar to the FOREIGN KEY table discussed in the next section. Functionally, the REFERENCES column constraint is very similar to a FOREIGN KEY column constraint.

See Example 7-8 for an example of a table being created with a FOREIGN KEY table constraint.

MATCH FULL | MATCH PARTIAL

The MATCH clause affects what kind of NULL and non-NULL values are allowed to be mixed on insertion into a table whose foreign key references multiple columns. The MATCH clause is therefore only practically applicable to table constraints, though the syntax is technically valid in a column constraint as well.

MATCH FULL disallows insertion of row data whose columns contain NULL values unless all referenced columns are NULL. As of PostgreSQL 7.1.x, MATCH PARTIAL is not supported. Not specifying either clause allows NULL columns to satisfy the constraint.

Again, as column constraints may only be placed on a single column, this clause is only directly applicable to table constraints.

ON DELETE action

When a DELETE is executed on a referenced row in the referenced table, one of the following actions will be executed upon the constrained column, as specified by action:

NO ACTION

Produces an error if the reference is violated. This is the default if action is not specified.

RESTRICT

Identical to NO ACTION.

CASCADE

Removes all rows which reference the deleted row.

SET NULL

Assigns a NULL value to all referenced column values.

SET DEFAULT

Sets all referenced columns to their default values.

ON UPDATE action

When an UPDATE statement is performed on a referenced row in the referenced table, the same actions are available as with the ON DELETE clause. The default action is also NO ACTION.

Specifying CASCADE as the ON UPDATE action updates all of the rows that reference the updated row with the new value (rather than deleting them, as would be the case with ON DELETE CASCADE).

DEFERRABLE | NOT DEFERRABLE

DEFERRABLE gives you the option of postponing enforcement of the constraint to the end of a transaction rather than having it enforced at the end of each statement. Use the INITIALLY clause to specify the initial point at which the constraint will be enforced.

NOT DEFERRABLE means the enforcement of the constraint must always be done immediately as each statement is executed. Users do not have the option to defer enforcement to the end of a transaction when this is set. This is the default.

INITIALLY DEFERRED | INITIALLY IMMEDIATE

The constraint must be DEFERRABLE in order to specify the INITIALLY clause. INITIALLY DEFERRED causes enforcement to be postponed until the end of the transaction is reached, whereas INITIALLY IMMEDIATE causes constraint checking to be performed after each statement. INITIALLY IMMEDIATE is the default when the INITIALLY clause is not specified.

Example 7-7 shows how to create a table called employees with a variety of simple constraints.

Example 7-7. Creating a table with column constraints

booktown=# CREATE TABLE employees
booktown-#             (id integer PRIMARY KEY CHECK (id > 100),
booktown(#              last_name text NOT NULL,
booktown(#              first_name text);
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'employees_pkey'
for table 'employees'
CREATE

Example 7-7 creates a column called id, of type integer which has both a PRIMARY KEY constraint and a CHECK constraint. The PRIMARY KEY constraint implies both NOT NULL and UNIQUE, as well as implicitly creates the employees_ pkey index to be used on the column. The CHECK constraint verifies that the value of id is greater than 100. This means that any attempt to INSERT or UPDATE row data for the employees table with an id value of less-than or equal-to 100 will fail.

The employees table created in Example 7-7 also contains a column named last_name of type text which has a NOT NULL constraint enforced. This is a much simpler constraint; it disallows the addition of employees whose last name values are input as NULL. In other words, users must supply a last name for each employee.

Note: Conditions set with the CHECK clause must involve values of comparable data types.

Table constraints

Unlike column constraints, a table constraint can be defined on more than one column of a table. Here is the syntax to create a table constraint:

  [ CONSTRAINT constraint_name ]
  { UNIQUE ( column [, ...] ) |
    PRIMARY KEY ( column [, ...] ) |
    CHECK ( condition ) |
    FOREIGN KEY ( column [, ... ] )
                REFERENCES table [ ( column [, ... ] ) ]
                [ MATCH FULL | MATCH PARTIAL ]
                [ ON DELETE action ]
                [ ON UPDATE action ]
                [ DEFERRABLE | NOT DEFERRABLE ]
                [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

CONSTRAINT constraint_name provides an optional name for the constraint. Naming a constraint is recommended, as it provides you with a meaningful name for the purpose of the constraint, rather than an automatically generated, generic name. In the future, this name also may be useful in removing constraints (e.g., when PostgreSQL's DROP CONSTRAINT clause of the ALTER TABLE implemented). The other clauses define four general types of table constraints:

PRIMARY KEY ( column [, ...] )

The PRIMARY KEY table constraint is similar to the PRIMARY KEY column constraint. As a table constraint, PRIMARY KEY allows multiple columns to be defined in a parenthetical expression, separated by commas. An implicit index will be created across columns. The combination of values for each column specified must therefore amount to only unique and non-NULL values, as with the PRIMARY KEY column constraint.

UNIQUE ( column [, ...] )

Specifies that the combination of values for the columns listed in the expression following the UNIQUE keyword may not amount to duplicate values. NULL values are allowed more than once, as NULL technically never matches any other value.

CHECK ( condition )

Defines a condition that incoming row data must satisfy for an INSERT or UPDATE operation to succeed. The condition is an expression that returns a Boolean result. Used as a table constraint, more than one column can be referenced by the CHECK clause.

FOREIGN KEY ( column [, ... ] ) REFERENCES table [ ( column [, ... ] ) ]

Allows multiple columns to be specified as the source for the REFERENCES clause. The syntax following the FOREIGN KEY clause and its specified columns is identical to that of the column REFERENCES constraint.

Example 7-8 creates the Book Town editions table. It creates three table constraints. A detailed explanation follows the example.

Example 7-8. Creating a table with table constraints

booktown=# CREATE TABLE editions
booktown-#              (isbn text,
booktown(#              book_id integer,
booktown(#              edition integer,
booktown(#              publisher_id integer,
booktown(#              publication date,
booktown(#              type char,
booktown(#              CONSTRAINT pkey PRIMARY KEY (isbn),
booktown(#              CONSTRAINT integrity CHECK (book_id IS NOT NULL
booktown(#                                          AND edition IS NOT NULL),
booktown(#              CONSTRAINT book_exists FOREIGN KEY (book_id)
booktown(#                         REFERENCES books (id)
booktown(#                         ON DELETE CASCADE
booktown(#                         ON UPDATE CASCADE);
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'pkey' for table
'editions'
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
CREATE

The first constraint, pkey is a PRIMARY KEY constraint on the isbn column, and behaves identically to a PRIMARY KEY column constraint (because only one column target is supplied).

The constraint named integrity uses the CHECK clause to ensure that neither the book_id nor edition columns ever contain NULL values.

Finally, the book_exists constraint uses the FOREIGN KEY and REFERENCES clauses to verify that the book_id value always exists within the books table in the id column. Furthermore, since the CASCADE keyword is supplied for both the ON DELETE and ON UPDATE clauses, any modifications to the id column in the books table will also be made to the corresponding rows of the editions table, and any deletions from books will result in corresponding rows being deleted from editions.

Notice that both an implicit index named editions_pkey on the isbn column and an implicit trigger are created from these table constraints. The implicit index is used in the enforcement of the PRIMARY KEY constraint. The implicit trigger enforces the FOREIGN KEY constraint.

Adding a constraint

The ALTER TABLE command is intended to allow the addition of table constraints to existing tables. As of PostgreSQL 7.1.x, however, only the addition of CHECK and FOREIGN KEY constraints is supported.

Here is the syntax to add a constraint with ALTER TABLE:

  ALTER TABLE table
        ADD [ CONSTRAINT name ]
        { CHECK ( condition ) |
          FOREIGN KEY ( column [, ... ] )
                      REFERENCES table [ ( column [, ... ] ) ]
                      [ MATCH FULL | MATCH PARTIAL ]
                      [ ON DELETE action ]
                      [ ON UPDATE action ]
                      [ DEFERRABLE | NOT DEFERRABLE ]
                      [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
        }

Example 7-9 adds a FOREIGN KEY constraint to the Book Town books table's subject_id column. This constraint references the id column within the subjects table, and will ensure that no values are inserted or modified on the subject_id column with a value which cannot be found in the subjects table's id column.

Example 7-9. Adding a constraint to an existing table

booktown=# ALTER TABLE books ADD CONSTRAINT legal_subjects
booktown-#                   FOREIGN KEY (subject_id)
booktown-#                   REFERENCES subjects (id);
NOTICE:  ALTER TABLE ... ADD CONSTRAINT will create implicit trigger(s) for
FOREIGN KEY check(s)
CREATE

Removing a constraint

As of PostgreSQL 7.1.x, constraints may not be directly removed from a table. The only way to achieve the net effect of dropping a constraint is to create a copy of the table which is nearly identical to the original, with any unwanted constraints omitted. The data can then be copied from the original table to the new table, and the tables renamed using the ALTER TABLE command so that the new copy replaces the original table.

Warning

Be aware of who is connected to, and accessing, any tables that you wish to restructure with this work-around. Data should not be inserted or modified at any time in the middle of the operation; therefore, you may need to temporarily disallow connection to the database if it is a highly active table, make the modifications, and finally, restart the system when finished.

Example 7-10 demonstrates this work-around method for removing a constraint by effectively removing the legal_subjects FOREIGN KEY constraint from the books table (see Example 7-9). Notice that the books_id_pkey is removed before the new table is created, so that the new table can be created with an index named books_id_ pkey. This is not necessary, but for the sake of consistency we want to keep the primary key index name the same.

Example 7-10. Removing a constraint

booktown=# DROP INDEX books_id_pkey;
DROP
booktown=# CREATE TABLE new_books
booktown-#              (id integer CONSTRAINT books_id_pkey PRIMARY KEY,
booktown(#              title text NOT NULL,
booktown(#              author_id integer,
booktown(#              subject_id integer);
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'books_id_pkey'
for table 'new_books'
CREATE
booktown=# INSERT INTO new_books SELECT * FROM books;
INSERT 0 15
booktown=# ALTER TABLE books RENAME TO old_books;
ALTER
booktown=# ALTER TABLE new_books RENAME TO books;
ALTER

Inheritance

PostgreSQL supports an advanced object-relational mechanism known as inheritance. Inheritance allows a table to inherit some of its column attributes from one or more other tables, creating a parent-child relationship. This causes the child table to have each of the same columns and constraints as its inherited table (or tables), as well as its own defined columns.

When performing a query on an inherited table, the query can be instructed to retrieve either all rows of a table and its descendants, or just the rows in the parent table itself. The child table, on the other hand, will never return rows from its parent.

Creating a child table

A child table is created with the CREATE TABLE SQL command by using the INHERITS clause. This clause consists of the INHERITS keyword, and the name of the table (or tables) from which to inherit.

Here is the portion of the CREATE TABLE syntax which applies to inheritance:

  CREATE TABLE childtable definition
         INHERITS ( parenttable [, ...] )

In this syntax, childtable is the name of the new table to be created, definition is the complete table definition (with all the ordinary CREATE TABLE clauses), and parenttable is the table whose column structure is to be inherited. More than one parent table may be specified by separating table names with commas.

Example 7-11 creates a table called distinguished_authors with a single column named award of type text. Since it is instructed to inherit from the authors table by the INHERITS clause it actually is created with four columns; the first three from authors, and the fourth awards column.

Example 7-11. Creating a child table

booktown=# CREATE TABLE distinguished_authors (award text)
booktown-#              INHERITS (authors);
CREATE
booktown=# \d distinguished_authors
  Table "distinguished_authors"
 Attribute  |  Type   | Modifier
------------+---------+----------
 id         | integer | not null
 last_name  | text    |
 first_name | text    |
 award      | text    |

As you can see, even though Example 7-11 specified only one column, the distinguished_authors table inherited all of the columns that were originally in the authors table.

Using inherited tables

The relationship between the shared columns of a parent and child table is not purely cosmetic. Inserted values on the distinguished_authors table will also be visible in the authors table, its parent. However, in the authors table, you will only see the three columns which were inherited. When querying a parent table, you can use the ONLY keyword to specify that rows from child tables are to be omitted from the query results.

Note: Parent rows are never visible within a query on one of its child tables. Therefore, using the ONLY keyword on a child table would only have an effect if that child table were also inherited by another table, making it effectively both a parent and a child.

Example 7-12 inserts a new author named Neil Simon with the award of Pulitzer Prize into the distinguished_authors table. Notice that the first three inserted values are shared between the parent and child tables.

Example 7-12. Inserting into a child table

booktown=# INSERT INTO distinguished_authors
booktown-#             VALUES (nextval('author_ids'),
booktown(#             'Simon', 'Neil', 'Pulitzer Prize');
INSERT 3629421 1

Since the first three columns of the distinguished_authors table are inherited from the authors table, this author will also appear implicitly as a regular author in the authors table (though the data is not literally inserted into the authors table). Only the distinguished_authors table will show information about awards, however, as inheritance only works one way (descending from parent to child).

Example 7-13 executes three SELECT statements. Each of these queries chooses a different target for the FROM clause, while using the same search criteria in the WHERE clause.

Example 7-13. Selecting with inheritance

booktown=# SELECT * FROM distinguished_authors
booktown-#          WHERE last_name = 'Simon';
  id   | last_name | first_name |     award
-------+-----------+------------+----------------
 25043 | Simon     | Neil       | Pulitzer Prize
(1 row)

booktown=# SELECT * FROM authors WHERE last_name = 'Simon';
  id   | last_name | first_name
-------+-----------+------------
 25043 | Simon     | Neil
(1 row)

booktown=# SELECT * FROM ONLY authors WHERE last_name = 'Simon';
 id | last_name | first_name
----+-----------+------------
(0 rows)

Each of the three queries in Example 7-13 look for rows where the last_name column matches the string constant Simon. The first query selects from the distinguished_authors table, which the data was originally inserted into (in Example 7-12), and the requested row is returned.

The second query in Example 7-13 selects from the parent of distinguished_authors, which is the authors table. Again, a row is retrieved, though this row includes only the columns which are inherited by the distinguished_authors.

It is important to understand that this data was not literally inserted into both tables, but simply made visible because of the inheritance relationship. This is illustrated by the third and final query in Example 7-13, which prefixes the authors table name with the ONLY keyword. This keyword indicates that rows are not to be received from child tables, but only from the specified parent; as a result, no rows are returned by the query.

Some constraints may appear to be violated because of the nature of inherited tables. For example, a column with a UNIQUE constraint placed on it may appear to have the same value twice by including data from inherited children. Make careful use of constraints and inheritance, as a child table does not literally violate such a constraint, though it can appear to if the ONLY keyword is not used when selecting from the parent table.

Modifying inherited tables

As covered in the preceding section, adding values into child and parent tables is fairly straightforward. An insertion of values into a child table will cause values in inherited columns to appear as values in the parent table, though the data itself physically resides in the child table. Insertion of values into a parent table has no effect whatsoever on the child table.

Likewise, modifying values in a child table is self-explanatory: only the values in the child table are modified, while any values literally in the parent table are unmodified. This is because the data is not literally shared between tables, but can only be viewed through the hierarchy. A retrieval of rows on the parent table without the ONLY clause will still show both the parent rows, and the modified child rows.

The effect of modifying existing rows in a parent table is less obvious than the effect of modifying existing rows in a child table. UPDATE and DELETE statements executed on a parent table will, by default, affect not only rows in the parent table, but also any child tables that match the criteria of the statement. Example 7-14 performs an UPDATE statement on the authors table. Notice that the row data in the distinguished_authors table is actually affected by this statement.

Example 7-14. Modifying parent and child tables

booktown=# UPDATE authors SET first_name = 'Paul'
booktown-#                WHERE last_name = 'Simon';
UPDATE 1
booktown=# SELECT * FROM distinguished_authors;
  id   | last_name | first_name |     award
-------+-----------+------------+----------------
 25043 | Simon     | Paul       | Pulitzer Prize
(1 row)

The ONLY keyword can be used with UPDATE and DELETE in a fashion similar to its use with the SELECT command in order to prevent the type of cascading modification illustrated in Example 7-14. The ONLY keyword should always precede the inherited table name in the SQL syntax.

Example 7-15 demonstrates the use of the ONLY keyword. First, the example inserts a new row for Dr. Seuss into the distinguished_authors table, along with a reference to his Pulitzer Prize. This results in the authors table appearing to have two separate entries for the same author. The old entry (that exists physically in the authors table) is then removed by use of the DELETE SQL command combined with the ONLY keyword.

Example 7-15. Modifying parent tables with ONLY

booktown=# INSERT INTO distinguished_authors
booktown-#             VALUES (1809, 'Geisel', 
booktown(#             'Theodor Seuss', 'Pulitzer Prize');
INSERT 3629488 1
booktown=# SELECT * FROM authors
booktown-#          WHERE last_name = 'Geisel';
  id  | last_name |  first_name
------+-----------+---------------
 1809 | Geisel    | Theodor Seuss
 1809 | Geisel    | Theodor Seuss
(2 rows)

booktown=# DELETE FROM ONLY authors 
booktown-#        WHERE last_name = 'Geisel';
DELETE 1

The end result of Example 7-15 is that the record for Dr. Seuss is added to the distinguished_authors table, and subsequently removed from the authors table, as follows:

booktown=# SELECT * FROM authors 
booktown-#          WHERE last_name = 'Geisel';
  id  | last_name |  first_name
------+-----------+---------------
 1809 | Geisel    | Theodor Seuss
(1 row)

booktown=# SELECT * FROM distinguished_authors 
booktown-#          WHERE last_name = 'Geisel';
  id  | last_name |  first_name   |     award
------+-----------+---------------+----------------
 1809 | Geisel    | Theodor Seuss | Pulitzer Prize
(1 row)