Transactions and Cursors

PostgreSQL uses a multi-version approach to transactions within the database. A transaction is a formal term for a SQL statement's effects being synchronized with the "current" data in the database. This doesn't necessarily mean that the data is written to disk, but it becomes part of the "current" set of information stored in the database. When a statement's results have effectively been processed in the current state of the database, the transaction is considered to be committed.

The issue of two users attempting to commit changes to the same database object is obviously a potential concern, as their modifications may be exclusive to one another. Some relational database systems rely on automatic locking to prevent such conflicts.

Locking is a mechanism that disallows selecting from a database object while it is being modified, and vice versa. Locking presents several obvious performance concerns. For example, data which is being updated will not be selectable until the update transaction has completed.

PostgreSQL's Multi-Version Concurrency Control (MVCC), however, allows for SQL statements to be performed within transaction-deferred blocks. This means that each connection to PostgreSQL essentially maintains a temporary snapshot of the database for objects modified within a transaction block, before the modifications are committed.

Without explicitly opening a transaction block, all SQL statements issued to PostgreSQL are auto-committed, meaning that the database is synchronized with the results of the statement immediately upon execution. When a transaction block is used, however, changes made to the database will not be visible to other users until the block is committed. This allows for several changes to various objects within a database to be made tentatively. They can then be either committed all at once, or rolled back.

Rolling back a transaction returns the state of any affected objects to the condition they were in before the transaction block began. This can be useful when recovering from a partially failed operation, in that any modifications made part-way into a process can be undone. Rolled back transactions are never actually committed; while the process appears to undo modifications to the user who performed the rollback, other users connected to the same database never know the difference.

PostgreSQL also supports cursors, which are flexible references to fully executed SQL queries. A cursor is able to traverse up and down a result set, and only retrieve those rows which are explicitly requested. Used properly, a cursor can aid an application in efficiently use a static result set. A cursor may only be executed within a transaction block.

The following sections cover the basic use of transactions and cursors. They show how to begin, commit, and roll back transactions, and also how to declare, move, and fetch data from a cursor.

Using Transaction Blocks

Transaction blocks are explicitly started with the BEGIN SQL command. This keyword may optionally be followed by either of the noise terms WORK or TRANSACTION, though they have no effect on the statement, or the transaction block.

Example 7-38 begins a transaction block within the booktown database.

Example 7-38. Beginning a transaction

booktown=# BEGIN;
BEGIN

Any SQL statement made after the BEGIN SQL command will appear to take effect as normal to the user making the modifications. As stated earlier, however, other users connected to the database will be oblivious to the modifications that appear to have been made from within your transaction block until it is committed.

Transaction blocks are closed with the COMMIT SQL command, which may be followed by either of the optional noise terms WORK or TRANSACTION. Example 7-39 uses the COMMIT SQL command to synchronize the database system with the result of an UPDATE statement.

Example 7-39. Committing a transaction

booktown=# BEGIN;
BEGIN
booktown=# UPDATE subjects SET location = NULL
booktown-#                 WHERE id = 12;
UPDATE 1
booktown=# SELECT location FROM subjects WHERE id = 12;
 location
----------

(1 row)

booktown=# COMMIT;
COMMIT

Again, even though the SELECT statement immediately reflects the result of the UPDATE statement in Example 7-39, other users connected to the same database will not be aware of that modification until after the COMMIT statement is executed.

To roll back a transaction, the ROLLBACK SQL command is used. Again, either of the optional noise terms WORK or TRANSACTION may follow the ROLLBACK command.

Example 7-40 begins a transaction block, makes a modification to the subjects table, and verifies the modification within the block. The transaction is then rolled back, returning the subjects table to the state that it was in before the transaction block began.

Example 7-40. Rolling back a transaction

booktown=# BEGIN;
BEGIN
booktown=# SELECT * FROM subjects WHERE id = 12;
 id | subject  | location
----+----------+----------
 12 | Religion |
(1 row)

booktown=# UPDATE subjects SET location = 'Sunset Dr'
booktown-#                 WHERE id = 12;
UPDATE 1
booktown=# SELECT * FROM subjects WHERE id = 12;
 id | subject  | location
----+----------+-----------
 12 | Religion | Sunset Dr
(1 row)

booktown=# ROLLBACK;
ROLLBACK
booktown=# SELECT * FROM subjects WHERE id = 12;
 id | subject  | location
----+----------+----------
 12 | Religion |
(1 row)

PostgreSQL is very strict about errors in SQL statements inside of transaction blocks. Even an innocuous parse error, such as that shown in Example 7-41, will cause the transaction to enter into the ABORT STATE. This means that no further statements may be executed until either the COMMIT or ROLLBACK command is used to end the transaction block.

Example 7-41. Recovering from the abort state

booktown=# BEGIN;
BEGIN
booktown=# SELECT * FROM;
ERROR:  parser: parse error at or near ";"
booktown=# SELECT * FROM books;
NOTICE:  current transaction is aborted, queries ignored until end of transaction block
*ABORT STATE*
booktown=# COMMIT;

Using Cursors

A SQL cursor in PostgreSQL is a read-only pointer to a fully executed SELECT statement's result set. Cursors are typically used within applications that maintain a persistent connection to the PostgreSQL backend. By executing a cursor, and maintaining a reference to its returned result set, an application can more efficiently manage which rows to retrieve from a result set at different times, without having to re-execute the query with different LIMIT and OFFSET clauses.

Used within a programming Application Programming Interface (API), cursors are often used to allow multiple queries to be executed to a single database backend, which are then tracked and managed separately by the application through references to the cursor. This prevents having to store all of the results in memory within the application.

Cursors are often abstracted within a programming API (such as libpq++'s PgCursor class), though they can also be directly created and manipulated through standard SQL commands. For the sake of generality, this section uses psql to demonstrate the fundamental concepts of cursors with SQL. The four SQL commands involved with PostgreSQL cursors are DECLARE, FETCH, MOVE and CLOSE.

The DECLARE command both defines and opens a cursor, in effect defining the cursor in memory, and then populating the cursor with information about the result set returned from the executed query. The FETCH command lets you pull rows from an open cursor. The MOVE command moves the "current" location of the cursor within the result set, and the CLOSE command closes the cursor, freeing up any associated memory.

Note: If you are interested in learning how to use cursors within a particular API, consult that API's documentation.

Declaring a cursor

A cursor is both created and executed with the DECLARE SQL command. This process is also referred to as "opening" a cursor. A cursor may be declared only within an existing transaction block, so you must execute a BEGIN command prior to declaring a cursor. Here is the syntax for DECLARE:

  DECLARE cursorname [ BINARY ] [ INSENSITIVE ] [ SCROLL ]
                     CURSOR FOR query
                     [ FOR { READ ONLY | UPDATE [ OF column [, ...] ] } ]

DECLARE cursorname

cursorname is the name of the cursor to create.

[ BINARY ]

The optional BINARY keyword causes output to be retrieved in binary format instead of standard ASCII; this can be more efficient, though it is only relevant to custom applications, as clients such as psql are not built to handle anything but text output.

[ INSENSITIVE ] [ SCROLL ]

The INSENSITIVE and SCROLL keywords exist for compliance with the SQL standard, though they each define PostgreSQL's default behavior and are never necessary. The INSENSITIVE SQL keyword exists to ensure that all data retrieved from the cursor remains unchanged from other cursors or connections. Since PostgreSQL requires that cursors be defined within transaction blocks, this behavior is already implied. The SCROLL SQL keyword exists to specify that multiple rows can be selected at a time from the cursor. This is the default in PostgreSQL, even if unspecified.

CURSOR FOR query

query is the complete query whose result set will be accessible by the cursor, when executed.

[ FOR { READ ONLY | UPDATE [ OF column [, ...] ] } ]

As of PostgreSQL 7.1.x, cursors may only be defined as READ ONLY, and the FOR clause is therefore superfluous.

Example 7-42 begins a transaction block with the BEGIN keyword, and opens a cursor named all_books with SELECT * FROM books as its executed SQL statement.

Example 7-42. Declaring a cursor

booktown=# BEGIN;
BEGIN
booktown=# DECLARE all_books CURSOR
booktown-#         FOR SELECT * FROM books;
SELECT

The SELECT message returned from Example 7-42 indicates that the statement was executed successfully. This means that the rows retrieved by the query are now accessible from the all_books cursor.

Fetching from a cursor

You may retrieve rows from a cursor with the FETCH SQL command. Here is the syntax for the FETCH SQL command:

  FETCH [ FORWARD | BACKWARD | RELATIVE ]
        [ # | ALL | NEXT | PRIOR ]
        { IN | FROM } cursor

In this syntax diagram, cursor is the name of the cursor from which to retrieve row data. A cursor always points to a "current" position in the executed statement's result set, and rows can be retrieved either ahead or behind of the current location. The FORWARD and BACKWARD keywords may be used to specify the direction, though the default is forward. The RELATIVE keyword is a noise term made available for SQL92 compliance.

Warning

The ABSOLUTE keyword can be used, but absolute cursor positioning and fetching are not supported as of PostgreSQL 7.1.x; the cursor will still use relative positioning and provide a notice regarding the state of absolute positioning being unsupported.

Following the direction you may optionally specify a quantity. This quantity may either be a literal number of rows to be returned (in the form of an integer constant) or one of several keywords. The ALL keyword causes returns all rows from the current cursor position. The NEXT keyword (the default) returns the next single row from the current cursor position. The PRIOR keyword causes the single row preceding the current cursor position to be returned.

There is no functional difference between the IN and FROM keywords, but one of these must be specified.

Example 7-43 fetches the first four rows stored in the result set pointed to by the all_books cursor. As a direction is not specified, FORWARD is implied. It then uses a FETCH statement with the NEXT keyword to select the fifth row, and then another FETCH statement with the PRIOR keyword to again select the fourth retrieved row.

Example 7-43. Fetching rows from a cursor

booktown=# FETCH 4 FROM all_books;
  id  |         title         | author_id | subject_id
------+-----------------------+-----------+------------
 7808 | The Shining           |      4156 |          9
 4513 | Dune                  |      1866 |         15
 4267 | 2001: A Space Odyssey |      2001 |         15
 1608 | The Cat in the Hat    |      1809 |          2
(4 rows)

booktown=# FETCH NEXT FROM all_books;
  id  |            title            | author_id | subject_id
------+-----------------------------+-----------+------------
 1590 | Bartholomew and the Oobleck |      1809 |          2
(1 row)

booktown=# FETCH PRIOR FROM all_books;
  id  |       title        | author_id | subject_id
------+--------------------+-----------+------------
 1608 | The Cat in the Hat |      1809 |          2
(1 row)

Moving a cursor

A cursor maintains a position in the result set of its referenced SELECT statement. You can use the MOVE command to move the cursor to a specified row position in that result set. Here is the syntax for the MOVE command:

  MOVE [ FORWARD | BACKWARD | RELATIVE ]
       [ # | ALL | NEXT | PRIOR ]
       { IN | FROM } cursor

As you can see, the syntax is very similar to FETCH. However, the MOVE command does not retrieve any rows and only moves the current position of the specified cursor. The amount is specified by either an integer constant, the ALL keyword (to move as far as can be moved in the specified direction), NEXT, or PRIOR. Example 7-44 moves the cursor forward 10 rows from its current position in the result set.

Example 7-44. Moving a cursor

booktown=# MOVE FORWARD 10
booktown-#      IN all_books;
MOVE

Closing a cursor

Use the CLOSE command to explicitly close an open cursor. A cursor is also implicitly closed if the transaction block that it resides within is committed with the COMMIT command, or rolled back with the ROLLBACK command.

Here is the syntax for CLOSE, where cursorname is the name of the cursor intended to be closed:

  CLOSE cursorname

Example 7-45 closes the all_books cursor, freeing the associated memory, and rendering the cursor's results inaccessible.

Example 7-45. Closing a cursor

booktown=# CLOSE all_books;
CLOSE
booktown=# COMMIT;
COMMIT