Chapter 9. Database Management

Table of Contents
Starting and Stopping PostgreSQL
Initializing the Filesystem
Creating and Removing a Database
Maintaining a Database
Backing Up and Restoring Data

This chapter covers several topics associated with managing a PostgreSQL database system, including starting and stopping the PostgreSQL backend, initializing the filesystem, and the creation, removal, and maintenance of individual databases. There is also a section devoted to the topic of backing up and restoring data from a database.

Starting and Stopping PostgreSQL

In this section we cover two options provided with PostgreSQL that are used to start and stop PostgreSQL. The first is a general purpose application called pg_ctl, which should function identically on any machine, regardless of the system. This script is intended to be run by the system user (e.g., the user who owns the data directory) configured to execute the postmaster backend.

The second script provided is the SysV-style script, found in the contrib/start-scripts subdirectory within the PostgreSQL source path. The installation of the SysV script is discussed in Chapter 2. By default this script is named linux, as it is intended for a Linux system's start-script directory, though in the installation instructions it is renamed to a script called postgresql in the system's service start-up directory (e.g., /etc/rc.d/init.d ).

The main functional difference between pg_ctl and the SysV-style service script is that pg_ctl is intended to be used by the user who runs the postmaster backend (e.g., postgres), whereas the service script is intended to be run by the root user.

The service script is not strictly Linux-specific, and should be compatible with most systems based on SysV start-up scripts. However, if you are not running Linux, you may prefer to stick with the pg_ctl script.

Using pg_ctl

The pg_ctl script is provided with PostgreSQL as a general control application. With it, you can start, stop, restart, or check on the status of PostgreSQL.

Here is the syntax for pg_ctl, from the -  -help option:

  pg_ctl start   [-w] [-D DATADIR ] [-s] [-l FILENAME ] [-o "OPTIONS"]
  pg_ctl stop    [-W] [-D DATADIR ] [-s] [-m SHUTDOWN-MODE ]
  pg_ctl restart [-w] [-D DATADIR ] [-s] [-m SHUTDOWN-MODE ] [-o "OPTIONS"]
  pg_ctl status  [-D DATADIR]

The following options may be passed to pg_ctl:

-w

Causes the pg_ctl application to wait until the operation has finished before returning to a command line. This option may be passed to either the start or restart action; by default, the application sends the command on to the postmaster and exits immediately for these actions.

-W

Causes the pg_ctl application not to wait until the operation has finished before returning to a command line. This option may only be passed to the stop action; by default, the application sends the stop command on to the postmaster, and waits for the action to finish before exiting.

-D DATADIR

Specifies the directory that contains the default database files. This is optional, because you may have this value already set in the PGDATA environment variable. If the PGDATA environment variable is not set, the -D flag is required.

-s

Suppresses any output from the pg_ctl application, aside from system errors. If this flag is not specified, information about the activity within the database (or specific information about startup or shutdown, depending on the action) will be displayed to the screen of the user who initiated the command.

-l FILENAME

Specifies a file FILENAME to append database activity to. This option is only available with the start action.

-m SHUTDOWN-MODE

Sets the SHUTDOWN-MODE with which to shut down the postmaster backend.

smart

Makes postmaster wait for all clients to disconnect before shutting down.

fast

Shuts postmaster down without waiting for clients to disconnect.

immediate

Shuts postmaster down more abruptly than fast mode, bypassing normal shutdown procedures. This mode causes the database to restart in recovery mode the next time it starts, which verifies the integrity of the system.

This option is of course only available to the stop and restart actions.

-o "OPTIONS"

Passes the options specified by OPTIONS (within double quotes) to be passed directly through to the postmaster (e.g., the -i flag to enable TCP/IP connections). See the Section called Calling postmaster Directly" later in this chapter for a complete list of these flags.

Note: Many of the run-time configuration options for postmaster can be found in the postgresql.conf file, which is stored in the PostgreSQL data path (e.g., /usr/local/pgsql/data). The options in this file are of a more technical nature, and should not be modified unless you are sure you understand their purpose.

Starting PostgreSQL with pg_ctl

To start PostgreSQL's postmaster backend, the start argument must be passed to pg_ctl. Remember that pg_ctl must be run by the postgres user (or whatever user you have configured to own the PostgreSQL data path).

Example 9-1 starts the postmaster backend, using the data path of /usr/local/pgsql/data. The database system starts up successfully, reports the last time the database system was shut down, and provides various debugging statements before returning the postgres user to a shell prompt.

Example 9-1. Starting PostgreSQL with pg_ctl

[postgres@booktown ~]$ pg_ctl -D /usr/local/pgsql/data start
postmaster successfully started
DEBUG:  database system was shut down at 2001-09-17 08:06:34 PDT
DEBUG:  CheckPoint record at (0, 1000524052)
DEBUG:  Redo record at (0, 1000524052); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 815832; NextOid: 3628113
DEBUG:  database system is in production state

[postgres@booktown ~]$

Stopping PostgreSQL with pg_ctl

The PostgreSQL postmaster backend can be stopped in the same fashion that it is started—by passing the stop argument to pg_ctl. The application pg_ctl checks for the running postmaster process, and, if the stop command was executed by the user who owns the running processes (e.g., postgres) the server is shut down.

There are three ways in which PostgreSQL can shut down the backend: smart, fast, and immediate. These arguments are passed to pg_ctl following the -m flag, to indicate the desired shutdown mode.

A smart shutdown (the default) causes PostgreSQL to wait for all clients to first cancel their connections before shutting down. A fast shutdown causes PostgreSQL to simply shut down through its normal routine, without checking client status. An immediate shutdown bypasses the normal shutdown procedure, and will require the system to go through a recovery mode when restarted.

Warning

Never use kill -9 (kill -KILL) on the postmaster process. This can result in lost or corrupted data.

Example 9-2 calls the pg_ctl script to stop the postmaster process in fast mode. The postmaster backend will not wait for any client connections to disconnect before shutting down.

Example 9-2. Stopping PostgreSQL with pg_ctl

[postgres@booktown ~]$ pg_ctl -D /usr/local/pgsql/data stop -m fast
Fast Shutdown request at Mon Sep 17 09:23:39 2001
DEBUG:  shutting down
waiting for postmaster to shut down.....
DEBUG:  database system is shut down
done
postmaster successfully shut down
[postgres@booktown ~]$

Note: The smart shutdown is equivalent to a kill -TERM on the running postmaster process, while fast is equivalent to a kill -INT, and immediate is equivalent to a kill -QUIT.

Restarting PostgreSQL with pg_ctl

You may pass the restart argument to pg_ctl as shorthand for sequential stop and start calls to pg_ctl. This argument may also specify the -m flag to indicate the preferred shutdown mode.

PostgreSQL stores the most recently used start-up options in a temporary file called postmaster.opts, within the PostgreSQL data path (PGDATA). This file is used when pg_ctl is invoked with the restart argument to ensure that your run-time options are preserved. Avoid placing your own configurations on the postmaster.opts file, as it will be overwritten when pg_ctl is executed with the start argument.

Example 9-3 restarts the Book Town database server with the postgres user.

Example 9-3. Restarting PostgreSQL with pg_ctl

[postgres@booktown ~]$ pg_ctl -D /usr/local/pgsql/data restart
Smart Shutdown request at Mon Sep 17 08:33:51 2001
DEBUG:  shutting down
waiting for postmaster to shut down.....DEBUG:  database system is shut down
done
postmaster successfully shut down
postmaster successfully started
[postgres@booktown ~]$
DEBUG:  database system was shut down at 2001-09-17 08:33:53 PDT
DEBUG:  CheckPoint record at (0, 1000524116)
DEBUG:  Redo record at (0, 1000524116); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 815832; NextOid: 3628113
DEBUG:  database system is in production state

[postgres@booktown ~]$

Checking status of PostgreSQL with pg_ctl

You may use the status argument to check the status of a running postmaster process. While not having any effect on the data itself, the data path must be known to pg_ctl. If the PGDATA environmental variable is not set, the -D flag must be passed to pg_ctl.

Example 9-4 checks the status of the Book Town PostgreSQL server.

Example 9-4. Checking status with pg_ctl

[postgres@booktown ~]$ pg_ctl -D /usr/local/pgsql/data status
pg_ctl: postmaster is running (pid: 11575)
Command line was:
/usr/local/pgsql/bin/postmaster '-D' '/usr/local/pgsql/data' '-i' '-s'
[postgres@booktown ~]$

Note: A lot of typing can be saved by making sure the PGDATA variable is set. If you intend to always use the same data directory, you may set the PGDATA variable (e.g., in the /etc/profile file, as is recommended in Chapter 2) and never have to apply the -D flag.

Using the SysV Script

The SysV-style script, if installed, operates similarly to the pg_ctl script. In fact, the SysV-style script operates as a management program that wraps around the pg_ctl command. The primary difference is that the SysV script is intended to be invoked by the root user, rather than the user who owns and runs PostgreSQL (e.g., postgres). The script itself handles the switching of the userids at the appropriate times.

Using the SysV script instead of manually invoking pg_ctl is advantageous in that it simplifies system startup and shutdown procedures. The postgresql script file in /etc/rc.d/init.d/ is a plain text file, and can be modified in any standard text editor. Within this file you may easily locate the startup and shutdown routines, and add or remove options to pg_ctl as you most commonly use them. The pg_ctl commands are simplified by using either the single administrative start or stop parameter with the postgresql script.

The instructions for installation of the postgresql script are covered in Chapter 2. Depending on your machine's configuration, there may be more than one method of invoking the script once it has been properly installed. Remember that the actual name of the SysV script file in the /etc/rc.d/init.d/ directory may be an arbitrary name, depending on how it was copied. The most common names given to this script are postgresql and postgres.

If your system supports the service command, you should be able to use it as a wrapper to the installed PostgreSQL script with the following syntax:

service postgresql { start | stop | restart | status }

The service command accepts only the parameters described in the preceding syntax. No other options are accepted. You can modify the way any of these general modes runs by editing the script (e.g., /etc/rc.d/init.d/postgresql) manually. Example 9-5 uses the service command to start PostgreSQL.

Example 9-5. Starting PostgreSQL with service command

[root@booktown ~]# service postgresql start
Starting PostgreSQL: ok
[root@booktown ~]#

Alternatively, if the service command does not exist on your system, the postgresql script can be manually invoked with its complete system path:

/etc/rc.d/init.d/postgresql { start | stop | restart | status }

Example 9-6 checks the status of PostgreSQL's backend process by directly calling the postgresql script in the complete path. This assumes that your system has its SysV start-up scripts installed in the /etc/rc.d/init.d/ directory.

Example 9-6. Checking status with postgresql script

[root@booktown ~]# /etc/rc.d/init.d/postgresql status
pg_ctl: postmaster is running (pid: 13238)
Command line was:
/usr/local/pgsql/bin/postmaster '-D' '/usr/local/pgsql/data'
[root@booktown ~]#

As you can see from the output of Example 9-6, the SysV script is just a convenient wrapper to the pg_ctl command discussed in the previous section.

Calling postmaster Directly

The postmaster program is the multi-user PostgreSQL database server backend itself. This is the process that your PostgreSQL clients actually connect to, where a connection to a postgres backend is negotiated.

This binary is typically not called manually, but is indirectly executed through either the pg_ctl or SysV script discussed earlier in this section. However, these scripts at some point call the postmaster binary directly, and it can be helpful in configuring your PostgreSQL system to know what the postmaster is, and what it does.

The postmaster can only access one database cluster at a time, though you may have several concurrent postmaster applications running on different ports with a different database cluster for each.

Here is the syntax for the postmaster program:

  postmaster [ -A { 0 | 1 } ] [ -B buffers ] [ -c name=value ] [ -d debug_level ]
             [ -D datadir ] [ -F ] [ -h hostname ] [ -i ] [ -k directory ] [ -l ]
             [ -N max_connections ] [ -o options ] [ -p port ] [ -S ] [ -n | -s ]

The following are each of the parameters available to the postmaster program, as of PostgreSQL 7.1.x:

-A { 0 | 1 }

The run-time assertion check flag. This enables debugging, if this option was enabled during compilation of PostgreSQL. This flag should only be used by knowledgeable developers working on PostgreSQL itself.

-B buffers

The number of shared-memory disk buffers that postmaster will allocate for use by the backend. By default, this is 64.

Note: The value passed to -B must be at least twice the number supplied for the -N parameter.

-c name=value

An arbitrary run-time configuration, setting name to value. Any configuration settings found in the postgresql.conf file (within the database cluster's data directory) may be over-ridden with this option.

-d debug_level

The debug level, which determines the amount of debugging output that will be logged by the backend. The default is 0. The higher the number, the more output will be generated. Values as high as 4 are reasonable for normal use, though a value of 4 will quickly take up disk space if you are logging the debug output.

Note: Unless the standard output and standard error streams from postmaster are redirected to a file (e.g., from the shell, or with the -l option to pg_ctl) all debugging information will be displayed to the controlling terminal session of the postmaster process.

-D datadir

The data directory of the intended database cluster. If this is not supplied, postmaster will use either the value of the PGDATA environment variable, or the /data path off of the path defined in the POSTGRESHOME environment variable. If neither environment variable is set, the default compile-time directory is used (e.g., /usr/local/pgsql/data).

-F

The fsync-disabling option. Using this increases performance, at the risk of data corruption, in the event that the operating system or physical hardware crashes unexpectedly. Be sure you know what you are doing before you use this flag!

-h host

The host address to listen on; by default, PostgreSQL's backends will listen on all configured addresses, including localhost.

-i

The TCP/IP client-connection flag, which allows connections via TCP/IP. If this option is not specified, the backend will accept only local domain socket connections.

-k directory

The directory for the UNIX domain socket, which postmaster will listen on for local connections. This defaults to /tmp/.

-l

The SSL flag. Use this to enable SSL connections. The -i parameter must also be given.

Note: You must have compiled PostgreSQL with SSL enabled to use the -I option.

-N max_connections

The maximum number of concurrent backend processes that postmaster can start. By default, this value is set to 32. The maximum allowed number for this value is 1024. Make sure that your allocated buffers are configured for the maximum number of concurrent backends (the -B parameter, which must be at least twice the max_connections value).

-o options

Any options that postmaster should send to the postgres backends when they are first started. These options are listed in Appendix B. Surround the options string with quotes, if more than one option is passed.

-p port

The TCP/IP port number (or socket file extension) that this instance of postmaster should listen for connections on. If this is left unspecified, the default is taken from the PGPORT environment variable, or the compile-time default (usually 5432).

-S

The silent mode flag. This will cause PostgreSQL to disassociate from the user's terminal session, start its own process group, and redirect its standard output and standard error to /dev/null.

Warning

Using the -S switch makes it very difficult to troubleshoot problems, since all tracing and logging output that would normally be generated by the postmaster and its child backend processes will be discarded.

The postmaster also accepts the following two debugging options, for interested developers:

-n

The -n flag stops postmaster from re-initializing shared data structures. A debugging tool can then be used to gain information about the memory's state at the time of the crash.

-s

The -s flag causes postmaster to use the SIGSTOP signal to stop backend processes, without terminating the processes. Using this signal will keep the backend processes in memory instead of terminating them, which allows a developer to collect a core dump from each backend process manually. Each core dump can then be individually examined for debugging information.