How PostgreSQL Processes a Query

by Bruce Momjian

A query comes to the backend via data packets arriving through TCP/IP or Unix domain sockets. It is loaded into a string and passed to the parser, where the lexical scanner, scan.l, breaks the query up into tokens (words). The parser uses gram.y and the tokens to identify the query type and load the proper query-specific structure, such as CreateStmt or SelectStmt.
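To make the scanning step concrete, here is a minimal hand-written sketch of splitting a query string into tokens. It is only an illustration: the real scanner is generated from scan.l and recognizes far more (quoted strings, multi-character operators, comments). The names TokenList and scan_query, and the size limits, are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 64
#define MAX_TOKEN_LEN 32

/* Hypothetical token list; not a structure from the backend. */
typedef struct {
    char text[MAX_TOKENS][MAX_TOKEN_LEN];
    int count;
} TokenList;

/*
 * Split a query string into words, numbers and single-character symbols.
 * Returns the number of tokens stored, or -1 if a token is too long.
 * Input beyond MAX_TOKENS tokens is silently ignored in this sketch.
 */
int scan_query(const char *query, TokenList *out)
{
    const char *p = query;

    out->count = 0;
    while (*p != '\0' && out->count < MAX_TOKENS) {
        size_t len = 0;

        if (isspace((unsigned char) *p)) {
            p++;                /* whitespace only separates tokens */
            continue;
        }
        if (isalnum((unsigned char) *p) || *p == '_') {
            /* identifier, keyword or number: consume the whole run */
            while (isalnum((unsigned char) p[len]) || p[len] == '_')
                len++;
        } else {
            len = 1;            /* punctuation or operator: one character */
        }
        if (len >= MAX_TOKEN_LEN)
            return -1;
        memcpy(out->text[out->count], p, len);
        out->text[out->count][len] = '\0';
        out->count++;
        p += len;
    }
    return out->count;
}
```

Calling scan_query("SELECT a, b FROM t1 WHERE a = 10", &tokens) fills the list with ten tokens, starting with SELECT and ending with 10. A grammar such as gram.y would then consume these tokens to decide which statement structure to build.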

The query is then identified as either a Utility query or a more complex query. A Utility query is processed by a query-specific function in commands. A complex query, such as SELECT, UPDATE, or DELETE, requires much more handling.
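The split between the two kinds of query can be pictured as a switch on the statement type. The sketch below uses hypothetical tags and function names (the real backend tags its parse nodes differently), and it assumes INSERT is handled on the complex path alongside SELECT, UPDATE, and DELETE.

```c
/* Hypothetical statement tags standing in for structures such as
 * CreateStmt or SelectStmt. */
typedef enum {
    SKETCH_CREATE_STMT,
    SKETCH_VACUUM_STMT,
    SKETCH_SELECT_STMT,
    SKETCH_INSERT_STMT,
    SKETCH_UPDATE_STMT,
    SKETCH_DELETE_STMT
} SketchStmtTag;

typedef enum { SKETCH_UTILITY, SKETCH_COMPLEX } SketchQueryClass;

/* Decide which processing path a statement takes. */
SketchQueryClass sketch_classify(SketchStmtTag tag)
{
    switch (tag) {
    case SKETCH_SELECT_STMT:
    case SKETCH_INSERT_STMT:    /* assumption: treated like the others */
    case SKETCH_UPDATE_STMT:
    case SKETCH_DELETE_STMT:
        return SKETCH_COMPLEX;
    default:
        return SKETCH_UTILITY;
    }
}

/* Describe, in words, where the statement goes next. */
const char *sketch_next_stage(SketchStmtTag tag)
{
    return sketch_classify(tag) == SKETCH_COMPLEX
        ? "build a Query, then rewrite, optimize and execute"
        : "call the query-specific function in commands";
}
```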

The parser takes a complex query and creates a Query structure that contains all the elements used by complex queries. Query.qual holds the WHERE clause qualification, which is filled in by transformWhereClause(). Each table referenced in the query is represented by a RangeTableEntry, and these entries are linked together to form the range table of the query, which is generated by transformFromClause(). Query.rtable holds the query's range table.
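As a rough picture of the range table, the following sketch links one entry per referenced table into a list hanging off a query structure. The Sketch-prefixed types and functions are hypothetical simplifications: the real Query and RangeTableEntry carry many more fields, and the qualification is an expression tree rather than the plain string used here.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a range table entry: one per table in FROM. */
typedef struct SketchRangeTableEntry {
    char relname[32];
    struct SketchRangeTableEntry *next;
} SketchRangeTableEntry;

/* Simplified stand-in for the Query structure. */
typedef struct {
    SketchRangeTableEntry *rtable;  /* linked list of referenced tables */
    const char *qual;               /* WHERE clause, kept as text here */
} SketchQuery;

/* Append a table to the range table.
 * Returns its 1-based position, or 0 if allocation fails. */
int sketch_add_rte(SketchQuery *query, const char *relname)
{
    SketchRangeTableEntry *rte = calloc(1, sizeof(*rte));
    SketchRangeTableEntry **tail = &query->rtable;
    int index = 1;

    if (rte == NULL)
        return 0;
    /* calloc zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(rte->relname, relname, sizeof(rte->relname) - 1);
    while (*tail != NULL) {
        tail = &(*tail)->next;
        index++;
    }
    *tail = rte;
    return index;
}

/* Find a table by name.
 * Returns its 1-based range table position, or 0 if it is absent. */
int sketch_rt_index(const SketchQuery *query, const char *relname)
{
    const SketchRangeTableEntry *rte;
    int index = 1;

    for (rte = query->rtable; rte != NULL; rte = rte->next, index++)
        if (strcmp(rte->relname, relname) == 0)
            return index;
    return 0;
}

/* Release every entry and leave the range table empty. */
void sketch_free_rtable(SketchQuery *query)
{
    while (query->rtable != NULL) {
        SketchRangeTableEntry *next = query->rtable->next;

        free(query->rtable);
        query->rtable = next;
    }
}
```

Referring to tables by their position in the list lets other parts of the query name a table with a small integer instead of repeating its name.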

Certain queries, like SELECT, return columns of data. Other queries, like INSERT and UPDATE, specify the columns modified by the query. These column references are converted to Resdom entries, which are placed in target list entries, and linked together to make up the target list of the query. The target list is stored in Query.targetList, which is generated by transformTargetList().
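A target list can be sketched the same way: each entry pairs a small result descriptor with the expression that produces the column. The types below are hypothetical and much reduced; in particular, the field names resno and resname and the use of a text expression are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Reduced stand-in for a Resdom: describes one result column. */
typedef struct {
    int resno;          /* 1-based position of the column */
    char resname[32];   /* column name or alias */
} SketchResdom;

/* One target list entry: a descriptor plus its source expression. */
typedef struct SketchTargetEntry {
    SketchResdom resdom;
    const char *expr;                   /* kept as text in this sketch */
    struct SketchTargetEntry *next;
} SketchTargetEntry;

/* Append a column to the list at *list.
 * Returns the new column's resno, or 0 if allocation fails. */
int sketch_append_target(SketchTargetEntry **list,
                         const char *resname, const char *expr)
{
    SketchTargetEntry *entry = calloc(1, sizeof(*entry));
    int resno = 1;

    if (entry == NULL)
        return 0;
    while (*list != NULL) {             /* walk to the end, counting */
        list = &(*list)->next;
        resno++;
    }
    entry->resdom.resno = resno;
    strncpy(entry->resdom.resname, resname,
            sizeof(entry->resdom.resname) - 1);
    entry->expr = expr;
    *list = entry;
    return resno;
}

/* Release every entry and leave *list empty. */
void sketch_free_targets(SketchTargetEntry **list)
{
    while (*list != NULL) {
        SketchTargetEntry *next = (*list)->next;

        free(*list);
        *list = next;
    }
}
```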

Other query elements, such as aggregates (for example SUM()), GROUP BY, and ORDER BY, are also stored in their own Query fields.

The next step is for the Query to be modified by any VIEWS or RULES that may apply to the query. This is performed by the rewrite system.

The optimizer takes the Query structure and generates an optimal Plan, which contains the operations to be performed to execute the query. The path module determines the best table join order and join type of each table in the RangeTable, using Query.qual (the WHERE clause) to consider optimal index usage.
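The core idea of choosing among alternatives can be reduced to comparing cost estimates. The sketch below assumes each candidate path already has an estimated cost and simply picks the lowest; the SketchPath type and the cost values a caller supplies are hypothetical, and the real optimizer's cost model and search over join orders are considerably more involved.

```c
#include <stddef.h>

/* Hypothetical candidate: one way of scanning or joining tables. */
typedef struct {
    const char *description;
    double cost;            /* estimated cost in arbitrary units */
} SketchPath;

/* Return the lowest-cost candidate, or NULL when there are none.
 * When costs tie, the earliest candidate wins. */
const SketchPath *sketch_cheapest_path(const SketchPath *paths,
                                       size_t npaths)
{
    const SketchPath *best = NULL;
    size_t i;

    for (i = 0; i < npaths; i++)
        if (best == NULL || paths[i].cost < best->cost)
            best = &paths[i];
    return best;
}
```

For example, given a sequential scan estimated at 120.0 and an index scan estimated at 8.5 for the same table, the function returns the index scan.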

The Plan is then passed to the executor for execution, and the result is returned to the client. The Plan is actually a set of nodes, arranged in a tree structure with a top-level node and various sub-nodes as children.
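The tree shape of a plan lends itself to recursive processing. The following sketch uses a hypothetical SketchPlan node with at most two children and walks it from the top-level node down, first counting nodes and then writing an indented outline. The node type names in the usage note are examples only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical plan node with up to two children. */
typedef struct SketchPlan {
    const char *node_type;          /* e.g. "Sort" or "SeqScan" */
    struct SketchPlan *lefttree;    /* first (or only) child */
    struct SketchPlan *righttree;   /* second child, used by joins */
} SketchPlan;

/* Count the nodes in a plan tree by visiting every child. */
int sketch_count_nodes(const SketchPlan *plan)
{
    if (plan == NULL)
        return 0;
    return 1 + sketch_count_nodes(plan->lefttree)
             + sketch_count_nodes(plan->righttree);
}

/* Append an indented outline of the tree to buf, parent before children.
 * buf must already hold a NUL-terminated string (possibly empty). */
void sketch_describe(const SketchPlan *plan, int depth,
                     char *buf, size_t bufsize)
{
    size_t used;

    if (plan == NULL)
        return;
    used = strlen(buf);
    if (used < bufsize)
        snprintf(buf + used, bufsize - used, "%*s%s\n",
                 depth * 2, "", plan->node_type);
    sketch_describe(plan->lefttree, depth + 1, buf, bufsize);
    sketch_describe(plan->righttree, depth + 1, buf, bufsize);
}
```

A tree with a Sort node on top of a NestLoop that joins a SeqScan and an IndexScan has four nodes, and sketch_describe() lists them with each level indented two spaces further than its parent.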

There are many other modules that support this basic functionality.

Another area of interest is the shared memory area, which contains data accessible to all backends. It holds recently used data/index blocks, locks, backend process information, and lookup tables for these structures:

ShmemIndex - lookup shared memory addresses using structure names
Buffer Descriptor - control header for buffer cache block
Buffer Block - data/index buffer cache block
Shared Buffer Lookup Table - lookup of buffer cache block addresses using table name and block number (BufferTag)
MultiLevelLockTable (ctl) - control structure for each locking method. Currently, only multi-level locking is used (LOCKMETHODCTL).
MultiLevelLockTable (lock hash) - the LOCK structure, looked up using the relation and database object ids (LOCKTAG). The lock table structure contains the lock modes (read/write or shared/exclusive) and a circular linked list of backends (PROC structure pointers) waiting on the lock.
MultiLevelLockTable (xid hash) - lookup of the LOCK structure address using the transaction id and LOCK address. It is used to quickly check whether the current transaction already has any locks on a table, rather than having to search through all the held locks. It also stores the modes (read/write) of the locks held by the current transaction. The returned XIDLookupEnt structure also contains a pointer to the backend's PROC.lockQueue.
Proc Header - information about each backend, including locks held/waiting, indexed by process id
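To illustrate the kind of lookup the Shared Buffer Lookup Table performs, here is a small sketch of a hash table keyed by table name and block number. Everything in it is an assumption made for illustration: the Sketch-prefixed names, the fixed table size, the hash function, and the use of linear probing. It also lives in ordinary process memory rather than shared memory, and it does not support removing entries.

```c
#include <string.h>

#define SKETCH_NBUCKETS 64

/* Hypothetical key: which block of which table. */
typedef struct {
    char relname[32];
    unsigned block_number;
} SketchBufferTag;

typedef struct {
    int in_use;
    SketchBufferTag tag;
    int buffer_id;          /* index of the cached block */
} SketchBufferSlot;

typedef struct {
    SketchBufferSlot slots[SKETCH_NBUCKETS];    /* zero-fill before use */
} SketchBufferTable;

/* Build a tag; returns 0 on success, -1 if the name is too long. */
int sketch_make_tag(SketchBufferTag *tag, const char *relname,
                    unsigned block)
{
    if (strlen(relname) >= sizeof(tag->relname))
        return -1;
    memset(tag, 0, sizeof(*tag));
    strcpy(tag->relname, relname);
    tag->block_number = block;
    return 0;
}

/* Simple multiplicative hash over the name and block number. */
unsigned sketch_hash_tag(const SketchBufferTag *tag)
{
    unsigned h = 2166136261u;
    const char *p;

    for (p = tag->relname; *p != '\0'; p++)
        h = (h ^ (unsigned char) *p) * 16777619u;
    h = (h ^ tag->block_number) * 16777619u;
    return h % SKETCH_NBUCKETS;
}

int sketch_tag_equal(const SketchBufferTag *a, const SketchBufferTag *b)
{
    return a->block_number == b->block_number
        && strcmp(a->relname, b->relname) == 0;
}

/* Insert or update a mapping; returns 0 on success, -1 on failure. */
int sketch_buffer_insert(SketchBufferTable *table, const char *relname,
                         unsigned block, int buffer_id)
{
    SketchBufferTag tag;
    unsigned start, i;

    if (sketch_make_tag(&tag, relname, block) != 0)
        return -1;
    start = sketch_hash_tag(&tag);
    for (i = 0; i < SKETCH_NBUCKETS; i++) {
        SketchBufferSlot *slot =
            &table->slots[(start + i) % SKETCH_NBUCKETS];

        if (!slot->in_use || sketch_tag_equal(&slot->tag, &tag)) {
            slot->in_use = 1;
            slot->tag = tag;
            slot->buffer_id = buffer_id;
            return 0;
        }
    }
    return -1;              /* table is full */
}

/* Return the buffer id for a block, or -1 if it is not in the table. */
int sketch_buffer_lookup(const SketchBufferTable *table,
                         const char *relname, unsigned block)
{
    SketchBufferTag tag;
    unsigned start, i;

    if (sketch_make_tag(&tag, relname, block) != 0)
        return -1;
    start = sketch_hash_tag(&tag);
    for (i = 0; i < SKETCH_NBUCKETS; i++) {
        const SketchBufferSlot *slot =
            &table->slots[(start + i) % SKETCH_NBUCKETS];

        if (!slot->in_use)
            return -1;      /* an empty slot ends the probe sequence */
        if (sketch_tag_equal(&slot->tag, &tag))
            return slot->buffer_id;
    }
    return -1;
}
```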

Each data structure is created by calling ShmemInitStruct(), and the lookups are created by ShmemInitHash().
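The pattern of creating a structure by name, and attaching to it if it already exists, can be sketched with a small name-indexed registry. This is not the real ShmemInitStruct(): the function name, signature, sizes, and the process-local arena below are all hypothetical, chosen only to show how a name lookup lets a second caller find the structure the first caller created.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_ARENA_SIZE 4096
#define SKETCH_MAX_ENTRIES 16

/* One index entry: where a named structure lives in the arena. */
typedef struct {
    char name[32];
    size_t offset;
    size_t size;
} SketchIndexEntry;

/* The union members other than bytes only force strict alignment. */
typedef union {
    unsigned char bytes[SKETCH_ARENA_SIZE];
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} SketchArena;

static SketchArena sketch_arena;    /* stands in for shared memory */
static SketchIndexEntry sketch_index[SKETCH_MAX_ENTRIES];
static int sketch_nentries = 0;
static size_t sketch_used = 0;

/*
 * Return the structure registered under name, creating it if needed.
 * *found is set to 1 when it already existed and 0 otherwise.
 * Returns NULL when the request cannot be satisfied.
 */
void *sketch_init_struct(const char *name, size_t size, int *found)
{
    SketchIndexEntry *entry;
    size_t aligned;
    int i;

    for (i = 0; i < sketch_nentries; i++) {
        if (strcmp(sketch_index[i].name, name) == 0) {
            *found = 1;
            return sketch_arena.bytes + sketch_index[i].offset;
        }
    }
    *found = 0;
    if (size == 0 || size > SKETCH_ARENA_SIZE)
        return NULL;
    aligned = (size + 15) & ~(size_t) 15;   /* round up to 16 bytes */
    if (sketch_nentries == SKETCH_MAX_ENTRIES
        || aligned > SKETCH_ARENA_SIZE - sketch_used
        || strlen(name) >= sizeof(sketch_index[0].name))
        return NULL;

    entry = &sketch_index[sketch_nentries++];
    strcpy(entry->name, name);
    entry->offset = sketch_used;
    entry->size = size;
    sketch_used += aligned;
    return sketch_arena.bytes + entry->offset;
}
```

The first call for a given name reserves space and reports found as 0; any later call with the same name returns the same address with found set to 1, which tells the caller the structure has already been initialized.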

Maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
Last updated: Mon Aug 10 10:48:06 EDT 1998