One way to get lock history in PostgreSQL

Continuation of the article " Attempting to create an analogue of ASH for PostgreSQL ".

The article will be considered and shown on specific queries and examples - what useful information can be obtained using the history of the pg_locks view.
Warning.
Due to the novelty of the topic and the incomplete testing period, the article may contain errors. Criticism and comments are strongly encouraged and expected.

Input data


Pg_locks submission history


archive_locking
CREATE TABLE archive_locking ( timepoint timestamp without time zone , locktype text , relation oid , mode text , tid xid , vtid text , pid integer , blocking_pids integer[] , granted boolean , queryid bigint ); 

In fact, the table is similar to the archive_pg_stat_activity table described in more detail here - pg_stat_statements + pg_stat_activity + loq_query = pg_ash? and here - Attempt to create an analogue of ASH for PostgreSQL.

To fill the queryid column , use the function

update_history_locking_by_queryid
 --update_history_locking_by_queryid.sql CREATE OR REPLACE FUNCTION update_history_locking_by_queryid() RETURNS boolean AS $$ DECLARE result boolean ; current_minute double precision ; start_minute integer ; finish_minute integer ; start_period timestamp without time zone ; finish_period timestamp without time zone ; lock_rec record ; endpoint_rec record ; current_hour_diff double precision ; BEGIN RAISE NOTICE '***update_history_locking_by_queryid'; result = TRUE ; current_minute = extract ( minute from now() ); SELECT * FROM endpoint WHERE is_need_monitoring INTO endpoint_rec ; current_hour_diff = endpoint_rec.hour_diff ; IF current_minute < 5 THEN RAISE NOTICE 'Current time is less than 5 minute.'; start_period = date_trunc('hour',now()) + (current_hour_diff * interval '1 hour'); finish_period = start_period - interval '5 minute' ; ELSE finish_minute = extract ( minute from now() ) / 5 ; start_minute = finish_minute - 1 ; start_period = date_trunc('hour',now()) + interval '1 minute'*start_minute*5+(current_hour_diff * interval '1 hour'); finish_period = date_trunc('hour',now()) + interval '1 minute'*finish_minute*5+(current_hour_diff * interval '1 hour') ; END IF ; RAISE NOTICE 'start_period = %', start_period; RAISE NOTICE 'finish_period = %', finish_period; FOR lock_rec IN WITH act_queryid AS ( SELECT pid , timepoint , query_start AS started , MAX(timepoint) OVER (PARTITION BY pid , query_start ) AS finished , queryid FROM activity_hist.history_pg_stat_activity WHERE timepoint BETWEEN start_period and finish_period GROUP BY pid , timepoint , query_start , queryid ), lock_pids AS ( SELECT hl.pid , hl.locktype , hl.mode , hl.timepoint , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.history_locking hl WHERE hl.timepoint between start_period and finish_period GROUP BY hl.pid , hl.locktype , hl.mode , hl.timepoint ) SELECT lp.pid , lp.locktype , lp.mode , lp.timepoint , aq.queryid FROM lock_pids lp LEFT OUTER JOIN act_queryid aq ON ( lp.pid = aq.pid AND lp.started BETWEEN aq.started AND aq.finished ) WHERE aq.queryid IS NOT NULL GROUP BY lp.pid , lp.locktype , lp.mode , lp.timepoint , aq.queryid LOOP UPDATE activity_hist.history_locking SET queryid = lock_rec.queryid WHERE pid = lock_rec.pid AND locktype = lock_rec.locktype AND mode = lock_rec.mode AND timepoint = lock_rec.timepoint ; END LOOP; RETURN result ; END $$ LANGUAGE plpgsql; 

Explanation: the value of the queryid column is updated in the history_locking table, and then when creating a new section for the archive_locking table, the value will be saved in historical values.

Output


General information on the processes as a whole.

WAITING FOR LOCKS BY LOCKTYPES


Inquiry
 WITH t AS ( SELECT locktype , mode , count(*) as total FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted GROUP BY locktype , mode ) SELECT locktype , mode , total * interval '1 second' as duration FROM t ORDER BY 3 DESC 

Example
  |  WAITING FOR LOCKS BY LOCKTYPES
 + -------------------- + ---------------------------- - + --------------------
 |  locktype |  mode |  duration
 + -------------------- + ---------------------------- - + --------------------
 |  transactionid |  ShareLock |  19:39:26
 |  tuple |  AccessExclusiveLock |  00:03:35
 + -------------------- + ---------------------------- - + --------------------

TAKINGS OF LOCKS BY LOCKTYPES


Inquiry
 WITH t AS ( SELECT locktype , mode , count(*) as total FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND granted GROUP BY locktype , mode ) SELECT locktype , mode , total * interval '1 second' as duration FROM t ORDER BY 3 DESC 

Example
  |  TAKINGS OF LOCKS BY LOCKTYPES
 + -------------------- + ---------------------------- - + --------------------
 |  locktype |  mode |  duration
 + -------------------- + ---------------------------- - + --------------------
 |  relation |  RowExclusiveLock |  51:11:10
 |  virtualxid |  ExclusiveLock |  48:10:43
 |  transactionid |  ExclusiveLock |  44:24:53
 |  relation |  AccessShareLock |  20:06:13
 |  tuple |  AccessExclusiveLock |  17:58:47
 |  tuple |  ExclusiveLock |  01:40:41
 |  relation |  ShareUpdateExclusiveLock |  00:26:41
 |  object |  RowExclusiveLock |  00:00:01
 |  transactionid |  ShareLock |  00:00:01
 |  extend |  ExclusiveLock |  00:00:01
 + -------------------- + ---------------------------- - + --------------------

Detailed information on specific queryid queries

WAITING FOR LOCKS BY LOCKTYPES BY QUERYID


Inquiry
 WITH lt AS ( SELECT pid , locktype , mode , timepoint , queryid , blocking_pids , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND NOT granted AND queryid IS NOT NULL GROUP BY pid , locktype , mode , timepoint , queryid , blocking_pids ) SELECT lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids , COUNT(*) * interval '1 second' as duration FROM lt GROUP BY lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids ORDER BY 4 

Example
  |  WAITING FOR LOCKS BY LOCKTYPES BY QUERYID
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  pid |  locktype |  mode |  started |  queryid |  blocking_pids |  duration
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  11288 |  transactionid |  ShareLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11626 |  transactionid |  ShareLock |  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12380} |  00:00:29
 |  11626 |  transactionid |  ShareLock |  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {11092} |  00:03:25
 |  11626 |  transactionid |  ShareLock |  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12213} |  00:01:55
 |  11626 |  transactionid |  ShareLock |  2019-09-17 10: 00: 21.380921 |  389015618226997618 |  {12751} |  00:00:01
 |  11629 |  transactionid |  ShareLock |  2019-09-17 10: 00: 24.331935 |  389015618226997618 |  {11092} |  00:03:22
 |  11629 |  transactionid |  ShareLock |  2019-09-17 10: 00: 24.331935 |  389015618226997618 |  {12007} |  00:00:01
 |  12007 |  transactionid |  ShareLock |  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11629} |  00:00:13
 |  12007 |  transactionid |  ShareLock |  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11092} |  00:01:10
 |  12007 |  transactionid |  ShareLock |  2019-09-17 10: 05: 03.327933 |  389015618226997618 |  {11288} |  00:00:05
 |  12213 |  transactionid |  ShareLock |  2019-09-17 10: 06: 07.328019 |  389015618226997618 |  {12007} |  00:00:10 

TAKING LOCKS BY LOCKTYPES BY QUERYID


Inquiry
 WITH lt AS ( SELECT pid , locktype , mode , timepoint , queryid , blocking_pids , MIN ( timepoint ) OVER (PARTITION BY pid , locktype ,mode ) as started FROM activity_hist.archive_locking WHERE timepoint between pg_stat_history_begin+(current_hour_diff * interval '1 hour') AND pg_stat_history_end+(current_hour_diff * interval '1 hour') AND granted AND queryid IS NOT NULL GROUP BY pid , locktype , mode , timepoint , queryid , blocking_pids ) SELECT lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids , COUNT(*) * interval '1 second' as duration FROM lt GROUP BY lt.pid , lt.locktype , lt.mode , lt.started , lt.queryid , lt.blocking_pids ORDER BY 4 

Example
  |  TAKING LOCKS BY LOCKTYPES BY QUERYID
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  pid |  locktype |  mode |  started |  queryid |  blocking_pids |  duration
 + ---------- + ------------------------- + ------------ -------- + ------------------------------ + ---------- ---------- + -------------------- + ------------------ -
 |  11288 |  relation |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11092 |  transactionid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {} |  00:03:34
 |  11288 |  relation |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {} |  00:00:10
 |  11092 |  relation |  RowExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {} |  00:03:34
 |  11092 |  virtualxid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {} |  00:03:34
 |  11288 |  virtualxid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11288 |  transactionid |  ExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34
 |  11288 |  tuple |  AccessExclusiveLock |  2019-09-17 10: 00: 00.302936 |  389015618226997618 |  {11092} |  00:03:34 

Using lock history when analyzing performance incidents.

  1. A request with queryid = 389015618226997618 executed by a process with pid = 11288 expected blocking from 2019-09-17 10:00:00 for 3 minutes.
  2. The process with pid = 11092 held the lock
  3. A process with pid = 11092 executing a query with queryid = 389015618226997618 starting from 2019-09-17 10:00:00 held the lock for 3 minutes.

Total


Now, I hope, the most interesting and useful thing begins - collecting statistics and analyzing cases on the history of expectations and locks.

In the future, I want to believe, it will turn out to be a set of some kind of note (similar to the Oracle metalink).

In general, it is for this reason that the technique used is displayed as quickly as possible for general familiarization.

In the near future I will try to put the project on github.

Source: https://habr.com/ru/post/467719/


All Articles