As a short supplement to the article Attempt to create an analogue of ASH for PostgreSQL .A task
You need to link the view history pg_stat_statemenets, pg_stat_activity. As a result, using the history of execution plans from the log_query service table, you can get a lot of useful information for use in resolving performance incidents and optimizing queries.
Warning.
In connection with the continuation of testing and development, the article cannot pretend to describe a ready-made industrial solution.
Criticism and implementation comments are welcome and expected.
Input data
History_pg_stat_activity table Pg_stat_db_queries table CREATE TABLE pg_stat_db_queries ( database_id integer , queryid bigint , query text , max_time double precision );(
Materialized view of mvw_pg_stat_queries CREATE MATERIALIZED VIEW public.mvw_pg_stat_queries AS SELECT t.queryid, t.max_time, t.query FROM public.dblink('LINK1'::text, 'SELECT queryid , max_time , query FROM pg_stat_statements WHERE dbid=(SELECT oid FROM pg_database WHERE datname=current_database() ) AND max_time >= 0 '::text) t(queryid bigint, max_time double precision, query text) WITH NO DATA;
Log_query table CREATE TABLE log_query ( id integer , queryid bigint , query_md5hash text , database_id integer , timepoint timestamp without time zone , query text , explained_plan text[] , plan_md5hash text , explained_plan_wo_costs text[] , plan_hash_value text , ip text, port text , pid integer );
General algorithm
Refresh pg_stat_db_queries table
Refresh Material View mvw_pg_stat_queries CREATE OR REPLACE FUNCTION refresh_pg_stat_queries_list( database_id int) RETURNS BOOLEAN AS $$ DECLARE result BOOLEAN ; database_rec record ; BEGIN SELECT * INTO database_rec FROM endpoint e JOIN database d ON e.id = d.endpoint_id WHERE d.id = database_id ; IF NOT database_rec.is_need_monitoring THEN RAISE NOTICE 'NO NEED MONITORING FOR database_id=%',database_id; return TRUE ; END IF ; EXECUTE 'SELECT dblink_connect(''LINK1'',''host='||database_rec.host||' port=5432 dbname='||database_rec.name|| ' user='||database_rec.s_name||' password='||database_rec.s_pass|| ' '')'; REFRESH MATERIALIZED VIEW mvw_pg_stat_queries ; PERFORM dblink_disconnect('LINK1'); RETURN result; END $$ LANGUAGE plpgsql;
Populate pg_stat_db_queries table CREATE OR REPLACE FUNCTION refresh_pg_stat_db_queries( ) RETURNS BOOLEAN AS $$ DECLARE result BOOLEAN ; database_rec record ; pg_stat_rec record ; BEGIN TRUNCATE pg_stat_db_queries; FOR database_rec IN SELECT * FROM database d LOOP IF NOT database_rec.is_need_monitoring THEN RAISE NOTICE 'NO NEED MONITORING FOR database_id=%',database_rec.id; CONTINUE ; END IF ; PERFORM refresh_pg_stat_queries_list( database_rec.id ) ; FOR pg_stat_rec IN SELECT * FROM mvw_pg_stat_queries LOOP INSERT INTO pg_stat_db_queries ( database_id , queryid , query , max_time ) VALUES ( database_rec.id , pg_stat_rec.queryid , pg_stat_rec.query , pg_stat_rec.max_time); END LOOP; END LOOP; RETURN TRUE; END $$ LANGUAGE plpgsql;
As a result, the table contains normalized query texts, queryid, the maximum query execution time at the current moment (used for monitoring).
Filling log_query and forming a history of execution plans.
The actual request text is taken from the log file. Log file from the target host to the monitoring host in parts, bash script, according to cron. To save space and due to the triviality of the task of copying a piece of a text file from host to host, the script is not provided.
Parsing a log file and extracting query text As a result, the table contains the actual request text, execution plans, hash value of the execution plan, hash value of the request text.
Fill in the queryid value in the history_pg_stat_activity table
update_history_pg_stat_activity_by_queryid.sql As a result, the table contains the queryid value corresponding to the queryid value of the query.
Total
By connecting pg_stat_activity, pg_stat_statements, log_query, you can get a lot of useful information about the request, in particular:
- History of execution plans.
- History of CPU-time request.
- Request waiting history.
Data and many additional reports will be described in the next article.
Development
By linking the available information with the history of the pg_locks view, you can get information about which specific locks the request was waiting for and most importantly, which process (request) was holding this lock.
The solution to this problem will be described in the next article. Now testing and refinement is underway.