We have already discussed the PostgreSQL indexing engine, the interface of access methods, and all the main access methods: hash indexes, B-trees, GiST, SP-GiST, and GIN. In this part we will watch gin turn into rum.
RUM
Although the authors claim that gin is a powerful genie, the theme of drinks has won out after all: the next generation of GIN is called RUM.
This access method develops the ideas behind GIN and makes full-text search even faster. It is the only method in this series that is not part of the standard PostgreSQL distribution but a third-party extension. There are several ways to install it:
- Take a yum or apt package from the PGDG repository. For example, if you installed PostgreSQL from the postgresql-10 package, also install postgresql-10-rum.
- Build and install it yourself from the source code on GitHub (the instructions are there as well).
- Use it as part of Postgres Pro Enterprise (or at least read the documentation there).
Limitations of GIN
Which limitations of the GIN index does RUM overcome?
First, the tsvector data type contains not only lexemes but also their positions within the document. As we saw last time, a GIN index does not store this information. Because of that, phrase search, which appeared in version 9.6, is served inefficiently by a GIN index: it has to access the source data to recheck every candidate.
Second, search engines usually return results sorted by relevance (whatever that means). The ranking functions ts_rank and ts_rank_cd can be used for this, but they have to be computed for every row of the result, which is of course slow.
To a first approximation, the RUM access method can be thought of as GIN with positional information added and with support for returning results in the required order (similar to how GiST can return nearest neighbors). Let's take these in order.
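Phrase search
A full-text search query can contain special operators that take the distance between lexemes into account. For example, we can find documents in which the words for "grandmother" and "grandfather" are separated by exactly one other word: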
postgres=# select to_tsvector(' , ...') @@
to_tsquery(' <2> ');
?column?
----------
t
(1 row)
Or we can require that the words immediately follow one another:
postgres=# select to_tsvector(' , ...') @@
to_tsquery(' <-> ');
?column?
----------
t
(1 row)
A regular GIN index can return the documents that contain both lexemes, but the distance between them can only be checked by looking into the tsvector:
postgres=# select to_tsvector(' , ...');
to_tsvector
------------------------------
'':1 '':3,4 '':6
(1 row)
In a RUM index, a lexeme does not just reference table rows: each TID is accompanied by the list of positions at which the lexeme occurs in that document. This is how we can picture an index created on the already familiar table with the white birch (by default, the rum_tsvector_ops operator class is used for tsvector):
postgres=# create extension rum;
CREATE EXTENSION
postgres=# create index on ts using rum(doc_tsv);
CREATE INDEX

The gray squares in the figure show the positional information that was added:
postgres=# select ctid, doc, doc_tsv from ts;
ctid | doc | doc_tsv
--------+-------------------------+--------------------------------
(0,1) | | '':3 '':2 '':4
(0,2) | | '':3 '':2 '':4
(0,3) | , , | '':1,2 '':3
(0,4) | , , | '':1,2 '':3
(1,1) | | '':2 '':3 '':1
(1,2) | | '':3 '':2 '':1
(1,3) | , , | '':3 '':1,2
(1,4) | , , | '':3 '':1,2
(2,1) | | '':3 '':2
(2,2) | | '':1 '':2 '':3
(2,3) | , , | '':3 '':1,2
(2,4) | , , | '':3 '':1,2
(12 rows)
GIN also offers deferred insertion when the fastupdate parameter is set; this functionality has been removed from RUM.
To see how the index works on real data, let's use the familiar archive of the pgsql-hackers mailing list.
fts=# alter table mail_messages add column tsv tsvector;
ALTER TABLE
fts=# set default_text_search_config = default;
SET
fts=# update mail_messages
set tsv = to_tsvector(body_plain);
...
UPDATE 356125
This is how a query that uses phrase search is executed with a GIN index:
fts=# create index tsv_gin on mail_messages using gin(tsv);
CREATE INDEX
fts=# explain (costs off, analyze)
select * from mail_messages where tsv @@ to_tsquery('hello <-> hackers');
QUERY PLAN
---------------------------------------------------------------------------------
Bitmap Heap Scan on mail_messages (actual time=2.490..18.088 rows=259 loops=1)
Recheck Cond: (tsv @@ to_tsquery('hello <-> hackers'::text))
Rows Removed by Index Recheck: 1517
Heap Blocks: exact=1503
-> Bitmap Index Scan on tsv_gin (actual time=2.204..2.204 rows=1776 loops=1)
Index Cond: (tsv @@ to_tsquery('hello <-> hackers'::text))
Planning time: 0.266 ms
Execution time: 18.151 ms
(8 rows)
As the plan shows, the GIN index is used, but it returns 1776 potential matches, of which only 259 remain; the other 1517 are discarded at the recheck stage.
Let's drop the GIN index and build a RUM index instead.
fts=# drop index tsv_gin;
DROP INDEX
fts=# create index tsv_rum on mail_messages using rum(tsv);
CREATE INDEX
Now the index contains all the necessary information, and the search is exact:
fts=# explain (costs off, analyze)
select * from mail_messages
where tsv @@ to_tsquery('hello <-> hackers');
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on mail_messages (actual time=2.798..3.015 rows=259 loops=1)
Recheck Cond: (tsv @@ to_tsquery('hello <-> hackers'::text))
Heap Blocks: exact=250
-> Bitmap Index Scan on tsv_rum (actual time=2.768..2.768 rows=259 loops=1)
Index Cond: (tsv @@ to_tsquery('hello <-> hackers'::text))
Planning time: 0.245 ms
Execution time: 3.053 ms
(7 rows)
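The difference between the two plans can be illustrated with a toy model. The Python sketch below is not how GIN or RUM are implemented; it is a simplified inverted index under stated assumptions: `tokenize` is a crude stand-in for to_tsvector (no stemming, no stop words), and every function name and sample document is made up for illustration. The GIN-like index keeps only TIDs per lexeme, so a phrase query produces candidates that still need a recheck, while the RUM-like index keeps positions next to each TID and can decide the phrase condition by itself.

```python
from collections import defaultdict


def tokenize(text):
    """Crude stand-in for to_tsvector: lowercase words with 1-based positions."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return [(w, pos) for pos, w in enumerate(words, start=1) if w]


def build_indexes(docs):
    gin = defaultdict(set)                        # lexeme -> {tid}
    rum = defaultdict(lambda: defaultdict(list))  # lexeme -> {tid: [positions]}
    for tid, text in docs.items():
        for lexeme, pos in tokenize(text):
            gin[lexeme].add(tid)
            rum[lexeme][tid].append(pos)
    return gin, rum


def gin_candidates(gin, first, second):
    """TIDs only: every document with both lexemes is a candidate to recheck."""
    return gin[first] & gin[second]


def rum_phrase(rum, first, second, distance=1):
    """Positions are in the index, so 'first <distance> second' needs no recheck."""
    hits = set()
    for tid in rum[first].keys() & rum[second].keys():
        second_positions = set(rum[second][tid])
        if any(p + distance in second_positions for p in rum[first][tid]):
            hits.add(tid)
    return hits


# Hypothetical sample documents, keyed by a made-up TID.
docs = {
    1: "Hello hackers, here is a patch",
    2: "Hackers say hello to everyone",
    3: "Hello dear hackers",
}
gin, rum = build_indexes(docs)

candidates = gin_candidates(gin, "hello", "hackers")   # all three documents
adjacent = rum_phrase(rum, "hello", "hackers")         # like hello <-> hackers
one_apart = rum_phrase(rum, "hello", "hackers", 2)     # like hello <2> hackers
```

In this model the GIN-like lookup returns all three documents and leaves the filtering to a later step, which mirrors the "Rows Removed by Index Recheck" line in the first plan; the RUM-like lookup returns only the matching document.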
Sorting by relevance
To return documents already in the required order, the RUM index supports ordering operators, which we discussed in the part about GiST. The rum extension defines such an operator, <=>, which returns a distance between a document (tsvector) and a query (tsquery). For example:
fts=# select to_tsvector(' , ...') <=> to_tsquery('');
?column?
----------
16.4493
(1 row)
fts=# select to_tsvector(' , ...') <=> to_tsquery('');
?column?
----------
13.1595
(1 row)
The document turned out to be more relevant to the first query than to the second one: the more often a word occurs in the document, the less "valuable" it is.
Let's again compare GIN and RUM on a relatively large amount of data: we select the ten most relevant documents that contain both "hello" and "hackers".
fts=# explain (costs off, analyze)
select * from mail_messages
where tsv @@ to_tsquery('hello & hackers')
order by ts_rank(tsv,to_tsquery('hello & hackers'))
limit 10;
QUERY PLAN
---------------------------------------------------------------------------------------------
Limit (actual time=27.076..27.078 rows=10 loops=1)
-> Sort (actual time=27.075..27.076 rows=10 loops=1)
Sort Key: (ts_rank(tsv, to_tsquery('hello & hackers'::text)))
Sort Method: top-N heapsort Memory: 29kB
-> Bitmap Heap Scan on mail_messages (actual ... rows=1776 loops=1)
Recheck Cond: (tsv @@ to_tsquery('hello & hackers'::text))
Heap Blocks: exact=1503
-> Bitmap Index Scan on tsv_gin (actual ... rows=1776 loops=1)
Index Cond: (tsv @@ to_tsquery('hello & hackers'::text))
Planning time: 0.276 ms
Execution time: 27.121 ms
(11 rows)
The GIN index returns 1776 matches, which are then sorted in a separate step to select the ten best ones.
With the RUM index, the query is executed by a simple index scan: no extra documents are scanned and no separate sort is needed:
fts=# explain (costs off, analyze)
select * from mail_messages
where tsv @@ to_tsquery('hello & hackers')
order by tsv <=> to_tsquery('hello & hackers')
limit 10;
QUERY PLAN
--------------------------------------------------------------------------------------------
Limit (actual time=5.083..5.171 rows=10 loops=1)
-> Index Scan using tsv_rum on mail_messages (actual ... rows=10 loops=1)
Index Cond: (tsv @@ to_tsquery('hello & hackers'::text))
Order By: (tsv <=> to_tsquery('hello & hackers'::text))
Planning time: 0.244 ms
Execution time: 5.207 ms
(6 rows)
Additional information
Like GIN, a RUM index can be built on several fields. But while GIN stores lexemes from different columns independently of each other, RUM lets us "attach" an additional field to the main one (tsvector in this case). To do this, we use the special rum_tsvector_addon_ops operator class:
fts=# create index on mail_messages using rum(tsv rum_tsvector_addon_ops, sent)
with (attach='sent', to='tsv');
CREATE INDEX
Such an index can be used to return results sorted by the additional field:
fts=# select id, sent, sent <=> '2017-01-01 15:00:00'
from mail_messages
where tsv @@ to_tsquery('hello')
order by sent <=> '2017-01-01 15:00:00'
limit 10;
id | sent | ?column?
---------+---------------------+----------
2298548 | 2017-01-01 15:03:22 | 202
2298547 | 2017-01-01 14:53:13 | 407
2298545 | 2017-01-01 13:28:12 | 5508
2298554 | 2017-01-01 18:30:45 | 12645
2298530 | 2016-12-31 20:28:48 | 66672
2298587 | 2017-01-02 12:39:26 | 77966
2298588 | 2017-01-02 12:43:22 | 78202
2298597 | 2017-01-02 13:48:02 | 82082
2298606 | 2017-01-02 15:50:50 | 89450
2298628 | 2017-01-02 18:55:49 | 100549
(10 rows)
Here we look for matching rows that are as close as possible to the specified date, no matter whether they are earlier or later. To get only the results that strictly precede (or follow) the date, use the <=| (or |=>) operator.
As expected, the query is executed by a simple index scan:
fts=# explain (costs off)
select id, sent, sent <=> '2017-01-01 15:00:00'
from mail_messages
where tsv @@ to_tsquery('hello')
order by sent <=> '2017-01-01 15:00:00'
limit 10;
QUERY PLAN
---------------------------------------------------------------------------------
Limit
-> Index Scan using mail_messages_tsv_sent_idx on mail_messages
Index Cond: (tsv @@ to_tsquery('hello'::text))
Order By: (sent <=> '2017-01-01 15:00:00'::timestamp without time zone)
(4 rows)
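What the three distance operators compute can be sketched in a few lines of Python. This is only a model, under the assumption that the attached value is kept next to each TID in the posting list; the `posting` list reuses four rows from the output above, and the names `distance` and `nearest` and the `direction` argument are hypothetical. A real index scan returns rows already in order, whereas here a heap stands in for that.

```python
import heapq
import math
from datetime import datetime

# (id, sent) pairs for the lexeme 'hello', taken from the query output above.
posting = [
    (2298545, datetime(2017, 1, 1, 13, 28, 12)),
    (2298547, datetime(2017, 1, 1, 14, 53, 13)),
    (2298548, datetime(2017, 1, 1, 15, 3, 22)),
    (2298554, datetime(2017, 1, 1, 18, 30, 45)),
]


def distance(sent, target, direction):
    """Seconds between sent and target; infinite on the excluded side."""
    delta = (sent - target).total_seconds()
    if direction == "left" and delta > 0:    # models <=| : only earlier rows
        return math.inf
    if direction == "right" and delta < 0:   # models |=> : only later rows
        return math.inf
    return abs(delta)                        # models <=> : either side


def nearest(posting, target, k, direction="both"):
    scored = ((distance(sent, target, direction), row_id) for row_id, sent in posting)
    return heapq.nsmallest(k, (s for s in scored if s[0] != math.inf))


target = datetime(2017, 1, 1, 15, 0, 0)
both = nearest(posting, target, 3)
before = nearest(posting, target, 2, "left")
after = nearest(posting, target, 2, "right")
```

With `direction="both"` the three nearest rows have distances 202, 407 and 5508 seconds, the same values as in the `?column?` column above.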
If we had created the index without the additional information about how the fields are attached, a similar query would have to sort all the results returned by the index.
Of course, besides dates, fields of other data types can be attached to a RUM index; practically all base types are supported. For example, an online store can quickly show products ordered by novelty (date), price (numeric), and popularity or discount size (integer or floating-point).
Other operator classes
For completeness, the other available operator classes are worth mentioning.
Let's start with rum_tsvector_hash_ops and rum_tsvector_hash_addon_ops. They are similar in every respect to rum_tsvector_ops and rum_tsvector_addon_ops discussed above, but the index stores a hash code of the lexeme rather than the lexeme itself. This can reduce the index size, but of course the search becomes less exact and requires a recheck. In addition, the index no longer supports partial-match search.
The rum_tsquery_ops operator class is curious. It solves the "inverse" problem: finding the queries that match a given document. Why would this be needed? For example, to subscribe a user to new products that match the user's filter, or to classify new documents automatically. Here is a simple example:
fts=# create table categories(query tsquery, category text);
CREATE TABLE
fts=# insert into categories values
(to_tsquery('vacuum | autovacuum | freeze'), 'vacuum'),
(to_tsquery('xmin | xmax | snapshot | isolation'), 'mvcc'),
(to_tsquery('wal | (write & ahead & log) | durability'), 'wal');
INSERT 0 3
fts=# create index on categories using rum(query);
CREATE INDEX
fts=# select array_agg(category)
from categories
where to_tsvector(
'Hello hackers, the attached patch greatly improves performance of tuple
freezing and also reduces size of generated write-ahead logs.'
) @@ query;
array_agg
--------------
{vacuum,wal}
(1 row)
The remaining operator classes, rum_anyarray_ops and rum_anyarray_addon_ops, are designed to work with arrays rather than tsvector. This was already covered for GIN last time, so there is no point in repeating it.
Sizes of the index and of the write-ahead log (WAL)
Since RUM stores more information than GIN, it clearly takes up more space. Last time we compared the sizes of different indexes; let's add RUM to that table:
rum | gin | gist | btree
--------+--------+--------+--------
457 MB | 179 MB | 125 MB | 546 MB
As you can see, the size has grown considerably: this is the price of fast search.
Another point worth noting is that RUM is an extension, that is, it can be installed without any changes to the system core. This became possible in version 9.6 thanks to a patch by Alexander Korotkov. One of the problems that had to be solved was the generation of log records. The logging mechanism must be absolutely reliable, so extensions are not allowed into it. Instead of letting an extension create its own types of log records, the following is done: the extension code announces its intention to change a page, makes the changes, and signals that it is done; the system core then compares the old and new versions of the page and generates the required unified log records.
The current generation algorithm compares the pages byte by byte, finds the changed fragments, and logs each such fragment together with its offset from the start of the page. This works well both when only a few bytes change and when the page changes completely. But if a fragment is inserted inside a page, shifting the rest of the content down (or, conversely, a fragment is removed and the content shifts up), formally far more bytes change than were actually added or removed.
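The effect of a shift on a byte-by-byte comparison is easy to reproduce outside the database. The following Python sketch is a simplified model, not PostgreSQL's actual code: the 8192-byte page, its synthetic contents, the offsets, and the function name `bytewise_delta` are all assumptions chosen for illustration. It compares overwriting 8 bytes in place with inserting 8 bytes in the middle of the used area.

```python
PAGE_SIZE = 8192


def bytewise_delta(old, new):
    """Return (offset, fragment) for every maximal run of differing bytes."""
    assert len(old) == len(new)
    fragments, start = [], None
    for i in range(len(old)):
        if old[i] != new[i]:
            if start is None:
                start = i                      # a new changed run begins
        elif start is not None:
            fragments.append((start, new[start:i]))
            start = None
    if start is not None:                      # run reaches the end of the page
        fragments.append((start, new[start:]))
    return fragments


# Synthetic page: 6000 bytes of non-repeating content followed by free space.
used = bytes((i * 7 + 3) % 251 for i in range(6000))
old_page = used + bytes(PAGE_SIZE - len(used))
payload = b"\xff" * 8

# Case 1: overwrite 8 bytes in place.
patched = bytearray(old_page)
patched[1000:1008] = payload
overwrite = bytewise_delta(old_page, bytes(patched))

# Case 2: insert 8 bytes at the same offset; the rest shifts into free space.
new_page = old_page[:1000] + payload + old_page[1000:-len(payload)]
shift = bytewise_delta(old_page, new_page)

logged_overwrite = sum(len(fragment) for _, fragment in overwrite)
logged_shift = sum(len(fragment) for _, fragment in shift)
```

In this model the in-place overwrite yields a single 8-byte fragment, while the insertion yields one fragment that runs from the insertion point to the end of the used area, several thousand bytes for the same 8 bytes of new data.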
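Because of this, a RUM index that is changed intensively can generate considerably larger log records than GIN (which is not an extension but part of the core and therefore manages the log itself). The extent of this unpleasant effect depends heavily on the actual workload, but to get a feel for the problem, let's delete and add a number of rows several times, interleaving these operations with vacuum. The size of the log records can be estimated as follows: at the beginning and at the end, remember the log position using the pg_current_wal_location function (pg_current_xlog_location before version 10; the released version 10 and later call it pg_current_wal_lsn), and then look at the difference.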
Of course, many factors have to be kept in mind here. We need to make sure that only one user is working with the system, otherwise "extra" records will be counted. Even then, we count not only RUM but also the changes to the table itself and to the index that supports the primary key. The values of configuration parameters matter too (here the replica log level was used, without compression). But let's try anyway.
fts=# select pg_current_wal_location() as start_lsn \gset
fts=# insert into mail_messages(parent_id, sent, subject, author, body_plain, tsv)
select parent_id, sent, subject, author, body_plain, tsv
from mail_messages where id % 100 = 0;
INSERT 0 3576
fts=# delete from mail_messages where id % 100 = 99;
DELETE 3590
fts=# vacuum mail_messages;
VACUUM
fts=# insert into mail_messages(parent_id, sent, subject, author, body_plain, tsv)
select parent_id, sent, subject, author, body_plain, tsv
from mail_messages where id % 100 = 1;
INSERT 0 3605
fts=# delete from mail_messages where id % 100 = 98;
DELETE 3637
fts=# vacuum mail_messages;
VACUUM
fts=# insert into mail_messages(parent_id, sent, subject, author, body_plain, tsv)
select parent_id, sent, subject, author, body_plain, tsv from mail_messages
where id % 100 = 2;
INSERT 0 3625
fts=# delete from mail_messages where id % 100 = 97;
DELETE 3668
fts=# vacuum mail_messages;
VACUUM
fts=# select pg_current_wal_location() as end_lsn \gset
fts=# select pg_size_pretty(:'end_lsn'::pg_lsn - :'start_lsn'::pg_lsn);
pg_size_pretty
----------------
3114 MB
(1 row)
So we got about 3 GB. Repeating the same experiment with a GIN index gives only about 700 MB.
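The subtraction of two pg_lsn values used above can also be done by hand. An LSN is printed as two hexadecimal numbers separated by a slash, the high and the low 32 bits of a 64-bit byte position in the log. The Python sketch below shows that arithmetic; the helper names and the two sample LSNs are hypothetical, and the sample values were picked only so that the difference equals the 3114 MB reported above.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes(start_lsn: str, end_lsn: str) -> int:
    """Number of WAL bytes written between two LSNs."""
    return lsn_to_int(end_lsn) - lsn_to_int(start_lsn)


# Made-up positions that cross a 4 GB boundary of the high word.
start_lsn = "16/F0000000"
end_lsn = "17/B2A00000"
written_mb = wal_bytes(start_lsn, end_lsn) // (1024 * 1024)
```

This is handy when the two positions were saved as text, for example in a monitoring log, and the database is no longer at hand to do the pg_lsn subtraction.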
Therefore, a different algorithm is needed: one that, much like the diff utility, finds the minimal number of insert and delete operations that turn one state of the page into the other. Such an algorithm has already been implemented by Oleg Ivanov, and his patch is under discussion. In the example above, this patch reduces the volume of log records by about a factor of 1.5, to 1900 MB, at the cost of a slight slowdown.
Properties
As usual, let's look at the properties of the rum access method, paying attention to the differences from gin (the queries were given earlier).
Properties of the access method:
amname | name | pg_indexam_has_property
--------+---------------+-------------------------
rum | can_order | f
rum | can_unique | f
rum | can_multi_col | t
rum | can_exclude | t -- f for gin
Index properties:
name | pg_index_has_property
---------------+-----------------------
clusterable | f
index_scan | t -- f for gin
bitmap_scan | t
backward_scan | f
Note that, unlike GIN, RUM supports index scan; otherwise it would be impossible to return exactly the required number of results in queries with a LIMIT clause. Accordingly, there is no need for an analog of the gin_fuzzy_search_limit parameter. As a consequence, the index can also be used to support exclusion constraints.
Column-level properties:
name | pg_index_column_has_property
--------------------+------------------------------
asc | f
desc | f
nulls_first | f
nulls_last | f
orderable | f
distance_orderable | t -- f for gin
returnable | f
search_array | f
search_nulls | f
The difference here is that RUM supports ordering operators. This is not true for every operator class, though: for example, it is false for tsquery_ops.
To be continued.