PostgreSQL Table Partition 성능 비교 v9.6 vs v11 (+ 10)

작년에 PostgreSQL 11이 Release 되면서 많은 사항들이 개선되었는데, 필자에게 제일 와닿은 것은 Table Partition기능이 대폭 개선된 점이었다. v11에서 개선된 사항으로 인해 이전에 사용중이던 v9.6을 v11로 Upgearde할 계기가 마련되었다. 금방이라도 Upgrade를 적용하고 싶었지만, 이일 저일에 치여 미루고 있다가 이제 좀 해보려고 한다. 이 Posting은 Upgrade 타당성을 구성원들과 공유하기 위해 Survey 차원에서 실험한 내용을 정리한 것이다.

About Table Partitioning

요즘은 모르는 것들은 다 인터넷에서 찾아보기 때문에 보기가 힘들지만, 예전에 좀 산다는 친구 집에 놀러가면 거실에 백과사전 한 질씩 책장에 있는 경우가 많았다. 보통 20~30권 정도는 됐던거 같다. 이 책들을 보면 'ㄱ'으로 시작하는 단어는 1권~2권 중간까지, 'ㄴ'으로 시작하는 단어들은 2권 중간에서 3권까지... 이런 식으로 구성되어 있어서 자기가 찾으려는 낱말들의 시작 단어를 기준으로 몇 권에서 찾아야 할 지 파악해서 찾아보고, 각 권의 뒤에 있는 Index를 통해 항목을 보다 빠르게 찾을 수 있다. 만약 이 20~30권 되는 책들을 한 권으로 통합하면 어떻게 될까? 권당 300페이지라고 가정하면 우리는 6천~9천 페이지가 되는 책에서 우리가 원하는 정보를 찾아봐야 한다. 그리고 검색을 편리하기 위해 만든 Index도 매우 두껍게 되서 원하는 항목을 찾는데 보다 많은 시간이 소요하게 된다. 두 과정을 그림으로 표현하면 <pic 1>과 같다.

Table Partitioning은 (b)와 같이 한 권으로 구성된 백과사전을 일정 규칙에 의해 (a)와 같이 여러 권으로 나누는 것과 같은 개념이다. <pic 1>의 (b)와 같이 너무 커다란 Table에서 작업을 하는 것 보다는 (a)와 같이 적절히 나누어서 관리하다 특정 영역에서만 작업을 하는 것이 훨씬 빠르기 때문이다.

Table Partitioning via table inheritance (v9.6)

PostgreSQL 9.6에서는 직접적으로 Table Partitioning이 지원되지는 않는다. (비록 Documentation에 있기는 하지만... 사용자 측면에서 제대로 지원한다고 생각하지 않는다.) 하지만 몇 가지 기능을 조합하면 Table Partition을 비슷하게 운영될 수는 있다. 정의된 명칭은 없지만 v10부터 소개된 Declraritive Table Partition 기능에 대비해서 Inherit Table Partition이라고 한다. Inherit Table Partition은 아래 기능들을 통해 구현된다.

Table Inheritance
- OOP에서의 상속의 개념을 Table에서 적용하는 것이다. Parent Table의 모든 Column을 상속 받게되고, Not Null Constraint와 Check Constraint를 상속받게 된다.
- Unique Constraint, Primary Key, Foreign Key Constraint는 상속 받지 않는다.
Trigger
- Programming에서 Event를 생성해서 handling하는 것과 같은 개념이다. insert, update, delete Request를 실행하기 전에 다른 동작으로 변경하거나, 부수적인 동작을 추가하는 등으로 Handling 한다.

전체 Operation을 요약하자면 Table Schema를 정의하는 과정에서 Parent Table을 n개로 나눈 Child Table도 함께 정의하고, insert Operation에 trigger를 설정해서 insert되야할 Record가 특정 Child Table로 insert되게 구성하는 것이다.

Test 용 Table Schema 정의

Test를 위해 'star'라는 Parent Table을 생성하면서 이 Table의 Child Table인 'star00'~'star09'를 함께 생성한다. 각 Child Table들은 'star' Table의 PK인 id Column의 값을 10으로 나눠서 남는 나머지 값을 기준으로 특정 Child Table에 insert될 수 있게 Check Restriction을 정의했다. 또, Parent Table에서는 id column이 PK로 정의되었지만 이 내용은 Child Table에 상속되지 않기 때문에 Child Table에는 별도로 index를 생성해 주었다.(<code 1> 참고)

# 'star' Table 생성. Pk는 id column
CREATE TABLE public.star (
	id BIGSERIAL NOT NULL, 
	name VARCHAR(50) NOT NULL, 
	temperature INTEGER, 
	mass INTEGER, 
	radius INTEGER, 
	class VARCHAR(50), 
	pictureurl VARCHAR(255), 
	description VARCHAR(255), 
	createdtime TIMESTAMP(6) WITHOUT TIME ZONE, 
	updatedtime TIMESTAMP(6) WITHOUT TIME ZONE, 

CONSTRAINT STAR_PKEY PRIMARY KEY (id)
);

# star table에서 상속되는 star00~star09 table. 
# CHECK Constraint로 어떤 Data가 해당 Table에 저장되는 지 명시한다.
CREATE TABLE public.star00 (CHECK (mod(id,10) = 0) )INHERITS (public.star);
CREATE TABLE public.star01 (CHECK (mod(id,10) = 1) )INHERITS (public.star);
CREATE TABLE public.star02 (CHECK (mod(id,10) = 2) )INHERITS (public.star);
CREATE TABLE public.star03 (CHECK (mod(id,10) = 3) )INHERITS (public.star);
CREATE TABLE public.star04 (CHECK (mod(id,10) = 4) )INHERITS (public.star);
CREATE TABLE public.star05 (CHECK (mod(id,10) = 5) )INHERITS (public.star);
CREATE TABLE public.star06 (CHECK (mod(id,10) = 6) )INHERITS (public.star);
CREATE TABLE public.star07 (CHECK (mod(id,10) = 7) )INHERITS (public.star);
CREATE TABLE public.star08 (CHECK (mod(id,10) = 8) )INHERITS (public.star);
CREATE TABLE public.star09 (CHECK (mod(id,10) = 9) )INHERITS (public.star);

# Child Table에 Index를 설정한다.
CREATE INDEX ON public.star00 (id);
CREATE INDEX ON public.star01 (id);
CREATE INDEX ON public.star02 (id);
CREATE INDEX ON public.star03 (id);
CREATE INDEX ON public.star04 (id);
CREATE INDEX ON public.star05 (id);
CREATE INDEX ON public.star06 (id);
CREATE INDEX ON public.star07 (id);
CREATE INDEX ON public.star08 (id);
CREATE INDEX ON public.star09 (id);

<Code 1> Inherit Table Partition Schema 정의

Trigger 정의

PostgreSQL에서는 Data가 변경(Insert, Update, Delete) 전, 후에 수행되거나, 해당 Operation을 대체하는 Procedure 를 정의할 수 있고 이를 Trigger라고 한다. Inherit Table Partition에서는 Parent Table로 Insert operation이 실행될 때 발생하는 Event를 가로채서, Parent Table이 아닌 조건에 맞는 Child Table로 Insert되는 Operation으로 대체한다.

<code 2>는 'star' table에 Insert문이 수행되기 전에 star_insert_func()라는 function을 실행하라는 Trigger 이다.

$$ language plpgsql;

CREATE TRIGGER star_insert
	BEFORE INSERT ON star
	FOR EACH ROW EXECUTE PROCEDURE star_insert_func();

<Code 2> Trigger 정의

star_insert_func()는 'star' table에 추가하려는 각 Record의 id값을 10으로 나눈 나머지를 기준으로 'star00'~'star09' 중 하나의 child table을 선택해서 그 Table에 Insert하게 한다. 이 과정을 통해서 'star' table에 저장될 data는 'star00'~'star09'의 child table에 나뉘어서 저장되게 된다. <code 3>는 star_insert_func()의 구현 Code 이다.

CREATE OR REPLACE FUNCTION star_insert_func() RETURNS TRIGGER AS $$
BEGIN
	if (mod(new.id,10) = 1) THEN INSERT INTO star01 VALUES(new.*);
	elsif (mod(new.id,10) = 2) THEN INSERT INTO star02 VALUES(new.*);
	elsif (mod(new.id,10) = 3) THEN INSERT INTO star03 VALUES(new.*);
	elsif (mod(new.id,10) = 4) THEN INSERT INTO star04 VALUES(new.*);
	elsif (mod(new.id,10) = 5) THEN INSERT INTO star05 VALUES(new.*);
	elsif (mod(new.id,10) = 6) THEN INSERT INTO star06 VALUES(new.*);
	elsif (mod(new.id,10) = 7) THEN INSERT INTO star07 VALUES(new.*);
	elsif (mod(new.id,10) = 8) THEN INSERT INTO star08 VALUES(new.*);
	elsif (mod(new.id,10) = 9) THEN INSERT INTO star09 VALUES(new.*);
	else INSERT INTO star00 VALUES(new.*);
	end if;

	return null;
END;

<Code 3> Trigger에 의해 실행되는 Function 정의

Test Result

위와 같이 정의한 후에 id 값을 증가시키면서 2백만개의 Data를 'star' Table에 Insert 시켜보았다. 그 후에 각 Table의 Row Count를 확인해 보면 Child Table인 'star00'~'star09'에 아래와 같이 2십만개씩 나뉘어서 저장된 결과를 확인할 수 있다.

galaxy=# SELECT relname AS tableName, n_live_tup AS rowCount FROM pg_stat_user_tables WHERE relname LIKE 'star%';
 tablename | rowcount 
-----------+----------
 star      |        0
 star00    |   200000
 star01    |   200000
 star02    |   200000
 star03    |   200000
 star04    |   200000
 star05    |   200000
 star06    |   200000
 star07    |   200000
 star08    |   200000
 star09    |   200000
(11 rows)

그리고 위의 Data를 이용해서 몇 가지 Test를 했다.

Test 1) id값을 기준으로 한 단일 Record 검색 ;

'star' Table에서 SELECT Query를 실행할 때, Parent Table과 모든 Child Table을 검색한 후에 결과를 Merge하고 있다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE id=1003008;
                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1085.47 rows=11 width=714) (actual time=3.247..4.193 rows=1 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Append  (cost=0.00..84.38 rows=11 width=714) (actual time=0.194..0.230 rows=1 loops=1)
         ->  Seq Scan on star  (cost=0.00..0.00 rows=1 width=2100) (actual time=0.005..0.005 rows=0 loops=1)
               Filter: (id = 1003008)
         ->  Index Scan using star00_id_idx on star00  (cost=0.42..8.44 rows=1 width=575) (actual time=0.037..0.037 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star01_id_idx on star01  (cost=0.42..8.44 rows=1 width=575) (actual time=0.021..0.021 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star02_id_idx on star02  (cost=0.42..8.44 rows=1 width=575) (actual time=0.019..0.019 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star03_id_idx on star03  (cost=0.42..8.44 rows=1 width=575) (actual time=0.019..0.019 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star04_id_idx on star04  (cost=0.42..8.44 rows=1 width=575) (actual time=0.016..0.016 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star05_id_idx on star05  (cost=0.42..8.44 rows=1 width=575) (actual time=0.018..0.018 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star06_id_idx on star06  (cost=0.42..8.44 rows=1 width=575) (actual time=0.018..0.018 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star07_id_idx on star07  (cost=0.42..8.44 rows=1 width=575) (actual time=0.017..0.017 rows=0 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star08_id_idx on star08  (cost=0.42..8.44 rows=1 width=575) (actual time=0.023..0.023 rows=1 loops=1)
               Index Cond: (id = 1003008)
         ->  Index Scan using star09_id_idx on star09  (cost=0.42..8.44 rows=1 width=575) (actual time=0.035..0.035 rows=0 loops=1)
               Index Cond: (id = 1003008)
 Planning time: 0.965 ms
 Execution time: 4.315 ms
(29 rows)

Test 2) id값들의 List로 복수 Record 검색 ; 1과 동일하게 모든 Child Table을 Scan한 후에 Merge하고 있다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE id IN (1003003,1284006,192009);
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1176.23 rows=31 width=624) (actual time=3.417..4.476 rows=3 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Append  (cost=0.00..173.13 rows=31 width=624) (actual time=0.108..0.266 rows=3 loops=1)
         ->  Seq Scan on star  (cost=0.00..0.00 rows=1 width=2100) (actual time=0.006..0.006 rows=0 loops=1)
               Filter: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star00_id_idx on star00  (cost=0.42..17.31 rows=3 width=575) (actual time=0.029..0.029 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star01_id_idx on star01  (cost=0.42..17.31 rows=3 width=575) (actual time=0.023..0.023 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star02_id_idx on star02  (cost=0.42..17.31 rows=3 width=575) (actual time=0.024..0.024 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star03_id_idx on star03  (cost=0.42..17.31 rows=3 width=575) (actual time=0.026..0.033 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star04_id_idx on star04  (cost=0.42..17.31 rows=3 width=575) (actual time=0.020..0.020 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star05_id_idx on star05  (cost=0.42..17.31 rows=3 width=575) (actual time=0.019..0.019 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star06_id_idx on star06  (cost=0.42..17.31 rows=3 width=575) (actual time=0.025..0.025 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star07_id_idx on star07  (cost=0.42..17.31 rows=3 width=575) (actual time=0.019..0.019 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star08_id_idx on star08  (cost=0.42..17.31 rows=3 width=575) (actual time=0.018..0.018 rows=0 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
         ->  Index Scan using star09_id_idx on star09  (cost=0.42..17.31 rows=3 width=575) (actual time=0.021..0.049 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003003,1284006,192009}'::bigint[]))
 Planning time: 0.510 ms
 Execution time: 4.614 ms
(29 rows)

Test 3) 비 Index Column을 이용한 일정 Count의 Record 검색 ;

Child Table을 모두 Search하려고 하는 것은 동일하지만, 제일 처음 검색한 star00 table에서 10개의 Record가 추출되자 'star01'~'star09' table에서는 SELECT query가 실행되지 않고 있다. Partitioning을 하기 전과 상이한 결과가 나오는 경우이기 때문에 주의해야 한다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE radius BETWEEN 20 AND 30 limit 10;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1003.61 rows=10 width=575) (actual time=4.082..5.001 rows=10 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Limit  (cost=0.00..2.61 rows=10 width=575) (actual time=0.044..0.079 rows=10 loops=1)
         ->  Append  (cost=0.00..56842.25 rows=217541 width=575) (actual time=0.043..0.077 rows=10 loops=1)
               ->  Seq Scan on star  (cost=0.00..0.00 rows=1 width=2100) (actual time=0.005..0.005 rows=0 loops=1)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star00  (cost=0.00..5682.49 rows=21503 width=575) (actual time=0.038..0.054 rows=10 loops=1)
                     Filter: ((radius >= 20) AND (radius <= 30))
                     Rows Removed by Filter: 110
               ->  Seq Scan on star01  (cost=0.00..5675.74 rows=21685 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star02  (cost=0.00..5679.86 rows=21699 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star03  (cost=0.00..5689.77 rows=21845 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star04  (cost=0.00..5680.65 rows=21857 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star05  (cost=0.00..5691.49 rows=21543 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star06  (cost=0.00..5690.49 rows=21790 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star07  (cost=0.00..5681.97 rows=21860 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star08  (cost=0.00..5685.94 rows=21653 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star09  (cost=0.00..5683.84 rows=22105 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
 Planning time: 0.413 ms
 Execution time: 5.064 ms
(31 rows)

Test 4) Insert Test ; INSERT시에 Trigger가 발동되는 점 외에 특이 사항은 없다.

galaxy=# EXPLAIN ANALYZE INSERT INTO star(id, name, temperature,mass,radius,class,pictureurl,description) VALUES (8000001, 'virtual', 100, 1, 100, 'G', '', 'virtual star');
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Insert on star  (cost=0.00..0.01 rows=1 width=2100) (actual time=1.242..1.242 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=2100) (actual time=0.000..0.000 rows=1 loops=1)
 Planning time: 0.021 ms
 Trigger star_insert: time=0.379 calls=1
 Execution time: 1.253 ms
(5 rows)

Test 5) Update Test ;

UPDATE 실행할 때에도 모든 child table을 검사하면서 대상 Record를 찾은 후에 UPDATE를 실행한다.

galaxy=# EXPLAIN ANALYZE UPDATE star SET class='X' WHERE id='101010';
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Update on star  (cost=0.00..84.38 rows=11 width=1187) (actual time=0.150..0.150 rows=0 loops=1)
   Update on star
   Update on star00
   Update on star01
   Update on star02
   Update on star03
   Update on star04
   Update on star05
   Update on star06
   Update on star07
   Update on star08
   Update on star09
   ->  Seq Scan on star  (cost=0.00..0.00 rows=1 width=2106) (actual time=0.003..0.003 rows=0 loops=1)
         Filter: (id = '101010'::bigint)
   ->  Index Scan using star00_id_idx on star00  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.020..0.020 rows=1 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star01_id_idx on star01  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.011..0.011 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star02_id_idx on star02  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.008..0.008 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star03_id_idx on star03  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star04_id_idx on star04  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.006..0.006 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star05_id_idx on star05  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star06_id_idx on star06  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star07_id_idx on star07  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.006..0.006 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star08_id_idx on star08  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star09_id_idx on star09  (cost=0.42..8.44 rows=1 width=1095) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
 Planning time: 0.574 ms
 Execution time: 0.302 ms
(36 rows)

Inherit Table Partition의 한계

위에서도 확인할 수 있었겠지만 v9.6에서의 Table Partition 기능 사용에는 다음과 한계가 있다.

Data Consistency(일관성)를 직접 관리해야 한다. 만약 CHECK 문을 잘못 작성하게 되면 두 개 이상의 Child Table에 중복된 Data가 존재하게 되고 이는 추후 문제를 야기시킬 가능성이 높다.
Insert를 수행할 때, Child Table로 나눠주는 역할을 하는 Trigger가 반드시 필요하고 이로 인해 시간 지연이 발생한다.
Child Table들을 유지 관리하는 데 많은 수작업이 필요하고, 흔하지 않게 Re-partitioning이 필요한 경우와 같이 매우 큰 작업이 필요한 경우 대응을 쉽게 할 수 없다.
Index, constraint와 같은 Table 관리 목적의 Command들이 Child Table에 상속되지 않기 때문에 명시적으로 정의해줘야 한다.
DB opertaion이 모든 Child Table을 대상으로 수행하게 되고, 전체 성능이 저하된다.
모든 Child Table을 대상으로 Query가 수행하게 되는 과정에서, 조회 결과에 왜곡이 발생할 수 있다.

Declarative Table Partitioning (v10 or later)

PostgreSQL 10부터는 Table Partition 관련 기능이 많이 강화되었다. 가지고 있는 기능들을 잘 엮어서 마치 Table Partition을 구현한 것과 같이 보이는 꼼수(?)가 아니라, 명시적으로 'Declarative Table Partioning'라는 이름으로 기능을 정의하고 있다. 이 포스팅에서는 v11에서 사용한 경험을 바탕으로 정리했다. (v10에서 지원되지 않았던 Hash Partitioning 기능 때문에 v10은 고려하지 않았고, 참고를 위해 포스팅 마지막 부분에 조금 언급했다.)

위의 예에서 사용했던 Schema로 Parent Table과 Partition Table을 정의하면 <code 4>와 같다

# 'star' Table을 정의하고 Partitioning된다는 것을 명시
CREATE TABLE public.star (
	id BIGSERIAL NOT NULL, 
	name VARCHAR(50) NOT NULL, 
	temperature INTEGER, 
	mass INTEGER, 
	radius INTEGER, 
	class VARCHAR(50), 
	pictureurl VARCHAR(255), 
	description VARCHAR(255), 
	createdtime TIMESTAMP(6) WITHOUT TIME ZONE, 
	updatedtime TIMESTAMP(6) WITHOUT TIME ZONE, 

CONSTRAINT STAR_PKEY PRIMARY KEY (id)
) PARTITION BY HASH(id);

# 각 Partition Table을 정의하고 어떤 기준으로 Partition 되는 지 명시
CREATE TABLE public.star00 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 0);
CREATE TABLE public.star01 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 1);
CREATE TABLE public.star02 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 2);
CREATE TABLE public.star03 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 3);
CREATE TABLE public.star04 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 4);
CREATE TABLE public.star05 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 5);
CREATE TABLE public.star06 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 6);
CREATE TABLE public.star07 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 7);
CREATE TABLE public.star08 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 8);
CREATE TABLE public.star09 PARTITION OF public.star FOR VALUES WITH (MODULUS 10, REMAINDER 9);

<Code 4> Declarative Table Partition 정의

Declarative Partition에서는 'PARTITION BY' 와 'PARTITION OF', 2 가지 Keyword를 살펴봐야 한다.

PARTITION BY ; Partitioning할 대상 Table을 명시하며, 어떤 방식으로 Paritioning을 할 지 정의한다.

PARTITION BY [RANGE | LIST | HASH]

[ PARTITION BY { RANGE | LIST | HASH } ( { column_name | ( expression ) }

Range Partitioning : 특정 Column이 가지는 값들의 범위로 Partition을 나눈다. 예를 들면 date값을 기준으로 매달 1일에서 말일까지 범위로 Partition을 나누는 경우이다.
List Partitioning : 특정 Column이 가지는 값들의 집합으로 Partition을 나눈다. 예를 들면 location 값을 기준으로 서울, 부산, 인천 등 Data 값들의 List를 기준으로 Partition을 나누는 경우이다.
Hash Partitioning : Partition을 나눌 명확한 방법이 없을 경우 유용하다. 특히 향후 거대해질 것으로 예상되는데 초기단계에서 Partition을 설정하기 어려울 때 더욱 유용하다. 이 포스팅에서는 자동으로 증가하는 id 값을 기준으로 Partition을 정의했다.

PARTITION OF : Partition table을 생성한다. 어떤 Table이 Partition Table인지 명시하고 'FOR VALUES WITH' keyword로 포함할 Record의 조건을 명시하거나, 'DEFAULT' keyword로 Default Partition Table 여부를 정의한다.

PARTITION OF parent_table { FOR VALUES partition_bound_spec | DEFAULT }

partition_bound_spec

IN ( { in_condition } [, ...] )

| FROM ( { range_condition } [, ...] ) TO ( { range_condition } [, ...] )

| WITH ( MODULUS numeric_literal, REMAINDER numeric_literal )

in_condition : numeric_literal | string_literal | TRUE | FALSE | NULL

range_condition : numeric_literal | string_literal | TRUE | FALSE | MINVALUE | MAXVALUE

위의 언급한 Syntax 중에서 partition_bound_spec을 좀 더 설명하면 FROM~TO는 Range Partitioning시에, IN은 List Partitioning을 할 때, WITH는 Hash Partitioning을 할 때 사용하는 Keyword이다. 각 Syntax의 예를 위에 제시한 Table을 이용해서 아래와 같이 정리했으니 참고하시면 된다.

PARTITION BY RANGE(id) : id 범위를 기준으로 한 RANGE Partition
- partition_bound_spec 설정 예 : FOR VALUES FROM (1) to (1000)
PARTITION BY LIST(class) : class column 기준으로 한 LIST Partition
- partition_bound_spec 설정 예 : FOR VALUES IN ('G', 'V')
PARTITION BY HASH(id) : id column을 기준으로 한 HASH Partition
- partition_bound_spec 설정 예 : FOR VALUES WIH (MODULUS 10, REMAINDER 5)

Test Result

v9.6에서와 동일한 방식으로 2백만개의 Data를 추가했을 때와 동일하게 2백만개의 Record를 추가해보고 각 Table에 어느정도 Record가 입력되었는지 확인해 보았다. 이전과 달리 Parent Table은 실제 Record를 가지고 있지 않고 Wrapper의 역할만 하는 것처럼 보여진다. Row Count가 각 Table마다 20만건 내외이기는 하나 약간씩 차이를 보이는 것도 참고하자.

galaxy=# SELECT relname AS tableName, n_live_tup AS rowCount FROM pg_stat_user_tables WHERE relname LIKE 'star%';
 tablename | rowcount 
-----------+----------
 star00    |   199609
 star01    |   200622
 star02    |   199861
 star03    |   200096
 star04    |   199670
 star05    |   199611
 star06    |   200367
 star07    |   199670
 star08    |   199844
 star09    |   200649
(10 rows)

galaxy=# SELECT COUNT(*) FROM star;
  count  
---------
 2000000
(1 row)

galaxy=# SELECT COUNT(*) FROM star00;
 count  
--------
 199609
(1 row)

Test 1) id값을 기준으로 한 단일 Record 검색 ;

이전과 달리 해당 Child Table에서만 Index Search를 하고 있다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE id=1003008;
                                                            QUERY PLAN                                                            
----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.42..1008.54 rows=1 width=575) (actual time=2.974..3.929 rows=1 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Append  (cost=0.42..8.44 rows=1 width=575) (actual time=0.036..0.037 rows=1 loops=1)
         ->  Index Scan using star06_pkey on star06  (cost=0.42..8.44 rows=1 width=575) (actual time=0.036..0.036 rows=1 loops=1)
               Index Cond: (id = 1003008)
 Planning Time: 0.248 ms
 Execution Time: 3.983 ms
(9 rows)

Test 2) id값들의 List로 복수 Record 검색 ;

1과 동일하게 해당하는 Child Table들을 대상으로 Scan한 후에 Merge하고 있다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE id IN (1003000, 1284000, 192001);
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.42..1052.88 rows=9 width=575) (actual time=3.008..4.238 rows=3 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Append  (cost=0.42..51.98 rows=9 width=575) (actual time=0.046..0.269 rows=3 loops=1)
         ->  Index Scan using star01_pkey on star01  (cost=0.42..17.31 rows=3 width=575) (actual time=0.045..0.061 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003000,1284000,192001}'::bigint[]))
         ->  Index Scan using star06_pkey on star06  (cost=0.42..17.31 rows=3 width=575) (actual time=0.072..0.081 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003000,1284000,192001}'::bigint[]))
         ->  Index Scan using star09_pkey on star09  (cost=0.42..17.31 rows=3 width=575) (actual time=0.126..0.126 rows=1 loops=1)
               Index Cond: (id = ANY ('{1003000,1284000,192001}'::bigint[]))
 Planning Time: 0.326 ms
 Execution Time: 4.256 ms
(13 rows)

Test 3) 비 Index Column을 이용한 일정 Count의 Record 검색 ;

이전 Test와 동일하게 제일 처음 검색한 'star00' table에서 10개의 Record가 추출되자 'star01'~'star09' table에서는 SELECT query가 실행되지 않고 있다. Parent Table에 대한 scan이 이루어 지지 않는 다는 점과, 이전과 다르게 sequence scan으로 수행하고 있다.

galaxy=# EXPLAIN ANALYZE SELECT * FROM star WHERE radius BETWEEN 20 AND 30 limit 10;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1003.57 rows=10 width=575) (actual time=3.241..4.556 rows=10 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   ->  Limit  (cost=0.00..2.57 rows=10 width=575) (actual time=0.020..0.039 rows=10 loops=1)
         ->  Append  (cost=0.00..56740.11 rows=220370 width=575) (actual time=0.019..0.036 rows=10 loops=1)
               ->  Seq Scan on star00  (cost=0.00..5534.34 rows=22208 width=575) (actual time=0.019..0.035 rows=10 loops=1)
                     Filter: ((radius >= 20) AND (radius <= 30))
                     Rows Removed by Filter: 134
               ->  Seq Scan on star01  (cost=0.00..5595.17 rows=22157 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star02  (cost=0.00..5542.52 rows=22172 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star03  (cost=0.00..5572.99 rows=21870 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star04  (cost=0.00..5552.20 rows=21583 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star05  (cost=0.00..5546.81 rows=21626 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star06  (cost=0.00..5577.41 rows=21924 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star07  (cost=0.00..5560.62 rows=22136 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star08  (cost=0.00..5554.90 rows=21996 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
               ->  Seq Scan on star09  (cost=0.00..5601.30 rows=22698 width=575) (never executed)
                     Filter: ((radius >= 20) AND (radius <= 30))
 Planning Time: 0.603 ms
 Execution Time: 4.586 ms
(29 rows)

Test 4) Insert Test ; INSERT시에는 별다른 특이 사항은 없다.

galaxy=# EXPLAIN ANALYZE INSERT INTO star(id, name, temperature,mass,radius,class,pictureurl,description) VALUES (8000001, 'virtual', 100, 1, 100, 'G', '', 'virtual star');
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Insert on star  (cost=0.00..0.01 rows=1 width=1304) (actual time=0.051..0.051 rows=0 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=1304) (actual time=0.001..0.002 rows=1 loops=1)
 Planning Time: 0.046 ms
 Execution Time: 0.101 ms
(4 rows)

Test 5) Update Test ;

UPDATE 실행할 때에는 모든 child table을 검사하면서 대상 Record를 찾은 후에 UPDATE를 실행한다.

galaxy=# EXPLAIN ANALYZE UPDATE star SET class='X' WHERE id='101010';
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Update on star  (cost=0.42..84.38 rows=10 width=697) (actual time=0.124..0.124 rows=0 loops=1)
   Update on star00
   Update on star01
   Update on star02
   Update on star03
   Update on star04
   Update on star05
   Update on star06
   Update on star07
   Update on star08
   Update on star09
   ->  Index Scan using star00_pkey on star00  (cost=0.42..8.44 rows=1 width=697) (actual time=0.013..0.013 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star01_pkey on star01  (cost=0.42..8.44 rows=1 width=697) (actual time=0.005..0.005 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star02_pkey on star02  (cost=0.42..8.44 rows=1 width=697) (actual time=0.009..0.009 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star03_pkey on star03  (cost=0.42..8.44 rows=1 width=697) (actual time=0.009..0.009 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star04_pkey on star04  (cost=0.42..8.44 rows=1 width=697) (actual time=0.009..0.009 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star05_pkey on star05  (cost=0.42..8.44 rows=1 width=697) (actual time=0.009..0.009 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star06_pkey on star06  (cost=0.42..8.44 rows=1 width=697) (actual time=0.006..0.006 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star07_pkey on star07  (cost=0.42..8.44 rows=1 width=697) (actual time=0.008..0.008 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star08_pkey on star08  (cost=0.42..8.44 rows=1 width=697) (actual time=0.008..0.008 rows=0 loops=1)
         Index Cond: (id = '101010'::bigint)
   ->  Index Scan using star09_pkey on star09  (cost=0.42..8.44 rows=1 width=697) (actual time=0.009..0.010 rows=1 loops=1)
         Index Cond: (id = '101010'::bigint)
 Planning Time: 0.405 ms
 Execution Time: 0.171 ms
(33 rows)

v11에서 개선된 사항

v10에서 Declaritive Table Partition 기능이 처음 발표되기는 했지만 사용이 조금 편리해진 점 외에는 제약 사항이 많았었고, v11에서 와서야 기능이 대폭으로 개선되었다. 요약하면 아래와 같다. (필자의 경우에는 1번 항목이 Version Upgrade를 결정하게된 Key Point 였다)

Hash Partition 지원. v10에서는 특정 Column의 Data 범위로 나뉘는 Range Partition과 Value List로 나뉘는 List Partition만 지원했다.
Query 실행 시 상관없는 Partition을 제외하고 Query Plan을 세우면서 모든 Command에 대해 성능이 향상되었다.
Update query에 의한 Child Table간의 Record 이동이 가능해졌다.
Default child table을 정의할 수 있어, 조건에 맞는 Partition Table이 반드시 존재해야 하는 제약 조건이 없어지게 됐다.
Parent Table에 정의된 Index는 자동으로 Child Table에 적용되며, Child Table에서 Foreign Key를 사용할 수 있다.
Child Table에서 Unique Index를 사용할 수 있다. (단, Unique Index는 Partition Key를 포함하고 있어야 한다.)

Table Partition 기능에 한해서 평가하자면, PostgreSQL v11에 와서야 비로서 쓸만한 수준이 된 것으로 보이며, 많은 사람이 이런 내용을 정리해서 Posting했었다. 아울러 PostgreSQL 공식 홈페이지의 wiki에서는 현재까지 개선된 점과 Limitation을 정리해놓은 내용이 있다. (Link)

성능 비교

	9.6	9.6 Partitioning	11 Partitioning
SELECT single record by Index	Planning time: 0.113 ms Execution time: 3.244 ms	Planning time: 0.965 ms Execution time: 4.315 ms	Planning Time: 0.248 ms Execution Time: 3.983 ms
SELECT multiple records by Index	Planning time: 0.094 ms Execution time: 3.218 ms	Planning time: 0.510 ms Execution time: 4.614 ms	Planning Time: 0.326 ms Execution Time: 4.256 ms
Select records by non-index	Planning time: 0.044 ms Execution time: 2.896 ms	Planning time: 0.413 ms Execution time: 5.064 ms	Planning Time: 0.603 ms Execution Time: 4.586 ms
Insert	Planning time: 0.050 ms Execution time: 0.113 ms	Planning time: 0.021 ms Execution time: 1.253 ms	Planning Time: 0.046 ms Execution Time: 0.101 ms
Update	Planning Time: 0.071 ms Execution Time: 107.692 ms	Planning time: 0.574 ms Execution time: 0.302 ms	Planning Time: 0.405 ms Execution Time: 0.171 ms
<Table 1> Table Partition 적용/미적용 결과 비교

200만 Record를 가진 상태에서 위의 Test를 수행하고 결과를 비교했을 때 눈에 띄는 점은 다음과 같다.

v11에서의 Table Partition 성능이 v9.6에 비해 전체적으로 향상되었다. 특히 Insert 속도는 매우 향상된 점이 눈에 띈다.
200 만개 Record 수준에서는 Table Partitioning에 의한 SELECT query의 성능 향상은 찾기 어렵다.
Update는 Partitioning을 했을 때가 훨씬 유리하고, 특히 v11에서 그 효과가 더 컸다.

Table Partition을 적용할 지 여부는 뚜렷한 기준은 없다. 위의 예를 기준으로 생각하면 만약 Table에 Update가 빈번하게 이루어진다면 Table Partition을 수행해야 하고, Insert 후에 단순 조회만 많은 경우라면 굳이 할 이유가 없다. 이건 각자의 상황에 따라 결정이 달라지기 때문에 어느 정도의 Test 후에 결정해야 한다.

PostgreSQL 10에서의 Declarative Table Partitioning

v10에서 Table Partioning 기능을 간단하게 보면 아래와 같다. Partition을 나누는 기준이 Range와 List로 제한되어 있기 때문에 사전에 입력되는 값을 예상해서 Partition을 나눠줘야 한다. (v10에서는 Default Partition을 지원하지 않기 때문에 모든 Partition의 합집합에 빈 틈이 있어서는 안된다.)

가령 위의 Test예와 같은 경우를 가정하면 미리 2백만개의 Record가 입력될 것을 예상해서 20만개 단위로 Partition Table을 정의해야만 한다. <code 5>에서는 id값을 기준으로 20만개씩 Partition을 나눈 예이다.

# 'star' Table 명시
CREATE TABLE public.star (
	id BIGSERIAL NOT NULL, 
	name VARCHAR(50) NOT NULL, 
	temperature INTEGER, 
	mass INTEGER, 
	radius INTEGER, 
	class VARCHAR(50), 
	pictureurl VARCHAR(255), 
	description VARCHAR(255), 
	createdtime TIMESTAMP(6) WITHOUT TIME ZONE, 
	updatedtime TIMESTAMP(6) WITHOUT TIME ZONE

) PARTITION BY RANGE(id);

# id 범위를 기준으로 Child Table들을 생성
CREATE TABLE public.star00 PARTITION OF public.star FOR VALUES FROM (1) TO (200001);
CREATE INDEX ON public.star00 (id);
CREATE TABLE public.star01 PARTITION OF public.star FOR VALUES FROM (200001) TO (400001);
CREATE INDEX ON public.star01 (id);
CREATE TABLE public.star02 PARTITION OF public.star FOR VALUES FROM (400001) TO (600001);
CREATE INDEX ON public.star02 (id);
CREATE TABLE public.star03 PARTITION OF public.star FOR VALUES FROM (600001) TO (800001);
CREATE INDEX ON public.star03 (id);
CREATE TABLE public.star04 PARTITION OF public.star FOR VALUES FROM (800001) TO (1000001);
CREATE INDEX ON public.star04 (id);
CREATE TABLE public.star05 PARTITION OF public.star FOR VALUES FROM (1000001) TO (1200001);
CREATE INDEX ON public.star05 (id);
CREATE TABLE public.star06 PARTITION OF public.star FOR VALUES FROM (1200001) TO (1400001);
CREATE INDEX ON public.star06 (id);
CREATE TABLE public.star07 PARTITION OF public.star FOR VALUES FROM (1400001) TO (1600001);
CREATE INDEX ON public.star07 (id);
CREATE TABLE public.star08 PARTITION OF public.star FOR VALUES FROM (1600001) TO (1800001);
CREATE INDEX ON public.star08 (id);
CREATE TABLE public.star09 PARTITION OF public.star FOR VALUES FROM (1800001) TO (2000001);
CREATE INDEX ON public.star09 (id);

<Code 5> v10에서의 Declarative Table Partion 예

위와 같이 정의하면 id 값이 2,000,000인 Record까지 10개의 Partition에 나뉘어서 입력되게 된다. 만약 2,000,001 번째 Record가 추가가 되면, 적절한 Partition을 찾지 못했다고 Error가 발생되게 된다. 이를 해결하려면 사용자가 수동으로 새로운 Partition을 정의해 줘야 하는데, 이런 점이 Hash Partitioning과 Default Partition 설정을 지원 못해서 생기는 문제이고, 필자가 v10을 고려하지 않았던 이유이다.

'Database > PostgreSQL' 카테고리의 다른 글

pgbench를 통한 Postgresql 성능 분석 (0)	2019.11.08
알아두면 유용한 psql 명령어 정리 (4)	2019.10.30
PostgreSQL Log로 DB 성능 분석 (feat. pgBadger) (2)	2019.06.03
PostgreSQL Log 설정 - 2 (10)	2019.05.10
PostgreSQL DB Backup 및 Restore (3)	2018.12.31

未星의 기술 학습장

PostgreSQL Table Partition 성능 비교 v9.6 vs v11 (+ 10)

About Table Partitioning

Table Partitioning via table inheritance (v9.6)

Declarative Table Partitioning (v10 or later)

성능 비교

PostgreSQL 10에서의 Declarative Table Partitioning

'Database > PostgreSQL' 카테고리의 다른 글

티스토리툴바

PostgreSQL Table Partition 성능 비교 v9.6 vs v11 (+ 10)

About Table Partitioning

Table Partitioning via table inheritance (v9.6)

Declarative Table Partitioning (v10 or later)

성능 비교

PostgreSQL 10에서의 Declarative Table Partitioning

'Database > PostgreSQL' 카테고리의 다른 글

'Database/PostgreSQL' Related Articles

티스토리툴바