presto_tuning

Presto tuning

Presto ์กฐ์ธ ๋ฐ ์ •๋ ฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์„ ํƒ

  • Partitioned

    • image

      • Partitioned : ์ฟผ๋ฆฌ์— ์ฐธ์—ฌํ•˜๋Š” ๊ฐ ๋…ธ๋“œ๋Š” ๋ฐ์ดํ„ฐ์˜ ์ผ๋ถ€์—์„œ ํ•ด์‹œ ํ…Œ์ด๋ธ”์„ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค.

      • Presto์—์„œ ์‚ฌ์šฉ๋˜๋Š” ๊ธฐ๋ณธ ์กฐ์ธ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๋ถ„์‚ฐ๋œ PARTITION ์กฐ์ธ์ž…๋‹ˆ๋‹ค. ์ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์กฐ์ธ ํ‚ค์˜ ํ•ด์‹œ ๊ฐ’์„ ์‚ฌ์šฉํ•˜์—ฌ ์™ผ์ชฝ ํ…Œ์ด๋ธ”๊ณผ ์˜ค๋ฅธ์ชฝ ํ…Œ์ด๋ธ”์„ ๋ชจ๋‘ ๋ถ„ํ• ํ•ฉ๋‹ˆ๋‹ค. ๋ถ„ํ• ๋œ ์กฐ์ธ์€ ์ฟผ๋ฆฌ๋ฅผ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด ์—ฌ๋Ÿฌ ์ž‘์—…์ž ๋…ธ๋“œ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

      • ์ผ๋ฐ˜์ ์œผ๋กœ ๋” ๋น ๋ฅด๊ณ  ๋” ์ ์€ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

      • -- SET SESSION join_distribution_type = 'PARTITIONED'

  • Broadcast

    • image

      • Broadcast : ์ฟผ๋ฆฌ์— ์ฐธ์—ฌํ•˜๋Š” ๊ฐ ๋…ธ๋“œ๋Š” ๋ชจ๋“  ๋ฐ์ดํ„ฐ์—์„œ ํ•ด์‹œ ํ…Œ์ด๋ธ”์„ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค.(๋ฐ์ดํ„ฐ๋Š” ๊ฐ ๋…ธ๋“œ์— ๋ณต์ œ๋จ).

      • ์กฐ์ธ ํ…Œ์ด๋ธ” ์ค‘ ํ•˜๋‚˜๊ฐ€ ๋งค์šฐ ์ž‘์€ ์ผ๋ถ€ ๊ฒฝ์šฐ ๋ถ„์‚ฐ PARTITION ์กฐ์ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋„คํŠธ์›Œํฌ๋ฅผ ํ†ตํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„ํ• ํ•˜๋Š” ์˜ค๋ฒ„ํ—ค๋“œ๊ฐ€ ์กฐ์ธ ์ž‘์—…์— ์ฐธ์—ฌํ•˜๋Š” ๋ชจ๋“  ๋…ธ๋“œ์— ์ „์ฒด ํ…Œ์ด๋ธ”์„ ๋ธŒ๋กœ๋“œ์บ์ŠคํŠธํ•˜๋Š” ์ด์ ์„ ์ดˆ๊ณผํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๊ฒฝ์šฐ 'BROADCAST' ์กฐ์ธ์ด ๋” ์ž˜ ์ˆ˜ํ–‰๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

      • BROADCAST ์กฐ์ธ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ์กฐ์ธ ์ ˆ์—์„œ ๋จผ์ € ํฐ ํ…Œ์ด๋ธ”์„ ์ง€์ •ํ•œ๋’ค์— ๋งค์ง ์ฝ”๋ฉ˜ํŠธ๋ฅผ ์ง€์ •ํ•˜์—ฌ BROADCAST ์กฐ์ธ์„ ํ™œ์„ฑํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

      • -- SET SESSION join_distribution_type = 'BROADCAST'

        • ์ด ์˜ต์…˜์€ ์˜ฌ๋ฐ”๋ฅธ ์กฐ์ธ ํ…Œ์ด๋ธ”์ด ๋ชจ๋“  ๋…ธ๋“œ์— ๋ณต์‚ฌ๋˜๋ฏ€๋กœ ๋” ๋งŽ์€ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

  • ์ •๋ ฌ ์•Œ๊ณ ๋ฆฌ์ฆ˜

    • ORDER BY ์ ˆ์ด ์žˆ๋Š” ์ฟผ๋ฆฌ์˜ ๊ฒฝ์šฐ Presto๋Š” ๊ธฐ๋ณธ์ ์œผ๋กœ ๋ถ„์‚ฐ ์ •๋ ฌ์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ •๋ ฌ ์ž‘์—…์€ ์—ฌ๋Ÿฌ ๋…ธ๋“œ์—์„œ ๋ณ‘๋ ฌ๋กœ ์‹คํ–‰๋˜๋ฉฐ ๋‹จ์ผ ๋…ธ๋“œ๊ฐ€ ์ตœ์ข… ๊ฒฐ๊ณผ๋ฅผ ๋ณ‘ํ•ฉํ•ฉ๋‹ˆ๋‹ค.

    • ๊ทธ๋Ÿฌ๋‚˜ ๋‹จ์ผ ๋…ธ๋“œ ์ •๋ ฌ ์ž‘์—…์ด ๋” ์ž˜ ์ˆ˜ํ–‰๋˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ์žˆ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๊ฒฝ์šฐ ๋งค์ง ์ฝ”๋ฉ˜ํŠธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ถ„์‚ฐ ์ •๋ ฌ์„ ๋น„ํ™œ์„ฑํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

      • -- set distributed_sort=false

      • ๋ถ„์‚ฐ ์ •๋ ฌ์„ ๋น„ํ™œ์„ฑํ™”ํ•˜๋ฉด ๋‹จ์ผ ๋…ธ๋“œ์˜ ๋ฉ”๋ชจ๋ฆฌ๊ฐ€ ๋” ๋งŽ์ด ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ์ตœ๋Œ€ ๋ฉ”๋ชจ๋ฆฌ ์ œํ•œ์œผ๋กœ ์ธํ•ด ์ฟผ๋ฆฌ๊ฐ€ ์‹คํŒจํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Presto ํŠœ๋‹

  • CREATE TABLE AS SELECT ๋ฌธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์ฟผ๋ฆฌ ๊ฒฐ๊ณผ ์ถœ๋ ฅ ํ”„๋กœ์„ธ์Šค๋ฅผ ๋ณ‘๋ ฌํ™”ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  • ์ฟผ๋ฆฌ๋ฅผ ์‹คํ–‰ํ•˜๊ธฐ ์ „์— ํ…Œ์ด๋ธ”์„ DROPํ•˜๋ฉด ์„ฑ๋Šฅ์ด ํฌ๊ฒŒ ํ–ฅ์ƒํ•ฉ๋‹ˆ๋‹ค. ๊ฒฐ๊ณผ ์ถœ๋ ฅ ์„ฑ๋Šฅ์€ SELECT *๋ฅผ ์‹คํ–‰ํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค 5๋ฐฐ ๋น ๋ฆ…๋‹ˆ๋‹ค.

  • DROP TABLE์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ณ  Presto๋Š” JSON ํ…์ŠคํŠธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ฟผ๋ฆฌ ๊ฒฐ๊ณผ๋ฅผ ๊ตฌ์ฒดํ™”ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ฒฐ๊ณผ ํ…Œ์ด๋ธ”์— 100GB์˜ ๋ฐ์ดํ„ฐ๊ฐ€ ์žˆ์œผ๋ฉด ์ฝ”๋””๋„ค์ดํ„ฐ๋Š” ์ฟผ๋ฆฌ ๊ฒฐ๊ณผ๋ฅผ ์ €์žฅํ•˜๊ธฐ ์œ„ํ•ด 100GB ์ด์ƒ์˜ JSON ํ…์ŠคํŠธ๋ฅผ ์ „์†กํ•ฉ๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์งˆ์˜ ๊ณ„์‚ฐ์ด ๊ฑฐ์˜ ๋๋‚˜๋„ JSON ๊ฒฐ๊ณผ๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ๋ฐ ์‹œ๊ฐ„์ด ์˜ค๋ž˜ ๊ฑธ๋ฆฝ๋‹ˆ๋‹ค.

    • 1.์ฟผ๋ฆฌ ๋งจ ์œ„์— DROP TABLE ๋ฌธ์„ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

    • 2.CREATE TABLE(ํ…Œ์ด๋ธ”) AS SELECT ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

    • DROP TABLE IF EXISTS my_result;
      CREATE TABLE my_result AS 
      SELECT * FROM my_table;
  • Presto๋Š” ๊ฐ ์ฟผ๋ฆฌ์˜ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰์„ ์ถ”์ ํ•˜๋Š”๋ฐ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ๋งŽ์ด ์‚ฌ์šฉํ•˜๋Š” ์ž‘์—… ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค.

    • distinct

      • approx_distinct() ๊ฐ™์€ ๊ทผ์‚ฌ ์ง‘๊ณ„ ํ•จ์ˆ˜ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

      • COUNT(DISTINCT x)๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋Œ€์‹  approx_distinct(x)๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋А๋ฆฐ ๋ฉ”๋ชจ๋ฆฌ ์†Œ๋ชจ ์—ฐ์‚ฐ์ž๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ๋ง์•„์•ผ ํ•ฉ๋‹ˆ๋‹ค

    • UNION

      • UNION ๋Œ€์‹ ์— UNION ALL์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

    • ORDER BY

      • ORDER BY์™€ ํ•จ๊ป˜ LIMIT ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

    • GROUP BY (of many columns)

      • GROUP BY ์ ˆ ๋‚ด์˜ ์นด๋””๋„๋ฆฌํ‹ฐ ๊ณ ๋ ค

        • GROUP BY ๋‚ด์˜ ํ•„๋“œ ๋ชฉ๋ก์„ ๋†’์€ ์นด๋””๋„๋ฆฌํ‹ฐ ์ˆœ์„œ๋กœ ์‹ ์ค‘ํ•˜๊ฒŒ ์ •๋ ฌํ•˜์—ฌ GROUP BY ๊ธฐ๋Šฅ์˜ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒ

          • ์ข‹์€ ์ฟผ๋ฆฌ : SELECT GROUP BY uid, gender

          • ๋‚˜์œ ์ฟผ๋ฆฌ : SELECT GROUP BY gender, uid

        • GROUP BY ์—ด์— ๋ฌธ์ž์—ด ๋Œ€์‹  ์ˆซ์ž๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์ˆซ์ž๋Š” ๋ฌธ์ž์—ด๋ณด๋‹ค ์ ์€ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ํ•„์š”๋กœ ํ•˜๊ณ  ๋น„๊ต๊ฐ€ ๋” ๋น ๋ฅด๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.

    • joins

      • ํฐ ํ…Œ์ด๋ธ”๋ถ€ํ„ฐ ์ž‘์€ ํ…Œ์ด๋ธ” ์ˆœ์œผ๋กœ ํ…Œ์ด๋ธ”์„ ์กฐ์ธํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

      • ๋™๋“ฑํ•˜์ง€ ์•Š์€ ์กฐ์ธ ์กฐ๊ฑด์„ ์‚ฌ์šฉํ•˜๋ฉด ์ฟผ๋ฆฌ ์ฒ˜๋ฆฌ ์†๋„๊ฐ€ ๋А๋ ค์ง‘๋‹ˆ๋‹ค.

      • Presto๋Š” ๊ธฐ๋ณธ์ ์œผ๋กœ ๋ธŒ๋กœ๋“œ์บ์ŠคํŠธ ์กฐ์ธ์„ ์ˆ˜ํ–‰ํ•˜์—ฌ ์™ผ์ชฝ ํ…Œ์ด๋ธ”์„ ์—ฌ๋Ÿฌ ์ž‘์—…์ž ๋…ธ๋“œ๋กœ ๋ถ„ํ• ํ•œ ๋‹ค์Œ ์˜ค๋ฅธ์ชฝ ํ…Œ์ด๋ธ”์˜ ์ „์ฒด ๋ณต์‚ฌ๋ณธ์„ ํŒŒํ‹ฐ์…˜์ด ์žˆ๋Š” ์ž‘์—…์ž ๋…ธ๋“œ๋กœ ๋ณด๋ƒ…๋‹ˆ๋‹ค. ์˜ค๋ฅธ์ชฝ ์‚ฌ์ด๋“œ ํ…Œ์ด๋ธ”์ด ํฌ๊ณ  ์ž‘์—…์ž ๋…ธ๋“œ์˜ ๋ฉ”๋ชจ๋ฆฌ์— ๋งž์ง€ ์•Š์œผ๋ฉด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

      • ๋ถ„์‚ฐ ์กฐ์ธ ์‚ฌ์šฉ

        • ์ฟผ๋ฆฌ๊ฐ€ ์—ฌ์ „ํžˆ ์ž‘๋™ํ•˜์ง€ ์•Š์œผ๋ฉด ์„ธ์…˜ ์†์„ฑ์„ ์„ค์ •ํ•˜๋Š” ๋งค์ง ์ฃผ์„์„ ์ถ”๊ฐ€ํ•˜์—ฌ ๋ถ„์‚ฐ ์กฐ์ธ์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

        • DROP TABLE IF EXISTS my_result;
          -- set session join_distribution_type = 'PARTITIONED'
          CREATE TABLE my_result AS
          SELECT ... FROM large_table l, small_table s WHERE l.id = s.id
        • ๋ถ„์‚ฐ ์กฐ์ธ ์•Œ๊ณ ๋ฆฌ์ฆ˜ ์€ ์กฐ์ธ ํ‚ค์˜ ํ•ด์‹œ ๊ฐ’์„ ๋ถ„ํ•  ํ‚ค๋กœ ์‚ฌ์šฉํ•˜์—ฌ ์™ผ์ชฝ ๋ฐ ์˜ค๋ฅธ์ชฝ ํ…Œ์ด๋ธ”์„ ๋ชจ๋‘ ๋ถ„ํ• ํ•ฉ๋‹ˆ๋‹ค. ์˜ค๋ฅธ์ชฝ ์‚ฌ์ด๋“œ ํ…Œ์ด๋ธ”์ด ํฌ๋”๋ผ๋„ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋„คํŠธ์›Œํฌ ๋ฐ์ดํ„ฐ ์ „์†ก ํšŸ์ˆ˜๋ฅผ ๋Š˜๋ฆด ์ˆ˜ ์žˆ์œผ๋ฉฐ ์ผ๋ฐ˜์ ์œผ๋กœ ๋ธŒ๋กœ๋“œ์บ์ŠคํŠธ ์กฐ์ธ๋ณด๋‹ค ๋А๋ฆฝ๋‹ˆ๋‹ค.

  • ์—ด ์ €์žฅ ํŠน์„ฑ

    • ๋„ˆ๋ฌด ๋งŽ์€ ์—ด์„ ์„ ํƒํ•˜๋ฉด ์ฟผ๋ฆฌ ์ฒ˜๋ฆฌ ์†๋„๊ฐ€ ๋А๋ ค์ง‘๋‹ˆ๋‹ค.

  • ์ฟผ๋ฆฌ ๊ฒฐ๊ณผ ํฌ๊ธฐ

    • ๋„ˆ๋ฌด ๋งŽ์€ ํ–‰์„ ์ƒ์„ฑํ•˜๋ ค๋ฉด ์‹œ๊ฐ„์ด ๊ฑธ๋ฆฝ๋‹ˆ๋‹ค. ๋Œ€์‹  CREATE TABLE AS... ๋˜๋Š” INSERT INTO ๋˜๋Š” result_output_redirect ๋งค์ง ์ฝ”๋ฉ˜ํŠธ๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

  • ํ•„์š”ํ•œ ์—ด ์ง€์ •

  • ์‹œ๊ฐ„ ๊ธฐ๋ฐ˜ ํŒŒํ‹ฐ์…”๋‹ ํ™œ์šฉ

  • ์ผ๋ จ์˜ LIKE ์ ˆ์„ ํ•˜๋‚˜์˜ ๋‹จ์ผ regexp_like ์ ˆ๋กœ ์ง‘๊ณ„

Reference

  • https://trino.io/Presto_SQL_on_Everything.pdf

  • https://api-docs.treasuredata.com/en/tools/presto/presto_performance_tuning/

Last updated