JUST DO IT!

SQL ํ•™์Šต - GROUP BY, CTAS (TIL 230510) ๋ณธ๋ฌธ

TIL

SQL ํ•™์Šต - GROUP BY, CTAS (TIL 230510)

sunhokimDev 2023. 5. 10. 16:26

๐Ÿ“š KDT WEEK 6 DAY 3 TIL

  • GROUP BY
  • CTAS

 


 

๐ŸŸฅ GROUP BY & Aggregate

ํ…Œ์ด๋ธ”์˜ ๋ ˆ์ฝ”๋“œ๋ฅผ ๊ทธ๋ฃนํ•‘(GROUP BY)ํ•˜์—ฌ ๊ทธ๋ฃน๋ณ„๋กœ ๋‹ค์–‘ํ•œ ์ •๋ณด๋ฅผ ๊ณ„์‚ฐ(Aggregate)ํ•œ๋‹ค

  1. ๊ทธ๋ฃนํ•‘ํ•  ํ•˜๋‚˜ ์ด์ƒ์˜ ํ•„๋“œ๋ฅผ ๊ฒฐ์ •ํ•œ๋‹ค.
  2. ๊ทธ๋ฃน๋ณ„๋กœ ๊ณ„์‚ฐํ•  ๋‚ด์šฉ์„ ๊ฒฐ์ •ํ•˜๊ณ  ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉ
    • COUNT, SUM, AVG, MIN, MAX, LISTAGG...

 

์˜ˆ์‹œ 1)

SELECT LEFT(ts,7) AS mon, COUNT(1) AS session_count
FROM raw_data.session_timestamp
GROUP BY 1 -- GROUP BY mon, GROUP BY LEFT(ts,7)
ORDER BY 1;

raw_data.session_timestamp ํ…Œ์ด๋ธ”์—์„œ ์›”๋ณ„ ์ด ์„ธ์…˜ ์ˆ˜๋ฅผ ๊ตฌํ•˜๋Š” ์ฟผ๋ฆฌ๋ฌธ์ด๋‹ค.

GROUP BY 1 ์„ ์‚ฌ์šฉํ•˜๋ฉด, 1์— ํ•ด๋‹นํ•˜๋Š” mon์ด ๊ทธ๋ฃนํ•‘๋œ๋‹ค.

๋”ฐ๋ผ์„œ ๊ทธ๋ฃนํ•‘๋œ mon๋ณ„๋กœ COUNT(1)์„ ํ†ตํ•ด ์ด ์„ธ์…˜์˜ ์ˆ˜๋ฅผ ๊ตฌํ•  ์ˆ˜ ์žˆ๊ฒŒ๋œ๋‹ค.

 

์˜ˆ์‹œ 2)

SELECT
    channel,
    COUNT(1) AS session_count,
    COUNT(DISTNICT userId) AS user_count
FROM raw_data.user_session_channel
GROUP BY 1	--GROUP BY channel
ORDER BY 2 DESC;	-- ORDER BY session_count DESC

raw_data.user_session_channel ํ…Œ์ด๋ธ”์—์„œ ์ฑ„๋„๋ณ„ ์„ธ์…˜ ์ˆ˜๋ฅผ ๋‚ด๋ฆผ์ฐจ์ˆœ์œผ๋กœ ๊ตฌํ•˜๋Š” ์ฟผ๋ฆฌ๋ฌธ์ด๋‹ค.

 

์˜ˆ์‹œ 3)

SELECT
    TO_CHAR(A.ts, 'YYYY-MM') AS month,
    COUNT(DINTINCT B.userid) AS mau,
FROM raw_data.session_timestamp A
JOIN raw_data.user_session_channel B ON A.sessionid = B.sessionid
GROUP BY 1
ORDER BY 1 DESC;

๋‘ ํ…Œ์ด๋ธ”์„ ์กฐ์ธ(INNER JOIN)ํ•˜์—ฌ ์›”๋ณ„ ์œ ๋‹ˆํฌํ•œ ์‚ฌ์šฉ์ž ์ˆ˜๋ฅผ ๊ตฌํ•˜๋Š” ์ฟผ๋ฆฌ๋ฌธ์ด๋‹ค.

  • TO_CHAR์˜ ๊ฒฝ์šฐ non charํƒ€์ž…์˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ•ด๋‹น ํ˜•ํƒœ(YYYY-MM)์˜ ๋ฌธ์ž์—ด๋กœ ๋ฐ”๊ฟ”์ฃผ๋Š” ์—ญํ• ์„ ํ•œ๋‹ค.
  • LEFT(A.ts, 7) ๊ณผ SUBSTRING(A.ts, 1, 7)์€ ๋ฐ›์€ ์ธ์ž๊ฐ’์— ๋”ฐ๋ผ ๋ฌธ์ž์—ด๋กœ ์ž˜๋ผ์ฃผ๋Š” ์—ญํ• ๋กœ์จ ๋Œ€์‹  ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•˜๋‹ค.
  • timestamp ํƒ€์ž…์„ ์œ ์ง€ํ•œ๋‹ค๋ฉด DATE_TRUNC('month', A.ts)์„ ์‚ฌ์šฉํ•œ๋‹ค.

 


 

๐ŸŸฆ CTAS

์ƒ์„ฑ๊ณผ ๋™์‹œ์— AS SELECT ๋ฌธ์„ ๋ถ™์—ฌ์„œ ๋ฐ์ดํ„ฐ๊นŒ์ง€ ๋™์‹œ์— ๋„ฃ์–ด์ฃผ๋Š” ๋ฐฉ๋ฒ•

์ž์ฃผ JOINํ•˜๋Š” ํ…Œ์ด๋ธ”์ด ์žˆ๋‹ค๋ฉด ์œ ์šฉํ•˜๋‹ค.

 

CREATE TABLE ํ…Œ์ด๋ธ”์ด๋ฆ„ AS SELECT ~

 

ex)

DROP TABLE IF EXISTS adhoc.test_session_summary; -- ์ด๋ฏธ ์กด์žฌํ•˜๋Š” ๊ฑด ์‚ญ์ œ
CREATE TABLE adhoc.test_session_summary AS
SELECT B.* , A.ts
FROM raw_data.session_timestamp A
JOIN raw_data.user_session_channel B ON A.sessionid = B.sessionid;

์ž„์‹œ๋กœ ์‚ฌ์šฉํ•  ํ…Œ์ด๋ธ”์ด๋ฏ€๋กœ raw_data ํด๋”๊ฐ€ ์•„๋‹Œ adhoc ํด๋”์— ์ƒ์„ฑํ•˜๋„๋ก ํ•œ๋‹ค.

AS SELECT ๊ตฌ๋ฌธ ๋ฐ‘์œผ๋กœ ๋‚˜์˜ค๋Š” ํ…Œ์ด๋ธ”์„ test_session_summary ์ด๋ฆ„์œผ๋กœ ์ƒˆ๋กœ์šด ํ…Œ์ด๋ธ” ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค.

 

 


 

์ˆ™์ œ