🌱 Infra 42

Installing Tableau on Amazon Linux 2

Hello 🙋‍♀️🙋 Today I'm going to install Tableau, one of the best-known and most powerful BI tools, and take a look at its features. This first post covers the installation.
Tableau requires at least a 4-core CPU (equivalent to 8 AWS vCPUs) and 16 GB of RAM. For production, 8 CPU cores (16 AWS vCPUs) and 64 GB of RAM are recommended!
# Instance sizes for development, test, and production environments
c5.4xlarge
m5.4xlarge > I used m5.4xlarge for this installation 🤗
r5.4xlarge
Tableau speed comparison by EC2 instance type (click!)
Now let's install it!
1. Create and connect to an EC2 instance
As for how to create an instance, separately..
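The sizing rule in this excerpt is simple arithmetic, so here is a small sketch of it. It is not from the original post: `Requirement` and `meets` are hypothetical names, and the 2-vCPUs-per-core ratio is only what the numbers above imply (4 cores = 8 vCPUs).

```python
from dataclasses import dataclass

# Ratio implied by the post: 4 cores = 8 AWS vCPUs.
VCPUS_PER_CORE = 2


@dataclass(frozen=True)
class Requirement:
    """A hardware requirement expressed in physical cores and GB of RAM."""
    cores: int
    ram_gb: int

    @property
    def vcpus(self) -> int:
        # Convert physical cores to AWS vCPUs.
        return self.cores * VCPUS_PER_CORE


# Figures quoted in the post.
MINIMUM = Requirement(cores=4, ram_gb=16)
PRODUCTION = Requirement(cores=8, ram_gb=64)


def meets(vcpus: int, ram_gb: int, req: Requirement) -> bool:
    """Return True if an instance with the given vCPUs and RAM satisfies req."""
    return vcpus >= req.vcpus and ram_gb >= req.ram_gb
```

For example, an m5.4xlarge has 16 vCPUs and 64 GiB of memory, so `meets(16, 64, PRODUCTION)` holds.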

[Spark 3.1] Understanding Spark service ports and settings before moving on!

Checking the Spark processes
1) Master Node
On the master node, a process named "Master" is running.
2) Worker 01~03 Node
On the worker nodes, a process named "Worker" is running. (Worker 01~03 are all identical, so I only included the screenshot for Worker01!)
Spark Service Port
Server | Port | Parameter | Config file | Description
Master | 7077 | SPARK_MASTER_PORT | ${spark_home}/conf/spark-env.sh | Spark Master Port
Master | 8080 | SPARK_MASTER_WEBUI_PORT | | Master Web UI
Worker01~03 | 8081 | SPARK_WORKER_WEBUI_PORT | | Worker Web UI
{Random ..
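One quick way to confirm that the ports listed above are really listening is a plain TCP connect. This is just a minimal sketch, not something from the original post: `is_port_open` and `check_spark_ports` are hypothetical helpers, and the host names `master` and `worker01` are placeholders for your own node addresses.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Ports are the ones listed above; host names are placeholders (assumptions).
SPARK_PORTS = {
    "master": [7077, 8080],  # SPARK_MASTER_PORT, SPARK_MASTER_WEBUI_PORT
    "worker01": [8081],      # SPARK_WORKER_WEBUI_PORT
}


def check_spark_ports(ports_by_host=None) -> dict:
    """Map each (host, port) pair to True/False depending on reachability."""
    if ports_by_host is None:
        ports_by_host = SPARK_PORTS
    return {
        (host, port): is_port_open(host, port)
        for host, ports in ports_by_host.items()
        for port in ports
    }
```

Calling `check_spark_ports()` from any machine that can reach the cluster returns a dict you can print or assert on.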

[Hadoop 3.3] Understanding YARN service ports and settings before moving on!

Checking the YARN processes
1) Master Node
The Master node runs the ResourceManager process that YARN needs in order to operate.
2) Worker01~03 Node
The Worker 01~03 nodes all run the NodeManager process.
YARN Service Port
Let's check YARN's service ports and parameters.
Server | Port | Protocol | Parameter | Description
Master | 8088 | http | yarn.resourcemanager.webapp.address | ResourceManager web UI
Master | 8030 | http | yarn.resourcemanager.scheduler.address | Scheduler interface
Master | 8031 | http | yarn.resourcemanager.resource-tracker.address | YA..
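The parameters above live in Hadoop's `<configuration>` XML files, so the configured ports can be read back programmatically. Below is a rough sketch under stated assumptions: the XML fragment is an illustrative sample (not the author's actual yarn-site.xml), the host name `master` is a placeholder, and `read_hadoop_properties` and `port_of` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def port_of(address: str) -> int:
    """Extract the port number from a 'host:port' address string."""
    return int(address.rsplit(":", 1)[1])


# Illustrative fragment only; the host name "master" is an assumption.
SAMPLE_YARN_SITE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
</configuration>
"""
```

With the sample above, `port_of(read_hadoop_properties(SAMPLE_YARN_SITE)["yarn.resourcemanager.webapp.address"])` gives the web UI port.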

[Hadoop 3.3] Understanding HDFS service ports and settings before moving on!

Checking the HDFS processes
1) Master Node
Right now only the NameNode process is running on the Master server. The master node is listening on two ports, 8020 and 9870.
2) Worker01 Node (Secondary NameNode + DataNode)
The Worker01 server is currently running the Secondary NameNode process and the DataNode process.
3) Worker02 Node (DataNode)
The Worker02 server is currently running the DataNode process.
4) Worker03 Node (DataNode)
The Worker03 server is currently running the DataNode process.
HDFS Service Port
Let's check the HDFS service ports and parameters. ..
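The per-node process layout described above can be checked mechanically. This sketch assumes the check is done with the JDK's `jps` tool, whose output is one `<pid> <name>` pair per line; the lowercase node keys and the helper names are my own, not from the original post.

```python
def running_processes(jps_output: str) -> set:
    """Return the Java process names found in `jps` output ('<pid> <name>' lines)."""
    names = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])
    return names


# Expected HDFS daemons per node, as described above.
# The lowercase node keys are illustrative labels (assumptions).
EXPECTED_HDFS = {
    "master": {"NameNode"},
    "worker01": {"SecondaryNameNode", "DataNode"},
    "worker02": {"DataNode"},
    "worker03": {"DataNode"},
}


def missing_processes(node: str, jps_output: str) -> set:
    """Return the expected daemons for node that do not appear in jps_output."""
    return EXPECTED_HDFS[node] - running_processes(jps_output)
```

An empty set from `missing_processes("worker01", output)` means both daemons expected on Worker01 were found.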

[Extra #1] Hadoop HDFS(3.3)+Spark(3.1.1)+JupyterNotebook - Using Scala

In this post, I'll add a Scala kernel to JupyterNotebook and run Scala. I'm going to reuse the infrastructure built earlier, so please work through the previous posts step by step before continuing ^-^
1. Install Scala
I use python3.7, so I installed with pip3! Install spylon-kernel with the commands below and register it as a kernel.
[root@master ~]# pip3 install spylon-kernel
[root@master ~]# python3 -m spylon_kernel install
Use the kernelspec command to check that the kernel was added correctly.
[root@master ~]# jupyter kernelspec list
2. Jupyter Not..
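If you want to script the kernel check instead of reading the list by eye, `jupyter kernelspec list` also accepts `--json`. The sketch below is only an illustration: the helper names are hypothetical, and the kernel name `spylon-kernel` is my assumption about what the install step registers, so it is left as a parameter.

```python
import json
import subprocess


def list_kernels_json() -> str:
    """Run `jupyter kernelspec list --json` and return its stdout."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def installed_kernels(kernelspec_json: str) -> set:
    """Return the kernel names found under the top-level 'kernelspecs' key."""
    data = json.loads(kernelspec_json)
    return set(data.get("kernelspecs", {}))


def has_kernel(kernelspec_json: str, name: str = "spylon-kernel") -> bool:
    """Check whether a kernel with the given name is registered.

    The default name is an assumption; pass the name your install shows.
    """
    return name in installed_kernels(kernelspec_json)
```

On the master node this would be used as `has_kernel(list_kernels_json())`.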

Hadoop HDFS(3.3)+Spark(3.1.1) + JupyterNotebook: Just Follow Along #3

This post continues from the previous one. 😘
Previous post: Hadoop HDFS(3.3)+Spark(3.1.1)! Just Follow Along #2 (1mini2.tistory.com)
All of the infrastructure was set up in the previous posts #1 ~ #2. The HDFS, YARN, and Spark clusters are now running across the four instances. 🎉🎉🎉🎉
In this step, I'll install and run JupyterNotebook. 😘
But before that, let's check that every service is healthy!!
Infra..

Hadoop HDFS(3.3)+Spark(3.1.1)! Just Follow Along #2

This post continues from the previous one. 😘
Previous post: Hadoop HDFS(3.3)+Spark(3.1.1)! Just Follow Along #1 (1mini2.tistory.com)
In the previous post, I created a single EC2 instance, installed all the required software on it, and edited the environment variables and configuration files. I then turned that instance into an AMI image and cloned it to make four instances in total!
In this post, to match each role, Master/..

Hadoop HDFS(3.3)+Spark(3.1.1)! Just Follow Along #1

Hello 😁😁😁😁! In the last post, I installed an older version of Hadoop HDFS (2.0). In this post, I'm going to install 3.3, the latest version of Hadoop HDFS, and install Spark on top of it.
HDFS 3.3 requires Java 1.8 or later. ^.^ (Apache Hadoop 3.3 and upper supports Java 8 and Java 11)
If you work through this post and the ones that follow it, you'll end up with a complete HDFS+YARN+Spark setup, and at the end I plan to make Jupyter Notebook usable as well. 👏🏻
[Libraries to install]
1. Java 1.8
2. HDFS 3.3
3. Scala 2.13.5
4. Spark 3.1.1
Now, let's install everything!
1. EC2 in..
