๐ŸŒ ํ•™์Šต๋…ธํŠธ/๋‚™์„œ์žฅ

[Observability, the New Future of Monitoring #3] Chapter 3. Prometheus, Where Observability Starts

mini_world 2024. 2. 21. 01:35

📌 These are my notes on the book.
They mix in personal thoughts, experience, and some small talk, so they are best suited to readers who want to compare opinions while reading the book.

 

Chapter 3 is about Prometheus.

Prometheus has built a literal ecosystem around making good use of Prometheus itself.
Understanding that ecosystem and its internals is very important, because it lets you propose a root-cause fix when a problem occurs.

 

3.1 Prometheus binary setup

Prometheus started in 2012 and is an independently developed open-source project.
In 2016 it joined the Cloud Native Computing Foundation as the second project after Kubernetes. From that we can guess that Prometheus plays a really important role in cloud native environments and that Kubernetes and Prometheus are closely tied together.

Besides the Prometheus server itself, Prometheus has built up a whole ecosystem.
This section introduces not the server but the Exporter, Operator, and Adapter that belong to that ecosystem.

 

Prometheus basic concepts

The FAQ page defines Prometheus like this. (reference)

Prometheus is an open-source systems monitoring and alerting toolkit with an active ecosystem. It is the only system directly supported by Kubernetes and the de facto standard across the cloud native ecosystem.

In other words, Prometheus is a monitoring and alerting toolkit with an active ecosystem, and it is described as the de facto standard across the cloud native ecosystem. Nice introduction. 😚👍👍

The book puts it this way.

"If the main role of Kubernetes is to run and schedule the containers that hold applications, Prometheus performs a variety of complex roles so that Kubernetes-based applications run smoothly."
Thinking back, whatever open-source project I installed with helm, most of them shipped a Prometheus exporter endpoint ("/metrics"). It really does seem to be the de facto standard in Kubernetes environments.

Before moving on, I copied the features and components from the official documentation. They should help with understanding.

  • Prometheus features
    • a multi-dimensional data model with time series data identified by metric name and key/value pairs
      -> This is explained later.
    • PromQL, a flexible query language to leverage this dimensionality
    • no reliance on distributed storage; single server nodes are autonomous
    • time series collection happens via a pull model over HTTP
    • pushing time series is supported via an intermediary gateway
    • targets are discovered via service discovery or static configuration
    • multiple modes of graphing and dashboarding support
  • Prometheus components
    • The Prometheus ecosystem consists of several components, most of which are optional.
    • Prometheus server (scrapes and stores time series data)
    • client libraries
    • push gateway
    • exporters
    • alertmanager

 

Prometheus main components

With its various tools, Prometheus can monitor almost any type of infrastructure and application and can drive operational automation.
Here are a few of the important pieces that make this possible.

  • Prometheus Operator
    • Automates deploying and managing Prometheus inside a Kubernetes environment.
    • Provides Kubernetes service discovery through the Operator's ServiceMonitor and PodMonitor resources.
    • Watches Services and Pods come and go, and updates the Prometheus configuration file when they change.
  • Prometheus Exporter
    • An Exporter collects metrics from sources that Prometheus cannot scrape directly and converts them into a format Prometheus understands.
    • Ready-made Exporters exist, and APIs and SDKs are also provided for developing custom metrics.
  • Prometheus Adapter
    • The Adapter is mainly used in Kubernetes environments to connect metrics collected by Prometheus to Kubernetes autoscaling (for example the Horizontal Pod Autoscaler, HPA). It acts as an intermediary that exposes Prometheus metrics through the Kubernetes Custom Metrics API.
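
To make the Exporter's job a little more concrete, here is a minimal Python sketch of turning counts read from some other system into the text format that Prometheus scrapes. It is only an illustration: the http_requests_total counter, the sample counts, and the names render_metrics, MetricsHandler, and serve are all made up for this note, and a real exporter would normally be built on an official client library rather than by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical counts an exporter might have read from a system
# that cannot expose Prometheus metrics by itself.
SAMPLE_COUNTS = {("GET", "200"): 12, ("POST", "500"): 3}


def render_metrics(counts):
    """Render the counts in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total number of HTTP requests.",
        "# TYPE http_requests_total counter",
    ]
    for (method, status), value in sorted(counts.items()):
        lines.append(
            f'http_requests_total{{method="{method}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the rendered metrics on /metrics so Prometheus can pull them."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLE_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the toy exporter; call this manually, it blocks forever."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a scrape job at port 8000 (an arbitrary choice) would be enough for Prometheus to pull these values.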

The diagram below is a Prometheus/Thanos architecture. I brought it in because it shows the Operator, Exporter, and Adapter together in one easy picture.

Source: https://clux.dev/imgs/prometheus/ecosystem-miro.webp

 

 

 

3.2 Prometheus time series database

 

3.2.1. Data format

When a Prometheus Exporter serves data on the /metrics path, Prometheus collects it with a pull model and stores it in the time series database.

https://www.slideshare.net/fabxc/prometheus-storage

In this data format, the metric name http_requests_total carries several labels.
Assuming the status label can take values such as "200", "400", and "500" and the method label can take "GET" and "POST", Prometheus creates a separate time series for each label combination.

  1. http_requests_total {status="200", method="GET"}
  2. http_requests_total {status="200", method="POST"}
  3. http_requests_total {status="400", method="GET"}
  4. http_requests_total {status="400", method="POST"}
  5. http_requests_total {status="500", method="GET"}
  6. http_requests_total {status="500", method="POST"}

The important concept here is cardinality.

  • What is cardinality?
    • The total number of unique label combinations for a given metric.
    • For example, if the status and method labels of http_requests_total can take three and two possible values respectively, the maximum cardinality is 6 (3 status values × 2 method values), and Prometheus manages each combination as a separate time series.
  • Why does cardinality matter?
    • It affects the resources needed to manage each time series and the performance of the system.
    • When a metric's dimensions combine in countless ways, Prometheus can suffer a so-called cardinality spike.
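
The arithmetic behind cardinality can be checked with a few lines of Python. This is just a sketch: series_for is a hypothetical helper written for this note, and the user_id label is an invented example of a label with many values.

```python
from itertools import product


def series_for(metric, labels):
    """Return one series identifier per unique label combination."""
    names = sorted(labels)
    combos = product(*(labels[name] for name in names))
    return [
        metric + "{" + ",".join(f'{n}="{v}"' for n, v in zip(names, combo)) + "}"
        for combo in combos
    ]


labels = {"status": ["200", "400", "500"], "method": ["GET", "POST"]}
series = series_for("http_requests_total", labels)
print(len(series))  # prints 6 (3 status values x 2 method values)

# Hypothetical high-cardinality label: one value per user.
labels["user_id"] = [str(i) for i in range(1000)]
print(len(series_for("http_requests_total", labels)))  # prints 6000
```

One extra label with a thousand values multiplies the series count by a thousand, which is the kind of growth behind a cardinality spike.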

 

3.2.2. Data management

Data scraped by Prometheus is stored in a database.
Let's look at how Prometheus stores and manages time series data.

The Prometheus Time Series Database (TSDB) has the following characteristics.

  • Prometheus TSDB characteristics
    • It uses an LRU algorithm, replacing the page that has gone unreferenced the longest.
    • It uses memory paging: data is split into fixed-size pages and loaded into memory.
    • Collected data (called chunks from here on) is organized into blocks and stored on disk.
    • A block consists of many chunks plus data such as an index. The index holds the location of and references to the data, and also makes fast lookups possible.
    • A data set means a group of data, and a data point is an individual value drawn as part of a time series on a dashboard.
  • Prometheus data structure
    • To see where the data is actually stored, check with ps.
# Find where the Prometheus storage lives
/prometheus $ ps
PID   USER     TIME  COMMAND
    1 1000     12:09 /bin/prometheus --storage.tsdb.path=/prometheus --storage.tsdb.retention.time=24h (rest omitted)
   77 1000      0:00 /bin/sh

# Inspect the layout
/prometheus $ tree
.
├── 01HQ4SP85M2HZ1T160MKJ8ETAX
│   ├── chunks
│   │   └── 000001
│   ├── index
│   ├── meta.json
│   └── tombstones
├── 01HQ53DJWQYFHF0R0W81TP1M51
│   ├── chunks
│   │   └── 000001
│   ├── index
│   ├── meta.json
│   └── tombstones
├── 01HQ57N0M547EXEQF95J3H077B       # block
│   ├── chunks                       # chunk files
│   │   └── 000001
│   ├── index                        # label and time index file used for lookups
│   ├── meta.json                    # block metadata
│   └── tombstones                   # deletion markers
├── chunks_head                      # head chunks
│   ├── 000007
│   └── 000008
├── lock
├── queries.active
└── wal                              # WAL files
    ├── 00000004
    ├── 00000005
    ├── 00000006
    ├── 00000007
    └── checkpoint.00000003          # WAL checkpoint used for recovery
        └── 00000000
  • Blocks
    • Prometheus stores time series data in immutable blocks.
    • Each block holds the data for a specific time range. A directory such as 01HQ57N0M547EXEQF95J3H077B is one block.
    • Parts of a block
      • chunks: the files holding the actual time series samples. Chunk files (000001 and so on) store the data points over time in compressed form.
      • index: an index file that makes querying efficient. It indexes metric names, labels, timestamps, and so on, so that a query can find the relevant series quickly.
      • meta.json: the block's metadata, including its time range, version information, and other details. It is human-readable.
      • tombstones: marks deleted time series data. When a delete is requested, the data is not removed immediately; a deletion mark is recorded in this file instead.
  • Chunk head (chunks_head)
    • Temporary space for the time series data currently being written.
    • Prometheus stores new data points here first and moves them into an immutable block after a certain time.
  • WAL (Write-Ahead Logging)
    • wal: WAL stands for Write-Ahead Logging. Every write is logged before the data is moved into a block, which is there to ensure data integrity and to provide a recovery mechanism.
    • checkpoint.00000003: a checkpoint file is a snapshot of the WAL at a specific point in time.
      It allows fast recovery after a restart without scanning the whole WAL. If a failure damages the data held in memory, Prometheus uses the WAL to rebuild it.
  • Other files
    • lock: a lock file used for purposes such as concurrency control
    • queries.active: a file that records the queries currently in progress
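
Since meta.json is human-readable, the block layout above can also be inspected with a short script. The sketch below assumes, as far as I know the format, that meta.json carries a ulid, minTime and maxTime in Unix milliseconds, and a stats.numSeries count; list_blocks is a hypothetical helper, not part of any Prometheus tooling.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def list_blocks(data_dir):
    """Summarize every block directory (one that contains meta.json)."""
    blocks = []
    for meta_path in sorted(Path(data_dir).glob("*/meta.json")):
        meta = json.loads(meta_path.read_text())
        blocks.append({
            "ulid": meta["ulid"],
            # minTime/maxTime are assumed to be Unix timestamps in milliseconds.
            "start": datetime.fromtimestamp(meta["minTime"] / 1000, tz=timezone.utc),
            "end": datetime.fromtimestamp(meta["maxTime"] / 1000, tz=timezone.utc),
            # stats.numSeries is assumed; None is returned if it is absent.
            "series": meta.get("stats", {}).get("numSeries"),
        })
    return blocks
```

Running list_blocks("/prometheus") inside the container would list one entry per block directory and skip chunks_head and wal, which have no meta.json.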

 

3.2.3. Block management

  • Samples
    • A sample is a single time series data point, the core unit of time series data.
    • Each sample consists of a metric value at a point in time and the timestamp of that point.
      • Metric value (float64): the actual measured value. For example, a sample of http_requests_total is the total number of HTTP requests at that point in time.
      • Timestamp: the exact time at which the metric value was recorded.
  • Block creation
    • A time series can be defined as a sequence of numeric data points indexed in time order.
    • Prometheus data points are collected at regular intervals. Drawn as a graph, the x-axis is time and the y-axis is the value (showing change over time).
    • Samples collected in memory are flushed to disk every two hours by default, creating a block.
  • Block merging
    • With many small files, every file needs its own index and has to be searched, so queries get slower. On the other hand, files that are too large are inefficient.
    • It is important to keep the number of files and their size in balance.
    • The main TSDB-related options:
      • --storage.tsdb.min-block-duration :
        Sets the minimum time range of a block that the TSDB creates. Data for that time range is stored as one block.
        This value influences how often Prometheus runs compaction (the process of merging several blocks).
      • --storage.tsdb.max-block-duration :
        Sets the maximum block duration. It defines the largest time range a block may cover when compaction merges several minimum-size blocks into one large block, which helps manage long spans of data efficiently.
        Prometheus stores the data for each min-block-duration as one block, and compaction merges those blocks into larger ones up to max-block-duration.
      • --storage.tsdb.retention.time=24h : how long data is kept. Older data is deleted automatically.
      • --storage.tsdb.path=/prometheus : where the data is stored.
  • Prometheus local storage
    • Prometheus stores data on local storage.
    • At the top level, the storage design combines an index covering every label currently stored with Prometheus's own time series format.
  • Data flow
    • Memory
      • The most recent batch of data is kept in memory, for up to 2 hours by default.
      • Collected data is held in memory as one or more chunks. This speeds up query responses and reduces disk I/O: queries on frequently accessed recent data get faster, and repeated disk writes are avoided.
      • Head chunks
        • A head chunk is the in-memory chunk that is currently active and receiving data.
        • It contains the newest time series data, and new data is continuously appended to it.
        • After a certain time, or once it reaches a certain size, a head chunk is converted into an immutable chunk stored on disk.
      • Evictable chunks (LRU-based)
        • Evictable chunks are chunks that can be removed from memory under an LRU (Least Recently Used) policy.
        • To manage memory usage, Prometheus releases chunks that no query has used for a while.
    • Write-Ahead Logging (WAL)
      • Prometheus uses Write-Ahead Logging (WAL) to prevent data loss.
      • It protects the data held in memory even in unexpected situations such as a system failure.
      • Data is recorded in the log before it is safely stored on disk, so it can be recovered after a restart.
    • Disk
      • Once the configured time (2 hours by default) has passed, the chunks in memory are moved to disk and stored in immutable form.
      • When data needs to be deleted, Prometheus creates a 'tombstone' file as a deletion marker. The data is not deleted immediately; the tombstone marks what compaction will physically remove later.
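
The memory, WAL, and disk flow described above can be mimicked with a toy model. To be clear, this is not how Prometheus implements its head or WAL; ToyHead, its JSON-lines log format, and the single-file "block" are all invented here just to show the order of operations: log first, then apply in memory, replay the log after a crash, and clear it once the data has been flushed.

```python
import json
import os


class ToyHead:
    """Toy in-memory head backed by a write-ahead log (not Prometheus code)."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.samples = []  # in-memory (series, timestamp, value) tuples
        self._replay()

    def _replay(self):
        """Rebuild the in-memory state from the WAL after a restart."""
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path) as wal:
            for line in wal:
                self.samples.append(tuple(json.loads(line)))

    def append(self, series, timestamp, value):
        """Log the write first, then apply it in memory."""
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps([series, timestamp, value]) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        self.samples.append((series, timestamp, value))

    def flush_block(self, block_path):
        """Write the head out as an immutable 'block' and truncate the WAL."""
        with open(block_path, "w") as block:
            json.dump(self.samples, block)
        self.samples = []
        open(self.wal_path, "w").close()
```

Creating a second ToyHead on the same WAL path after a few append calls stands in for a restart: the new instance comes back with the same samples even though the first one never flushed a block.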

Prometheus exposes internal metrics that let you monitor the flush status, the state and size of the WAL and TSDB, the number of metrics being ingested, and so on.
Prometheus does not support clustering, so frequent failures lead directly to lost metrics. Keep this in mind.
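
As a rough sketch of reading one of those internal metrics, the Python below builds an instant query against the Prometheus HTTP API and pulls the value out of the JSON response. The functions instant_query_url and first_value are hypothetical helpers; the server address localhost:9090 is an assumption, and prometheus_tsdb_head_series is used as one example of an internal TSDB metric.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, expr):
    """Build the URL of an instant query against the Prometheus HTTP API."""
    query = urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def first_value(response_text):
    """Return the first sample value of an instant-query response, or None."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload}")
    results = payload["data"]["result"]
    if not results:
        return None
    _timestamp, value = results[0]["value"]
    return float(value)


# Assumes Prometheus is reachable at localhost:9090.
print(instant_query_url("http://localhost:9090", "prometheus_tsdb_head_series"))
```

Fetching that URL (for example with urllib.request.urlopen) and passing the body to first_value would give the current number of series in the head.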

 

3.3 Running Prometheus on Kubernetes

Installing with a helm chart makes the internal structure hard to understand, so the book shows how to download the binary, write the configuration file, and install by hand.

Here I install with the Prometheus Operator (https://github.com/prometheus-operator/kube-prometheus).

# Start a local minikube
minikube start

At this point, check carefully how much CPU and memory minikube was given.
🔥  Creating docker container (CPUs=2, Memory=8100MB) ...

Next, download the source and make a few changes.
Since this runs locally and resources are tight, the replica counts and resource allocations need to be reduced.

# Download the source
git clone https://github.com/prometheus-operator/kube-prometheus.git

# Change directory
cd kube-prometheus

# Edit manifests/alertmanager-alertmanager.yaml
spec:
  replicas: 1
  resources:
    limits:
      cpu: 50m
      memory: 50Mi
    requests:
      cpu: 4m
      memory: 50Mi

# Edit manifests/prometheus-prometheus.yaml
spec:
  replicas: 1

# Edit manifests/prometheusAdapter-deployment.yaml
spec:
  replicas: 1

Then apply the setup manifests first.

kubectl create -f manifests/setup

 

# Output

customresourcedefinition.apiextensions.k8s.io/alertmanagerconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusagents.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/scrapeconfigs.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
namespace/monitoring created

Then create the rest as well.

kubectl create -f manifests

# Output


alertmanager.monitoring.coreos.com/main created
networkpolicy.networking.k8s.io/alertmanager-main created
poddisruptionbudget.policy/alertmanager-main created
prometheusrule.monitoring.coreos.com/alertmanager-main-rules created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager-main created
clusterrole.rbac.authorization.k8s.io/blackbox-exporter created
clusterrolebinding.rbac.authorization.k8s.io/blackbox-exporter created
configmap/blackbox-exporter-configuration created
deployment.apps/blackbox-exporter created
networkpolicy.networking.k8s.io/blackbox-exporter created
service/blackbox-exporter created
serviceaccount/blackbox-exporter created
servicemonitor.monitoring.coreos.com/blackbox-exporter created
secret/grafana-config created
secret/grafana-datasources created
configmap/grafana-dashboard-alertmanager-overview created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-grafana-overview created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-multicluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes-darwin created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
networkpolicy.networking.k8s.io/grafana created
prometheusrule.monitoring.coreos.com/grafana-rules created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
prometheusrule.monitoring.coreos.com/kube-prometheus-rules created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
networkpolicy.networking.k8s.io/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kube-state-metrics-rules created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
prometheusrule.monitoring.coreos.com/kubernetes-monitoring-rules created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
networkpolicy.networking.k8s.io/node-exporter created
prometheusrule.monitoring.coreos.com/node-exporter-rules created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
networkpolicy.networking.k8s.io/prometheus-k8s created
poddisruptionbudget.policy/prometheus-k8s created
prometheus.monitoring.coreos.com/k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-prometheus-rules created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-k8s created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
networkpolicy.networking.k8s.io/prometheus-adapter created
poddisruptionbudget.policy/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
servicemonitor.monitoring.coreos.com/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
networkpolicy.networking.k8s.io/prometheus-operator created
prometheusrule.monitoring.coreos.com/prometheus-operator-rules created
service/prometheus-operator created
serviceaccount/prometheus-operator created
servicemonitor.monitoring.coreos.com/prometheus-operator created

Check the CRDs (for example, kubectl get crd).

Check the installed Services as well (for example, kubectl -n monitoring get svc).

Take a look at the Pods too (for example, kubectl -n monitoring get pods).

Then port-forward and connect to Prometheus.
kubectl -n monitoring port-forward svc/prometheus-k8s 9090

 

3.4 Prometheus Operator

The Prometheus Operator's role is service discovery: it makes it easy to find and track Kubernetes resources that change dynamically.

Prometheus targets are managed in its config file, but maintaining that by hand in a ConfigMap or a separate file is not a good approach.
With the Prometheus Operator, changes to services in the Kubernetes cluster are detected automatically and the Prometheus target list is updated.

How to monitor is defined declaratively through Custom Resource Definitions (CRDs), and ServiceMonitor is one of those CRDs.

  • ServiceMonitor: selects specific Kubernetes Services as monitoring targets and defines the details of how Prometheus scrapes their endpoints.
  • PodMonitor: similar to ServiceMonitor, but targets individual Pods directly instead of Services.
  • PrometheusRule: defines alerting rules. Prometheus evaluates them and sends the resulting alerts to Alertmanager.

Below is an example ServiceMonitor for Node Exporter.

# ServiceMonitor for Node Exporter (sample)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 1.7.0
  name: node-exporter
  namespace: monitoring
spec:
  endpoints:      # list of endpoints to monitor
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: https
    relabelings:   # rules that rewrite the target's labels
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: app.kubernetes.io/name   # label used to derive the job name
  selector:    # label selector
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: node-exporter
      app.kubernetes.io/part-of: kube-prometheus

Changing, removing, and reducing the number of labels helps optimize how time series data is managed.

  • relabel_configs:
    • Applied to targets before the scrape.
    • Used to choose which targets to scrape or to rewrite target labels.
  • metric_relabel_configs:
    • Applied to scraped samples before they are stored.
    • Used to drop metrics that are not needed or to change their labels.

In a ServiceMonitor the corresponding fields are relabelings and metricRelabelings.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 1.7.0
  name: node-exporter
  namespace: monitoring
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    port: https
    # relabelings is the ServiceMonitor counterpart of relabel_configs
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    # added relabeling examples
    - sourceLabels: [__address__]      # source label
      targetLabel: instance_address    # resulting label
    - targetLabel: __address__         # label to overwrite
      replacement: 127.0.0.1:9090      # fixed replacement value
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
    # metricRelabelings is the ServiceMonitor counterpart of metric_relabel_configs
    metricRelabelings:
    - sourceLabels: [job]
      regex: 'node-exporter'
      action: drop                     # drops every sample whose job label is node-exporter
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: node-exporter
      app.kubernetes.io/part-of: kube-prometheus
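
To get a feel for what the replace and drop actions do, here is a small Python model of relabeling. It is a simplified sketch, not Prometheus's implementation: the relabel function and its snake_case rule keys are made up for this note, only two actions are covered, and the defaults (separator ";", regex "(.*)", replacement "$1", fully anchored match) follow what I understand the Prometheus defaults to be.

```python
import re


def relabel(labels, rules):
    """Apply toy 'replace' and 'drop' rules; return None if dropped."""
    labels = dict(labels)
    for rule in rules:
        # Source label values are joined with ';' before matching.
        source = ";".join(labels.get(name, "") for name in rule.get("source_labels", []))
        match = re.fullmatch(rule.get("regex", "(.*)"), source)
        action = rule.get("action", "replace")
        if action == "drop":
            if match:
                return None
        elif action == "replace" and match:
            # Prometheus writes capture groups as $1; Python's re uses \g<1>.
            template = re.sub(r"\$(\d+)", r"\\g<\1>", rule.get("replacement", "$1"))
            labels[rule["target_label"]] = match.expand(template)
    return labels


target = {"__address__": "10.0.0.5:9100", "job": "node-exporter"}
rules = [
    {"source_labels": ["__address__"], "target_label": "instance_address"},
    {"target_label": "__address__", "replacement": "127.0.0.1:9090"},
]
print(relabel(target, rules))
```

The first rule copies __address__ into instance_address and the second overwrites __address__ with a fixed value. Adding a rule such as {"source_labels": ["job"], "regex": "node-exporter", "action": "drop"} makes relabel return None for this input.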

 

 

3.5 Prometheus autoscaling

 

3.5.1 Prometheus Adapter

The Prometheus Adapter connects metrics collected by Prometheus to Kubernetes autoscaling (for example the Horizontal Pod Autoscaler, HPA).

The book covers it in detail, but the Prometheus Adapter is not that widely used and the book itself recommends KEDA, so I only take a quick look and move on.

The Prometheus Adapter behaves as defined in a ConfigMap, which is mounted into the Prometheus Adapter Deployment.
The configuration below shows which Prometheus metric is used and how the HPA refers to it.

# Compute the 2-minute rate of the http_requests_total metric and
# expose it under the metric name http_requests_per_second
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-adapter-config
  namespace: custom-metrics
data:
  config.yaml: |
    rules:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)$"
          as: "http_requests_per_second"
        metricsQuery: 'sum(rate(http_requests_total{namespace="my-ns",job="web-app"}[2m])) by (pod)'

---
# When http_requests_per_second averages more than 100 per Pod,
# increase the number of Pods
# (autoscaling/v2 is the stable API; v2beta2 was removed in Kubernetes 1.26)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: my-ns
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your_deployment_name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 100
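
The metricsQuery in the ConfigMap above hard-codes the namespace and job. The adapter also accepts Go-template placeholders, so that a single rule can serve whichever namespace and pods the HPA asks about. A minimal sketch of the same rule in that form, assuming the same http_requests_total series; only the name and metricsQuery entries differ:

```yaml
rules:
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      # http_requests_total -> http_requests_per_second
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    # <<.Series>>, <<.LabelMatchers>> and <<.GroupBy>> are filled in by the adapter per request
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

With this form the HPA definition stays the same, since it still refers to http_requests_per_second.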

 

3.5.2 KEDA Autoscaling

This is a KEDA ScaledObject definition that works under the same condition as above.
The condition is the same, but there is no separate HPA to configure and no ConfigMap to edit for metric conversion. Creating a single ScaledObject is enough, which makes it very simple.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prometheus-scaledobject
  namespace: my-ns
spec:
  scaleTargetRef:       # target resource to scale
    name: your_deployment_name
  pollingInterval: 30   # polling interval in seconds (default: 30)
  cooldownPeriod:  300  # cooldown before scaling down, in seconds (default: 300)
  minReplicaCount: 1    # minimum number of replicas
  maxReplicaCount: 10   # maximum number of replicas
  triggers:             # trigger conditions
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring
        metricName: http_requests_per_second
        threshold: '100'
        # must return a single element, so no "by (pod)" here;
        # the default AverageValue target divides the total by the replica count
        query: |
          sum(rate(http_requests_total{namespace="my-ns",job="web-app"}[2m]))

 

3.6 Prometheus Alerts

The book describes the Rule Manager and Alertmanager components as if both run as independent services, but in the latest version (2.50) the Rule Manager's role appears to be built into the Prometheus server as the ALERTING RULES feature.

  • Two types of rules can be configured in Prometheus
    • recording_rules
      • Store the result of complex or frequently used queries as new time series, which speeds up later queries and simplifies them.
      • Improve data processing and query response time.
    • alerting_rules
      • When a given condition is met, a notification about the firing alert can be sent to an external service.
      • Defined in the Prometheus query language (PromQL); when the condition becomes true the alert enters the firing state and is sent to Alertmanager.
      • ๊ฒฝ๊ณ  ์กฐ๊ฑด์˜ ํ‰๊ฐ€ ์ž์ฒด๋Š” Prometheus์— ์ €์žฅ๋œ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์ง€๋งŒ, "๊ฒฝ๊ณ ์˜ ์ƒํƒœ"๋‚˜ "๊ฒฝ๊ณ ๊ฐ€ ๋ฐœ์ƒํ–ˆ๋‹ค๋Š” ์‚ฌ์‹ค"์ด Prometheus์˜ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์— ๋ณ„๋„์˜ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ๋กœ ์ €์žฅ๋˜์ง€ ์•Š๋Š”๋‹ค. - ์ฐธ๊ณ 1, ์ฐธ๊ณ 2
      • Prometheus์—์„œ ์•Œ๋žŒ ์ด๋ ฅ(history)๋ฅผ ์ €์žฅํ•˜๊ณ  ์žˆ์ง€ ์•Š๋Š”๊ฒƒ ๊ฐ™์€๋ฐ ์ด ์ฑ…์—์„œ๋Š” ๋ณ„๋„์˜ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ๋กœ ์ €์žฅํ•œ๊ฒƒ์œผ๋กœ ๋‚˜์™€์žˆ๋‹ค(163-164p).  Grafana์—์„œ ์•Œ๋žŒ์„ ๋ฐœ์ƒ์‹œํ‚จ๊ฒฝ์šฐ Grafana์˜ RDB์— Alart์ด๋ ฅ์ด ์ €์žฅ๋˜์–ด์žˆ์—ˆ๋‹ค. (๋งž๋‚˜?)
  • Alert Manager
    • Prometheus ์„œ๋ฒ„๋Š” ๊ฒฝ๊ณ  ๊ทœ์น™์— ๋”ฐ๋ผ ๋ฐœ์ƒํ•œ ๊ฒฝ๊ณ ๋ฅผ Alertmanager์— ์ „์†กํ•˜๊ณ , Alertmanager๋Š” ์ด๋Ÿฌํ•œ ๊ฒฝ๊ณ ๋ฅผ ์ฒ˜๋ฆฌํ•˜์—ฌ ์ตœ์ข… ์‚ฌ์šฉ์ž์—๊ฒŒ ์•Œ๋ฆฐ๋‹ค.
# Prometheus ์•Œ๋žŒ ๊ทœ์น™
#  ํ‰๊ท ์š”์ฒญ ๋Œ€๊ธฐ ์‹œ๊ฐ„์ด 5๋ถ„๋™์•ˆ 0.5์ดˆ๋ณด๋‹ค ํฐ "HighRequestLatency"์•Œ๋žŒ ์ •์˜
groups:
- name: HighRequestLatencyAbove0.5s
  rules:
  - alert: HighRequestLatency
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: 2
      name: HighRequestLatencyAbove0.5s
    annotations:
      summary: High request latency
- name: HighRequestLatencyAbove1s
  rules:
  - alert: HighRequestLatency
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 1
    for: 10m
    labels:
      severity: 1
      name: HighRequestLatencyAbove1s
    annotations:
      summary: High request latency

# Alertmanager routing rules
route:
  receiver: 'devops-team'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: ['job', 'severity']
  routes:
    - match:
        name: 'HighRequestLatencyAbove1s'
        severity: '1'
      receiver: 'devops-team'

# receivers is a top-level key, not a child of route
receivers:
- name: 'devops-team'
  email_configs:
  - to: 'devops@example.com'
    from: 'alertmanager@example.com'
    smarthost: 'smtp.example.com:25'
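
The alert expressions above query job:request_latency_seconds:mean5m, which looks like the output of a recording rule rather than a raw metric. A minimal sketch of such a rule, assuming the application exposes a histogram or summary named request_latency_seconds (the _sum and _count series and the group name are assumptions, not something the book states):

```yaml
groups:
- name: request_latency_recording   # hypothetical group name
  rules:
  - record: job:request_latency_seconds:mean5m
    # mean latency over 5m = growth of total seconds / growth of request count
    expr: |
      sum by (job) (rate(request_latency_seconds_sum[5m]))
        /
      sum by (job) (rate(request_latency_seconds_count[5m]))
```

Recording the mean once keeps both alert rules short and evaluates the expensive part a single time per interval.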

Also, when I set up alerts earlier I wondered what the difference between Prometheus alerts and Grafana alerts is; the book says there is no difference and both provide the same alerting feature. (According to the book, the Grafana server is open source that branched off from Prometheus, and alert rules written for Prometheus work the same way on the Grafana server.)

Things to consider

  • For logs, write recording rules that load the logs in metric form, then write alerting rules on those metrics.
  • For traces, recording rules cannot be written directly, so use span metrics to generate metrics (buckets) and write rules on them.
  • Profiles cannot have recording rules either; export the profiles as metrics and write rules on them.
  • Alerts and notifications have state. Design the lifecycle of creating alerts and sending notifications, and manage that state.

Cautions

  • Using many different technology stacks makes later maintenance hard, so unify the alerting rules on Prometheus.
  • Manage alerts with metrics only wherever possible, and visualize them with a state timeline chart.
  • Tie them to SLO indicators.

 

3.7 Prometheus Sharding Architecture

The basic guidelines for configuring Prometheus are to avoid creating high-cardinality metrics (e.g. a userid label) (see: the blog post on reducing cardinality) and to set up sharding.

By default Prometheus runs as a single instance and stores data on local disk.
This has limits in large-scale environments, which is where sharding and federation come in.

3.7.1 Sharding Architecture

Sharding means splitting the list of scrape targets across two or more Prometheus servers.
Depending on how it is set up, it is either vertical or horizontal sharding.

  • Vertical sharding:
    • Run several Prometheus servers and divide them by some criterion (e.g. per organization, per team).
    • Each server independently collects and manages its own set of metrics.
  • Horizontal sharding: (I did not understand this)
    • Metrics are collected per shard.
    • It means that one Prometheus server has multiple instances.

3.7.2 Federation Architecture

Federation is a structure in which metrics are aggregated and managed across several Prometheus servers.

  • Hierarchical federation
    • An architecture in which several Prometheus servers sit below a higher-level Prometheus server.
    • The higher-level Prometheus scrapes the metrics collected by the lower-level ones again, so the overall picture can be aggregated and monitored at the top level.
  • Same-level federation (cross-service)
    • One Prometheus server is configured to scrape another Prometheus server.
    • This makes it possible to query both datasets from a single server.

 

Thanos and Mimir are implementations of Prometheus clustering built on the sharding and federation techniques described above.
They are designed to solve Prometheus's scalability, long-term data storage, and high-availability problems. Thanos comes right after this, and Mimir will probably be covered in chapter 4.

 

 

3.8 Operating Thanos

 

3.8.1 Operating Thanos

As I keep repeating, a single Prometheus has problems with scalability (it cannot scale horizontally), long-term data storage, and high availability.
These problems can be solved by introducing Thanos.

  • Thanos advantages (= components)
    • Global view: data produced by several Prometheus servers can be queried from a single instance. (Thanos Querier)
    • Downsampling: downsampled data is generated automatically, which makes it practical to query data spanning long periods. (Thanos Compactor)
    • Rules: global alerting rules and recording rules that combine metrics from several Prometheus shards can be created. (Thanos Ruler)
    • Long-term retention: object storage is used to solve the durability, reliability, and scalability problems of storage. (Thanos Store Gateway)

 

Thanos can be operated in two ways, the sidecar approach or the receiver approach, both of which are covered below.
In addition, this blog post seems to explain the difference between the sidecar approach and the receiver approach well.

 

3.8.2 Thanos Sidecar Approach

The Thanos Sidecar approach is an architecture in which a Thanos sidecar container runs alongside each of several Prometheus servers.

┌────────────┬─────────┐         ┌────────────┬─────────┐     ┌─────────┐
│ Prometheus │ Sidecar │   ...   │ Prometheus │ Sidecar │     │   Rule  │
└────────────┴────┬────┘         └────────────┴────┬────┘     └┬────────┘
                  │                                │           │
                Blocks                           Blocks      Blocks
                  │                                │           │
                  v                                v           v
              ┌──────────────────────────────────────────────────┐
              │                   Object Storage                 │
              └──────────────────────────────────────────────────┘
# https://thanos.io/tip/thanos/design.md/

I did not understand the remote read that the book talks about.
Supposedly the sidecar approach uses remote read and the receiver approach uses remote write.

  • Write:
    • The Thanos sidecar works together with a Prometheus instance to handle metric collection, and ships the stored TSDB blocks to object storage.
    • The key component for writes is the sidecar.
    • The sidecar reads from Prometheus's local storage, so no additional local storage (PV) is needed for the TSDB.
    • Blocks are uploaded every 2 hours, which makes historical data durable and queryable through object storage, so the TSDB retention on Prometheus local storage can be shortened.
  • Read:
    • Thanos Store looks up blocks through the Store API and returns the results to Thanos Query.
    • The key components for reads are Thanos Store and the sidecar.
    • When a query arrives, it is sent to both Thanos Store and the sidecar. The sidecar serves data that is still in memory, before a block has been created, and Thanos Store serves data that sits in object storage after the block has been created.

 

https://github.com/thanos-io/thanos/blob/main/docs/quick-tutorial.md

 

3.8.3 Thanos Receiver Approach

An architecture in which a separate Receiver handles the writes.

┌─────────────┐         ┌─────────────┐     ┌─────────┐
│ Prometheus  │   ...   │ Prometheus  │     │  Rule   │
└───────┬─────┘         └───────┬─────┘     └───┬─────┘
        v                       v               v
┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│  Thanos        │   │  Thanos        │   │  Thanos        │
│  Receiver      │   │  Receiver      │   │  Receiver      │
└───────┬────────┘   └──────────┬─────┘   └─────────┬──────┘
        └───────────┬───────────┘                   │
                    │                               │
                    v                               v
              ┌──────────────────────────────────────────────┐
              │                Object Storage                │
              └──────────────────────────────────────────────┘
  • ์“ฐ๊ธฐ
    • ์œ„์˜ ์‚ฌ์ด๋“œ์นด ๋ฐฉ์‹์—์„œ Prometheus TSDB๋ธ”๋ก์ด ๋ฐ”๋กœ ObjectStorage์— ์“ฐ์˜€๋‹ค๋ฉด,
      ๋ฆฌ์‹œ๋ฒ„ ๋ฐฉ์‹์€ Prometheus์—์„œ ์ง€์†์ ์œผ๋กœ ๋ฐ์ดํ„ฐ๋ฅผ Reciver์—๊ฒŒ ๋ณด๋‚ด๊ณ , TSDB๋ธ”๋ก์€ Reciver์— ์˜ํ•ด์„œ ์˜ค๋ธŒ์ ํŠธ ์Šคํ† ๋ฆฌ์ง€์— ์ €์žฅ๋œ๋‹ค.
    • Reciver StatefulSet์ด ๋ฐฐํฌ๋˜์–ด์•ผ ํ•œ๋‹ค. (+PV)
  • ์ฝ๊ธฐ
    • ์ •ํ™•ํ•œ ๋ถ€๋ถ„์€ ๋‚ด์šฉ์ด ์—†์ง€๋งŒ, ์•„ํ‚คํ…์ณ๋‚˜ ๋‹ค๋ฅธ ๋ฌธ์„œ๋ฅผ ํ™•์ธํ•ด๋ดค์„๋•Œ ๋™์ž‘ ๋ฐฉ์‹์€ ๊ฑฐ์˜ ์‚ฌ์ด๋“œ์นด์™€ ๋น„์Šทํ•œ๊ฒƒ ๊ฐ™๋‹ค.
    • ๋‹ค๋งŒ, ๋ฉ”๋ชจ๋ฆฌ์— ์žˆ๋Š” ์ตœ๊ทผ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฒ€์ƒ‰ํ•˜๋Š” ์œ„์น˜๊ฐ€ Thanos Reciver๊ฐ€ ๋œ๋‹ค.

https://github.com/thanos-io/thanos/blob/main/docs/quick-tutorial.md

 

3.8.4 Setting Up Thanos

 

Step.1  Set up minio storage

Download and run minio, which will be used as the object storage.

# Reference: https://min.io/download#/macos

# Download the server
curl --progress-bar -O https://dl.min.io/server/minio/release/darwin-arm64/minio

# Create the directory minio will use
mkdir minio-dir

# Run minio
chmod +x minio
MINIO_ROOT_USER=admin MINIO_ROOT_PASSWORD=password ./minio server ./minio-dir --console-address ":9001"

Once it is running, the console can be opened locally in a browser. (http://127.0.0.1:9001)

Create one bucket to use. (The bucket.yml in Step.5 expects it to be named bucket.)

 

Step.2  Run Prometheus

First, Prometheus needs metrics to scrape, so let's install node-exporter, the simplest option, to produce some.

brew install node_exporter
brew services start node_exporter

# To check after starting: curl http://localhost:9100/metrics
# When the test is over, remove it with: brew services stop node_exporter && brew uninstall node_exporter

Now let's run Prometheus.
It will scrape the metrics of the node_exporter started above.

# Download the binary
wget https://github.com/prometheus/prometheus/releases/download/v2.50.1/prometheus-2.50.1.darwin-arm64.tar.gz

# Extract
tar zxvf prometheus-2.50.1.darwin-arm64.tar.gz

# Create a directory for Prometheus
mkdir prometheus-tsdb

# Add the label settings for thanos to prometheus.yml
cd prometheus-2.50.1.darwin-arm64
vi prometheus.yml
----------------------------------------
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: "test-cluster"
    environment: "local-test"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "local-node-exporter"
    static_configs:
    - targets: ['localhost:9100']
----------------------------------------

# Run
./prometheus --config.file=prometheus.yml \
             --storage.tsdb.path=../prometheus-tsdb \
             --storage.tsdb.min-block-duration=2h \
             --storage.tsdb.max-block-duration=2h

If everything started correctly, you can open http://localhost:9090/.
There you can also check metrics collected from node_exporter, such as node_memory_inactive_bytes.

 

Step.3  Download and build the Thanos binary

To set up Thanos, get the source from https://github.com/thanos-io/thanos/releases and build the binary.
I use an M1 Mac and the released binary would not run, so I built it myself......... 😭😭😭😭😭

# Download and build
# Go 1.21 or later must be installed
wget https://github.com/thanos-io/thanos/archive/refs/tags/v0.34.1.tar.gz
tar zxvf v0.34.1.tar.gz
cd thanos-0.34.1
go mod tidy
make build
# When the build finishes a binary file is produced; note its location, since the following steps use this binary

 

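Step.4 Run the Thanos sidecar

# --tsdb.path must point at the Prometheus TSDB directory created in Step.2.
# bucket.yml is the object storage config written in Step.5; create it before starting the sidecar.
# (A comment placed after a trailing backslash breaks the line continuation, so the notes go up here.)
./thanos sidecar \
         --prometheus.url=http://localhost:9090 \
         --grpc-address=localhost:10901 \
         --http-address=localhost:10902 \
         --tsdb.path ./prometheus-tsdb \
         --objstore.config-file=./bucket.yml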

 

Step.5 Run Thanos store

Run Thanos Store.

# Bucket configuration
cat <<EOF > bucket.yml
type: S3
config:
  bucket: bucket
  access_key: admin
  secret_key: password
  endpoint: 127.0.0.1:9000
  insecure: true
EOF

# Create a temporary directory for store
mkdir thanos-store

# Run
./thanos store \
         --data-dir=./thanos-store \
         --objstore.config-file=./bucket.yml \
         --http-address=localhost:10906 \
         --grpc-address=localhost:10905

 

Step.6 Run Thanos query and verify

# 10901 is the sidecar's gRPC address, 10905 is the store gateway's from Step.5
./thanos query \
         --http-address=0.0.0.0:29090 \
         --grpc-address=localhost:10903 \
         --store=localhost:10901 \
         --store=localhost:10905 \
         --query.replica-label prometheus-1

Once this is running, open http://localhost:29090/.
The Thanos UI comes up.
If everything works, the same queries that were run in the Prometheus UI earlier should work here as well.
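
One note on --query.replica-label: the flag expects the name of an external label that tells HA replicas apart, and the querier deduplicates series that differ only in that label. The prometheus.yml from Step.2 defines only cluster and environment, so nothing is deduplicated in this test. A sketch of how an HA pair might be labeled, assuming a hypothetical label called replica:

```yaml
# prometheus.yml of the first replica (the second one would set replica: "prometheus-2")
global:
  external_labels:
    cluster: "test-cluster"
    environment: "local-test"
    replica: "prometheus-1"   # hypothetical replica label
```

With labels like these, the querier would be started with --query.replica-label=replica instead.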