๋ฐ˜์‘ํ˜•

🔹 Problem

์ œ์กฐ๊ณต์ •์—์„œ์˜ ๊ณ ์žฅ ๋ถ„์„์ด๋‚˜ ํ’ˆ์งˆ ์ด์ƒ ์›์ธ์„ ๋‹จ์ˆœ ๋ฌธ์„œ ๊ฒ€์ƒ‰ ๊ธฐ๋ฐ˜ RAG๋กœ๋Š” ์ •ํ™•ํžˆ ํŒŒ์•…ํ•˜๊ธฐ ์–ด๋ ต๋‹ค.
๊ณต์ •·์„ค๋น„·๋ถ€ํ’ˆ·๊ฒฐํ•จ ๊ฐ„ ๊ด€๊ณ„๋ฅผ ์ดํ•ดํ•˜๊ณ  ์ถ”๋ก ํ•  ์ˆ˜ ์žˆ๋Š” Graph RAG (Knowledge Graph ๊ธฐ๋ฐ˜ RAG) ์‹œ์Šคํ…œ์„ ์„ค๊ณ„ํ•˜์‹œ์˜ค.

๋‹ค์Œ ํ•ญ๋ชฉ์„ ํฌํ•จํ•˜์—ฌ ์„ค๋ช…ํ•˜์‹œ์˜ค.

  1. Graph RAG์˜ ๊ฐœ๋…๊ณผ ๊ธฐ์กด RAG์™€์˜ ์ฐจ์ด
  2. Graph ๊ธฐ๋ฐ˜ ์ง€์‹ ๊ตฌ์กฐ(์—”ํ‹ฐํ‹ฐ·๊ด€๊ณ„·์†์„ฑ) ์„ค๊ณ„
  3. Graph + Vector Hybrid Retrieval ๊ตฌ์กฐ
  4. Graph ๊ธฐ๋ฐ˜ ์ถ”๋ก  ๋ฐ LLM ํ†ตํ•ฉ ๋ฐฉ์‹
  5. ํ’ˆ์งˆ ๊ฒ€์ฆ ๋ฐ Explainability ํ™•๋ณด ๋ฐฉ๋ฒ•

💡 Model Answer (Detailed)

(1) Conceptual Differences

ํ•ญ๋ชฉ์ผ๋ฐ˜ RAGGraph RAG
๊ฒ€์ƒ‰ ๋‹จ์œ„ ๋ฌธ๋‹จ(Chunk) ์—”ํ‹ฐํ‹ฐ(Entity)·๊ด€๊ณ„(Relation)
๊ตฌ์กฐ Vector DB Graph DB (Neo4j/Arango)
์งˆ์˜ ๋ฐฉ์‹ ์œ ์‚ฌ๋„ ๊ฒ€์ƒ‰ ๊ฒฝ๋กœ ํƒ์ƒ‰(Query Path)
์žฅ์  ๋น ๋ฆ„ ๊ด€๊ณ„ ๊ธฐ๋ฐ˜ ๋งฅ๋ฝ ์ดํ•ด/์ถ”๋ก  ๊ฐ€๋Šฅ

(2) ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ

  • Entities: equipment, process step, part, defect, sensor
  • Relations:
    • (ProcessStep)-[USES]->(Equipment)
    • (Equipment)-[CAUSES]->(Defect)
    • (Defect)-[RELATES_TO]->(QualityMetric)
  • Attributes: temperature, pressure, time, production volume, etc.
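This entity/relation/attribute design can be prototyped in memory before committing to a graph database. A minimal sketch with plain Python dicts — the node names, relation labels, and attribute values below are illustrative, not from a real schema:

```python
# Minimal in-memory knowledge graph: nodes carry attributes,
# edges are (source, relation, target) triples.
nodes = {
    "ProcessStep#12": {"type": "ProcessStep", "temperature_c": 220},
    "EquipmentA":     {"type": "Equipment", "pressure_bar": 3.2},
    "Defect:Leak":    {"type": "Defect"},
    "YieldRate":      {"type": "QualityMetric"},
}
edges = [
    ("ProcessStep#12", "USES", "EquipmentA"),
    ("EquipmentA", "CAUSES", "Defect:Leak"),
    ("Defect:Leak", "RELATES_TO", "YieldRate"),
]

def neighbors(node: str) -> list[tuple[str, str]]:
    """Outgoing (relation, target) pairs for a node."""
    return [(rel, dst) for src, rel, dst in edges if src == node]

def find_paths(start: str, end: str, path=None) -> list[list[str]]:
    """Depth-first search for all simple paths from start to end."""
    path = (path or []) + [start]
    if start == end:
        return [path]
    found = []
    for _, nxt in neighbors(start):
        if nxt not in path:  # avoid cycles
            found.extend(find_paths(nxt, end, path))
    return found
```

`find_paths("ProcessStep#12", "YieldRate")` traces the process → equipment → defect → quality-metric chain, which is exactly the evidence path a Graph RAG answer would cite.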

(3) Graph + Vector Hybrid Retrieval

 
[Query]
 ├─ Graph Query (Cypher): traverse related nodes/edges
 ├─ Vector Search (pgvector): semantic text search
 └─ Merge & Re-rank (Cross-Encoder)
  • Graph context → extract supporting passages → insert into the LLM prompt
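The merge-and-re-rank step can be sketched as a score fusion over both channels. In production the final score would come from a cross-encoder; here a weighted sum stands in for it, and the document IDs, scores, and weights are all illustrative:

```python
# Merge graph-traversal hits and vector-search hits, then re-rank.
# A document found by both channels naturally scores higher.

def hybrid_merge(graph_hits: dict[str, float],
                 vector_hits: dict[str, float],
                 w_graph: float = 0.6,
                 w_vector: float = 0.4) -> list[tuple[str, float]]:
    """Union both result sets and sort by the fused score."""
    docs = set(graph_hits) | set(vector_hits)
    scored = {
        d: w_graph * graph_hits.get(d, 0.0) + w_vector * vector_hits.get(d, 0.0)
        for d in docs
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

ranked = hybrid_merge(
    graph_hits={"manual_p12": 0.9, "incident_log_7": 0.7},
    vector_hits={"incident_log_7": 0.8, "spec_sheet_3": 0.6},
)
```

Here `incident_log_7` ranks first because both retrieval channels returned it.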

(4) Reasoning and Integration

  • LLM input: Context = graph path + related document blocks
  • A ReasoningAgent explains the node relationships along the graph path step by step
  • Example: "cumulative defect rate rises → edges touching equipment group (A01, A02) appear 3 times → derive primary cause candidates"
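Assembling the "graph path + document blocks" context is simple string construction; a sketch with illustrative path, relation, and snippet values:

```python
# Turn a graph path into a textual context block for the LLM prompt.

def path_to_context(path: list[str], relations: list[str],
                    snippets: list[str]) -> str:
    """Render 'A -[REL]-> B' hops plus supporting passages as prompt text."""
    hops = " ".join(
        f"({src}) -[{rel}]-> ({dst})"
        for src, rel, dst in zip(path, relations, path[1:])
    )
    docs = "\n".join(f"- {s}" for s in snippets)
    return f"Graph evidence: {hops}\nSupporting passages:\n{docs}"

context = path_to_context(
    path=["ProcessStep#12", "EquipmentA", "Defect:Leak"],
    relations=["USES", "CAUSES"],
    snippets=["Equipment A logged pressure spikes on 3 shifts."],
)
```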

(5) Explainability

| Level | Implementation |
|---|---|
| Model | Graph-path visualization (Neo4j Bloom) |
| LLM | Reasoning-chain logging (CoT output) |
| Report | Insert an evidence-path summary, e.g. "(Process#12 → Equipment A → Defect: Leak)" |
| Evaluation | Path Accuracy ≥ 0.85, Groundedness ≥ 0.9 |
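Path Accuracy in the evaluation row can be computed as the fraction of answers whose cited evidence path exactly matches the gold path. A minimal sketch — the 0.85 threshold comes from the table above; the example paths are illustrative:

```python
def path_accuracy(predicted: list[list[str]], gold: list[list[str]]) -> float:
    """Share of examples where the cited evidence path matches gold exactly."""
    assert len(predicted) == len(gold)
    hits = sum(p == g for p, g in zip(predicted, gold))
    return hits / len(gold)

score = path_accuracy(
    predicted=[["P12", "EqA", "Leak"], ["P3", "EqB", "Crack"]],
    gold=[["P12", "EqA", "Leak"], ["P3", "EqC", "Crack"]],
)
```

Here `score` is 0.5 (one of two paths matches), which would fail the ≥ 0.85 target.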

(6) Operational Example

  • DB: Neo4j + pgvector hybrid queries
  • Orchestration: Airflow / LangGraph
  • Visualization: Grafana + Bloom views
  • Benefit: stronger explainability, reasoning, and domain context than plain retrieval RAG
๋ฐ˜์‘ํ˜•
๋ฐ˜์‘ํ˜•

🔹 Problem

์ œ์กฐ ๋ฐ์ดํ„ฐ ๋ถ„์„ ๋ฐ ๋ณด๊ณ  ์ž๋™ํ™” ์‹œ์Šคํ…œ์—์„œ ๋‹ค์ˆ˜์˜ Agent(DataAgent, RAGAgent, AnalysisAgent, ReportAgent, EvalAgent)๊ฐ€ ํ˜‘๋ ฅํ•˜์—ฌ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•œ๋‹ค.
์ด๋•Œ, ์—์ด์ „ํŠธ ๊ฐ„ ๋ณ‘๋ชฉ์ด๋‚˜ ๋น„ํšจ์œจ์„ ์ค„์ด๊ณ  ์„ฑ๋Šฅ์„ ๊ทน๋Œ€ํ™”ํ•˜๊ธฐ ์œ„ํ•ด LangGraph ๊ธฐ๋ฐ˜ ๋™์  ์˜ค์ผ€์ŠคํŠธ๋ ˆ์ด์…˜ ๊ตฌ์กฐ๋ฅผ ์„ค๊ณ„ํ•˜๋ ค ํ•œ๋‹ค.

๋‹ค์Œ ํ•ญ๋ชฉ์„ ํฌํ•จํ•˜์—ฌ ๊ธฐ์ˆ ํ•˜์‹œ์˜ค.

  1. Basic principles and design concepts of multi-agent collaboration
  2. Parallel and dynamic execution strategies based on LangGraph or Argo Workflows
  3. Design of inter-agent data exchange and state management
  4. Dynamic re-planning and recovery procedures on agent failure
  5. Metric-driven self-tuning for system optimization

💡 Model Answer (Detailed)

(1) Architectural Concept

  • An agentic AI system is split into modular micro-agents → enables parallel execution
  • A SuperAgent (planner) decomposes the task into a DAG; each node is an agent
  • Each agent executes its role and shares its results (artifacts + memory)

(2) LangGraph-Based Design Example

 
[SuperAgent]
 ├─(A) DataAgent → ETL / feature extraction
 ├─(B) RAGAgent → document retrieval/summarization
 ├─(C) AnalysisAgent → prediction/statistics
 │      (depends on A and B, which run in parallel)
 ├─(D) ReportAgent → report generation (LLM)
 └─(E) EvalAgent → quality evaluation and feedback
  • Parallelization: A and B run concurrently; C, D, E then run sequentially
  • Dynamic DAG: the task path is revised when a step fails mid-run
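The parallel-then-sequential pattern can be sketched with `concurrent.futures`; the agent bodies below are toy stand-ins for real LangGraph nodes, and all returned values are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in agents; each returns its contribution to the shared state.
def data_agent():          return {"features": [1, 2, 3]}
def rag_agent():           return {"docs": ["manual_p12"]}
def analysis_agent(state): return {"forecast": sum(state["features"])}
def report_agent(state):   return {"report": f"Forecast {state['forecast']} based on {state['docs']}"}
def eval_agent(state):     return {"passed": "Forecast" in state["report"]}

def run_pipeline() -> dict:
    state = {}
    # A and B have no mutual dependency -> run them concurrently.
    with ThreadPoolExecutor() as pool:
        fa, fb = pool.submit(data_agent), pool.submit(rag_agent)
        state |= fa.result() | fb.result()
    # C, D, E each depend on the previous step -> run sequentially.
    for agent in (analysis_agent, report_agent, eval_agent):
        state |= agent(state)
    return state
```

In real LangGraph code the fan-out/fan-in would be expressed as graph edges rather than an executor, but the dependency structure is the same.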

(3) ์ƒํƒœ ๊ด€๋ฆฌ

๋ฐ์ดํ„ฐ์œ ํ˜•์ €์žฅ์†Œ์šฉ๋„
์ค‘๊ฐ„๊ฒฐ๊ณผ Redis / Mongo Agent๊ฐ„ ๊ตํ™˜
๋ชจ๋ธ๊ฒฐ๊ณผ S3 / Delta Lake Persisted Artifact
์‹คํ–‰๋กœ๊ทธ Loki / Prometheus ์ถ”์  ๋ฐ ๋ณต๊ตฌ
์„ธ์…˜๋ฉ”๋ชจ๋ฆฌ LangGraph MemoryStore ํ”„๋กฌํ”„ํŠธ ์ปจํ…์ŠคํŠธ ์œ ์ง€
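The state categories in the table can be modeled as one typed shared-state object that every agent reads and appends to. A sketch — the field names and the example URI are illustrative; in production the fields would be backed by the stores listed above:

```python
from typing import TypedDict

# Shared state schema; fields mirror the table's data types.
class PipelineState(TypedDict, total=False):
    intermediate: dict    # short-lived exchange (Redis/Mongo in production)
    artifacts: list       # persisted outputs (e.g. S3 / Delta Lake URIs)
    trace: list           # execution-log entries (shipped to Loki)
    memory: list          # session context carried into prompts

def record(state: PipelineState, agent: str, artifact: str) -> PipelineState:
    """Append one agent's output and its trace entry to the shared state."""
    state.setdefault("artifacts", []).append(artifact)
    state.setdefault("trace", []).append(f"{agent}: wrote {artifact}")
    return state

state: PipelineState = {}
record(state, "ReportAgent", "s3://reports/2024-05.pdf")
```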

(4) Failure Recovery (Re-Planning)

  • Reflective loop: detect the failed task → analyze the cause (logs) → plan an alternative path
  • Retry policy: backoff 1 → 2 → 4 → 8 s (up to 3 retries)
  • Fallback agent: a standby with the same capability (e.g., OpenSearch → BM25 backup)
  • SuperAgent decisions: accumulate failure patterns → refine its own rules
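The retry policy above can be sketched directly; the sleep function is injected so the backoff schedule can be exercised without actually waiting, and the flaky task is a toy stand-in:

```python
import time

def retry_with_backoff(task, max_retries: int = 3, base: float = 1.0,
                       sleep=time.sleep):
    """Run task; on failure wait base * 2**attempt seconds (1, 2, 4, ...)."""
    delays = []  # recorded for observability
    for attempt in range(max_retries + 1):
        try:
            return task(), delays
        except Exception:
            if attempt == max_retries:
                raise  # exhausted: hand off to the fallback agent
            delay = base * 2 ** attempt
            delays.append(delay)
            sleep(delay)

# Toy task that succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result, delays = retry_with_backoff(flaky, sleep=lambda s: None)
```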

(5) Self-Tuning

  • KPIs: latency, cost per report, groundedness, task success
  • Based on logs collected every 10 minutes → when a threshold is exceeded:
    • compress the prompt
    • adjust the RAG k value
    • switch the model tier (7B ↔ 13B)
  • Correct the planner's decisions with reinforcement feedback
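The threshold-to-action mapping can be expressed as a small decision function; all thresholds and action names here are illustrative:

```python
# Map KPI breaches to the adjustments listed above.

def tuning_actions(kpis: dict) -> list:
    """Return the list of tuning actions triggered by the current KPIs."""
    actions = []
    if kpis.get("latency_s", 0.0) > 5.0:
        actions.append("compress_prompt")
    if kpis.get("groundedness", 1.0) < 0.9:
        actions.append("increase_rag_k")
    if kpis.get("cost_per_report", 0.0) > 0.50:
        actions.append("downgrade_model_tier")  # e.g. 13B -> 7B
    return actions

acts = tuning_actions({"latency_s": 7.2, "groundedness": 0.95,
                       "cost_per_report": 0.30})
```

A planner would run this on every 10-minute log window and apply the returned actions.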

 

๋ฐ˜์‘ํ˜•
๋ฐ˜์‘ํ˜•

🔹 Problem

Agentic AI ์‹œ์Šคํ…œ์ด ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ(์ƒ์‚ฐ์ •๋ณด, ์„ค๋น„๋กœ๊ทธ, ๋งค๋‰ด์–ผ, ์‚ฌ์šฉ์ž ์งˆ์˜)๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š” ๊ณผ์ •์—์„œ
๋ณด์•ˆ ๋ฐ ๊ทœ์ œ ์ค€์ˆ˜๋ฅผ ์œ„ํ•ด ์ •์ฑ… ๊ธฐ๋ฐ˜ ์ ‘๊ทผ์ œ์–ด(Policy-as-Code) ์™€ ๊ฐ์‚ฌ ์ถ”์ (Audit Trail) ์ด ํ•„์š”ํ•˜๋‹ค.

๋‹ค์Œ ํ•ญ๋ชฉ์„ ํฌํ•จํ•˜์—ฌ ์„ค๊ณ„ํ•˜์‹œ์˜ค:

  1. Why security/policy management is needed and its scope
  2. Permission/policy management architecture using RBAC, ABAC, and OPA
  3. Policy-violation detection and response procedures (including self-healing)
  4. Log auditing and tracing (audit trail) design
  5. Strategies to prevent responses that violate privacy, safety, or regulations

💡 Model Answer (Detailed)

(1) Necessity

  • ์ œ์กฐ ๋ฐ์ดํ„ฐ์—๋Š” PII(๊ฐœ์ธ์ •๋ณด), ์„ค๋น„ ๋‚ด๋ถ€ ๊ทœ๊ฒฉ, ์•ˆ์ „ ๊ด€๋ จ ๋งค๋‰ด์–ผ ํฌํ•จ → ์ ‘๊ทผ์ œ์–ด ํ•„์ˆ˜.
  • LLM ๊ธฐ๋ฐ˜ ์‹œ์Šคํ…œ์€ ๋น„์ธ๊ฐ€ ์งˆ์˜๋‚˜ ๊ทœ์ • ์œ„๋ฐ˜ ์‘๋‹ต์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Œ → ์ •์ฑ… ๊ธฐ๋ฐ˜ ์ œ์–ด ํ•„์š”.

(2) Architecture Design

[User Request]
   ↓
[Keycloak] → issue RBAC/ABAC tokens
   ↓
[OPA Policy Engine] → policy validation (policies written in Rego)
   ↓
[LangGraph Agent Flow]
   ↓
[Audit Logger (Loki/Elastic)]
  • RBAC: access by user role (e.g., Engineer, QA, Admin)
  • ABAC: conditional access by attributes (line, equipment, document classification)
  • OPA (Policy-as-Code): validate queries/outputs with Rego rules
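In deployment the decision would be a Rego rule evaluated by OPA; the same logic can be mirrored as a small attribute check for illustration. The attribute names (`role`, `line`, `doc_level`) and the rule itself are assumptions, not a real policy:

```python
# Attribute-based access check mirroring the RBAC/ABAC layers above.

def is_allowed(user: dict, resource: dict) -> bool:
    """Admin reads everything; Engineer/QA read up to 'internal' docs
    on their own production line."""
    if user["role"] == "Admin":
        return True
    same_line = user["line"] == resource["line"]
    level_ok = resource["doc_level"] in {"public", "internal"}
    return user["role"] in {"Engineer", "QA"} and same_line and level_ok

engineer = {"role": "Engineer", "line": "L1"}
```

The equivalent Rego rule would live in the OPA engine and be queried per request rather than compiled into the application.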

(3) Responding to Policy Violations

์ƒํ™ฉ๋Œ€์‘
๋ฏผ๊ฐ์ •๋ณด ํƒ์ง€ PII Masking / ์ž๋™ ์ฐจ๋‹จ
์•ˆ์ „๋ฌธ์„œ ๋ฏธ์ธ์šฉ SafePrompt + HITL ์š”์ฒญ
ํ—ˆ์šฉ ๋ฒ”์œ„ ์ดˆ๊ณผ ์‘๋‹ต ReportAgent ์žฌ์ƒ์„ฑ or ์ค‘๋‹จ
์ง€์† ์œ„๋ฐ˜ ๋ฐœ์ƒ Agent Offload + ๊ด€๋ฆฌ์ž ์Šน์ธ ํ•„์š”

Self-Healing Example

 
if PolicyViolation: trigger(RePrompt with PolicyGuide) else: continue
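That one-liner expands into a small loop: re-generate with the policy guide attached until the output passes or attempts run out, then escalate to a human. The generator, checker, and guide text below are all illustrative stand-ins:

```python
def self_heal(generate, violates_policy, policy_guide: str,
              prompt: str, max_attempts: int = 3) -> str:
    """Regenerate with the policy guide prepended until the output complies."""
    attempt_prompt = prompt
    for _ in range(max_attempts):
        output = generate(attempt_prompt)
        if not violates_policy(output):
            return output
        attempt_prompt = f"{policy_guide}\n{prompt}"  # RePrompt with PolicyGuide
    raise RuntimeError("escalate to human review")  # HITL fallback

# Toy generator: complies only once the guide is present in the prompt.
gen = lambda p: "cites reg" if "POLICY" in p else "no citation"
healed = self_heal(gen, lambda o: "reg" not in o,
                   "POLICY: cite safety regs", "summarize incident")
```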

(4) Audit Trail

  • Log fields: trace_id, user_id, policy_rule, agent_id, timestamp, decision
  • Storage: Loki/Elastic + Grafana dashboards
  • Reporting: automated monthly audit summary (violation counts, response corrections, etc.)
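One audit-trail record with exactly the fields listed above can be emitted as a JSON line; the example IDs and rule name are illustrative:

```python
import json, time, uuid

def audit_entry(user_id: str, agent_id: str, policy_rule: str,
                decision: str) -> str:
    """Serialize one audit-trail record as a JSON line for Loki/Elastic."""
    return json.dumps({
        "trace_id": str(uuid.uuid4()),
        "user_id": user_id,
        "policy_rule": policy_rule,
        "agent_id": agent_id,
        "timestamp": time.time(),
        "decision": decision,
    })

entry = json.loads(audit_entry("u42", "ReportAgent", "pii.mask", "allow"))
```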

(5) Ethics and Safety Response

  • SafePrompt: "Verify that no required safety-regulation citation is missing."
  • PII Scrubber: filters personal-data strings (phone numbers, names, etc.)
  • Bias Detector: requests automatic correction when a response is unfair
  • Human Override: responses below a confidence threshold always require human approval
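A minimal PII scrubber for the phone-number case can be regex-based; the patterns below are illustrative (production scrubbers combine regexes with NER for names):

```python
import re

# Mask phone-number-like strings and email addresses before output.
PHONE = re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def scrub(text: str) -> str:
    """Replace matched PII spans with placeholder tokens."""
    text = PHONE.sub("[PHONE]", text)
    return EMAIL.sub("[EMAIL]", text)

clean = scrub("Operator Kim, 010-1234-5678, kim@example.com reported the leak.")
```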
๋ฐ˜์‘ํ˜•
๋ฐ˜์‘ํ˜•

🔹 Problem

์ œ์กฐ ํ˜„์žฅ์˜ ๊ณต์ • ์กฐ๊ฑด์ด ๊ณ„์ ˆ·์„ค๋น„·์ž์žฌ ๋ณ€ํ™”์— ๋”ฐ๋ผ ์ฃผ๊ธฐ์ ์œผ๋กœ ๋ณ€๋™๋œ๋‹ค.
์ด์— ๋Œ€์‘ํ•˜์—ฌ AI ๋ถ„์„ ์‹œ์Šคํ…œ์ด ์Šค์Šค๋กœ ์ ์‘ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ง€์†ํ•™์Šต(Lifelong Learning) ๊ตฌ์กฐ๋ฅผ Agentic AI ์‹œ์Šคํ…œ ๋‚ด์— ํ†ตํ•ฉํ•˜๋ ค ํ•œ๋‹ค.

๋‹ค์Œ ํ•ญ๋ชฉ์„ ํฌํ•จํ•˜์—ฌ ์„ค๊ณ„ ๋ฐฉ์•ˆ์„ ์ œ์‹œํ•˜์‹œ์˜ค:

  1. ์ง€์†ํ•™์Šต์ด ํ•„์š”ํ•œ ์ด์œ ์™€ ์ฃผ์š” ๊ตฌ์„ฑ ์š”์†Œ
  2. ๋ฐ์ดํ„ฐ ๋“œ๋ฆฌํ”„ํŠธ(Data Drift) ํƒ์ง€ ๋ฐ ์žฌํ•™์Šต ํŒŒ์ดํ”„๋ผ์ธ ์„ค๊ณ„
  3. Agent ๊ฐ„ ์—ฐ๊ณ„ ๊ตฌ์กฐ (DataAgent, TrainerAgent, EvalAgent ๋“ฑ)
  4. Catastrophic Forgetting ๋ฌธ์ œ์˜ ๋ฐฉ์ง€ ๋ฐฉ์•ˆ
  5. ์ง€์†ํ•™์Šต ์„ฑ๋Šฅ ํ‰๊ฐ€ ์ง€ํ‘œ์™€ ์šด์˜ ์ •์ฑ…

💡 Model Answer (Detailed)

(1) Necessity

  • ์ œ์กฐ๊ณต์ •์€ ์‹œ์ฆŒ·์„ค๋น„ ๊ต์ฒด·์›์ž์žฌ ๋ฐฐ์น˜ ๋“ฑ์— ๋”ฐ๋ผ ๋ฐ์ดํ„ฐ ๋ถ„ํฌ๊ฐ€ ์ง€์†์ ์œผ๋กœ ๋ณ€ํ•จ → ์ •์  ๋ชจ๋ธ์€ ์˜ค๋ž˜ ์œ ์ง€ ๋ถˆ๊ฐ€.
  • Agentic ๊ตฌ์กฐ์—์„œ ์ง€์†ํ•™์Šต ํŒŒ์ดํ”„๋ผ์ธ์„ ํ†ตํ•ด ๋ชจ๋ธ, RAG, Prompt, Policy ๋ชจ๋‘ ์ฃผ๊ธฐ์  ๊ฐฑ์‹  ํ•„์š”.

(2) ํŒŒ์ดํ”„๋ผ์ธ ์„ค๊ณ„

 
[DataAgent] ──> [Drift Detector] (Evidently, PSI > 0.2)
       ↓
[TrainerAgent] (incremental fine-tune or LoRA update)
       ↓
[EvalAgent] (AutoEval harness: Groundedness / Task Success)
       ↓
[Model Registry] (MLflow → promote/rollback)
  • Automatic trigger conditions: drift detected or Task Success < 0.85
  • Retraining approach: LoRA/EWC-based incremental learning

(3) Agent Roles

| Agent | Function |
|---|---|
| DataAgent | Drift detection and sampling of retraining data |
| TrainerAgent | Runs incremental training, weighting the importance of existing weights |
| EvalAgent | Quality measurement (Groundedness, F1, Task Success) |
| PolicyAgent | Manages thresholds and enforces approval policies |

(4) Preventing Catastrophic Forgetting

  • EWC (Elastic Weight Consolidation)
  • Replay buffer: retain a portion of past data
  • Knowledge distillation: use the old model's outputs as a training signal for the new model
  • Multi-task regularization
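The replay-buffer idea can be sketched concretely: keep a bounded sample of past data and mix a fixed share of it into every retraining batch. The capacity, ratio, and toy samples below are illustrative:

```python
import random

class ReplayBuffer:
    """Bounded store of past samples, mixed into new training batches."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.items = []
        self.rng = random.Random(seed)

    def add(self, item) -> None:
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:  # full: overwrite a random slot
            self.items[self.rng.randrange(self.capacity)] = item

    def mix(self, fresh: list, replay_ratio: float = 0.3) -> list:
        """Batch = fresh samples + replayed past samples."""
        k = min(len(self.items), int(len(fresh) * replay_ratio))
        return fresh + self.rng.sample(self.items, k)

buf = ReplayBuffer(capacity=100)
for i in range(50):
    buf.add(("old", i))
batch = buf.mix([("new", i) for i in range(10)])  # 10 fresh + 3 replayed
```

Training on such mixed batches keeps gradients flowing through old-task examples, which is what limits forgetting.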

(5) Evaluation Metrics

ํ•ญ๋ชฉ์ •์˜๋ชฉํ‘œ
Drift Score PSI/KL ๊ธฐ๋ฐ˜ ๋ถ„ํฌ ์ฐจ์ด < 0.2
Retention Score ๊ธฐ์กด Task ์œ ์ง€์œจ ≥ 0.9
Task Success ์ตœ์‹  ๋ฐ์ดํ„ฐ ๋Œ€์‘์œจ ≥ 0.85
AutoEval Pass Rate ์ž๋™ ํ‰๊ฐ€ ํ†ต๊ณผ์œจ ≥ 90%

(6) Operational Policy

  • Combine scheduled retraining (e.g., monthly) with event-triggered retraining
  • On retraining failure: roll back + human review
  • Store model metadata (data window, drift cause) in MLflow
๋ฐ˜์‘ํ˜•
