site-intelligence-agent

Site Intelligence Agent

**Site Intelligence Agent

April 2026**

What This Is

A RAG-based assistant for frontline field workers (HVAC technicians, drone inspectors, field engineers) that retrieves relevant documentation, past job history, and compliance requirements in real time during a job.

The system spans two domains — HVAC (Phase 1, complete) and drone inspection (Phase 2) — handled by a dual-domain SiteIntelligenceAgent with a shared pipeline and domain-aware classifier, session memory, and spatial zone filtering.

The core design question: what happens when the retrieved context is incomplete or contradictory? Most RAG systems stop at “the model found something.” This system is designed for when it does not.

Three Routing Paths (Graceful Degradation)

Confidence	Trigger	Response
HIGH	Top result similarity > 0.75, no conflict	Answer with source citations
PARTIAL	Conflicting sources, or similarity 0.50–0.75	Answer with explicit conflict flag and both sources
LOW	Similarity < 0.50, or no results	Escalation message with what was and was not found

Streamlit Walkthrough App

A 6-page persona-driven walkthrough ships with the project (streamlit run Home.py):

Page	What it does
Home	Cinematic scenario intro with CTA
1 — Ask the Agent	Interactive demo of all three confidence paths (HIGH / PARTIAL / LOW)
2 — View the Site	Dual-domain site overview
3 — Inspect a Zone	Drone agent with Zone C pre-loaded, session memory and pipeline trace
4 — See the Proof	Eval metrics dashboard
5 — Find the Gaps	“Honesty Report” — coverage heatmap (green = answered, red = escalated)
6 — Connect the Dots	Memory graph finale and summary card

Setup

# Install dependencies
pip install -r requirements.txt

# Add your API key
cp .env.example .env
# Edit .env and add ANTHROPIC_API_KEY

# Add documents to data/raw/ (see Data Sources below)

# Ingest HVAC collections (Phase 1)
python src/ingest.py

# Ingest drone inspection collections (Phase 2)
python src/ingest.py --domain drone

# Ingest both domains
python src/ingest.py --domain all

# Generate synthetic HVAC job history
python src/generate_synthetic.py

# Generate synthetic drone inspection data
python src/generate_drone_data.py

# Launch Streamlit walkthrough app
streamlit run Home.py

# Run CLI demo with three preset queries (HIGH / PARTIAL / LOW)
python demo/demo.py

# Run evaluation suite
python eval/run_eval.py

Data Sources

HVAC domain:

Collection	What	Where to get it
`osha`	29 CFR 1910.147 (Lockout/Tagout), 29 CFR 1910.303 (Electrical Safety)	osha.gov (public domain)
`manuals`	HVAC equipment manuals (Carrier 48LC 2017/2023, Lennox SL280, Trane XR15)	Manufacturer sites (public)
`job_history`	Synthetic job records	Run `generate_synthetic.py`

Drone inspection domain:

Collection	What	Where to get it
`inspection_records`	Synthetic drone inspection records	Run `generate_drone_data.py`
`historical_baselines`	Historical baseline metrics per zone	Run `generate_drone_data.py`
`compliance_docs`	OSHA 1926.452 scaffold safety	osha.gov (public domain)

Project Structure

src/
  assistant.py           # FieldServiceAssistant (HVAC) + SiteIntelligenceAgent (dual-domain)
  classifier.py          # Rule-based + LLM fallback query classification
  session_memory.py      # Zone/equipment/time entity tracking across turns
  retriever.py           # Vector retrieval, spatial filter, conflict detection
  confidence.py          # Confidence scoring + routing decision
  degradation.py         # Graceful degradation (HIGH / PARTIAL / LOW)
  llm.py                 # LLM wrapper (Anthropic / OpenAI / Gemini)
  ingest.py              # Document loading + Chroma indexing (HVAC + Drone)
  generate_synthetic.py  # Synthetic HVAC job history
  generate_drone_data.py # Synthetic drone inspection data

pages/
  1_Ask_the_Agent.py     # Three-path confidence demo
  2_View_the_Site.py     # Site overview
  3_Inspect_a_Zone.py    # Drone agent + session memory trace
  4_See_the_Proof.py     # Eval metrics dashboard
  5_Find_the_Gaps.py     # Honesty report / coverage heatmap
  6_Connect_the_Dots.py  # Memory graph finale

eval/
  ground_truth.json      # 50 HVAC query-answer pairs
  adversarial.json       # 20 out-of-corpus queries
  contradictions.json    # 15 conflict scenarios
  run_eval.py            # Evaluation runner

demo/
  demo.py                # CLI demo for live presentation

Troubleshooting

Duplicate chunks in Chroma (if ingest was run more than once):

rm -rf data/chroma_db/
python src/ingest.py --domain all

Never run ingest.py twice without clearing data/chroma_db/ first — each run appends to existing collections.

Switching LLM providers — set LLM_PROVIDER in .env (anthropic / openai / gemini). Current working model: gemini-2.5-flash.

Tuning confidence thresholds without code changes:

CONFIDENCE_HIGH_THRESHOLD=0.80 CONFIDENCE_PARTIAL_THRESHOLD=0.55 python demo/demo.py

References

Lewis et al. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.” NeurIPS 2020. https://arxiv.org/abs/2005.11401
Gao et al. “RAG for Large Language Models: A Survey.” arXiv 2023. https://arxiv.org/abs/2312.10997
OSHA Field Operations Manual. U.S. DOL, 2024. https://www.osha.gov/enforcement/directives/cpl-02-00-164
Anthropic. Claude Model Card. 2024. https://www.anthropic.com/claude