HVAC Rescue

Python · FastAPI · SQLite · Claude Agent SDK · React · Vite · Vercel · 2026

An agentic system for commercial HVAC contractors that autonomously scans a $6.4B project portfolio, flags projects bleeding margin, and tells a CFO what to do about it. Built in three days at the NYU DSC × Pulse Foundry AI NYC Datathon 2026 with Ashley Ying, Helen Li, and Nicole Zhang as Team Vibe 101.

The problem

Commercial HVAC contractors systematically lose money between bidding a project and finishing it. A typical $50M/year contractor bids at 15.2% margin and realizes 6.8%. That 8.4-point gap (about $4.2M a year, taking the $50M as revenue) is not bad luck; it is a structural pattern that repeats every quarter across healthcare, data centers, K-12, and multifamily builds. By the time a PM notices the margin is gone, the crew has been on site for weeks, change orders are pending, and billing is 60 days behind earned value. There's no runway left to recover.

The hackathon handed us the problem at scale: 405 projects, $6.4B total portfolio, 1.46M records spanning 2018–2024. The ask was explicit — build an agent, not a dashboard. Something that acts on the data instead of waiting for a human to squint at it.

What we built

HVAC Rescue is a two-tier AI system wrapped in a snap-scroll dashboard.

Tier 1 — Autonomous Portfolio Scan (pre-generated). On server startup, the backend loads 1.46M records into SQLite with aggregation tables (the 1.2M labor log rows collapse to ~6K per-SOV summaries), computes realized margin / variance / billing gap / budget coverage for every project using the four hackathon formulas, classifies each as critical / warning / healthy, and calls Claude Haiku 4.5 to generate one portfolio-level executive briefing plus 405 structured project assessments. Everything is cached in a SQLite ai_cache table before a user ever loads the page — first visit is instant, no streaming flicker.
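
The classify-then-cache step of Tier 1 can be sketched roughly as below. This is a minimal illustration, not the project's code: the margin definition, the two thresholds, the ai_cache column layout, and the summarize callback (a stand-in for the Claude Haiku call) are all assumptions, since the write-up does not spell out the four hackathon formulas or the table schema.

```python
import json
import sqlite3
from typing import Callable, Optional

# Hypothetical thresholds; the source does not state the real cut-offs.
CRITICAL_BELOW = 0.0
WARNING_BELOW = 0.10


def realized_margin(revenue: float, cost: float) -> float:
    """Assumed definition: (revenue - cost) / revenue."""
    return (revenue - cost) / revenue if revenue else 0.0


def classify(margin: float) -> str:
    """Bucket a project by realized margin."""
    if margin < CRITICAL_BELOW:
        return "critical"
    if margin < WARNING_BELOW:
        return "warning"
    return "healthy"


def pregenerate(conn: sqlite3.Connection, projects: list,
                summarize: Callable[[dict], str]) -> None:
    """Compute, classify, summarize, and cache every project once at startup."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ai_cache "
        "(key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    for p in projects:
        margin = realized_margin(p["revenue"], p["cost"])
        assessment = {
            "project_id": p["id"],
            "realized_margin": round(margin, 4),
            "status": classify(margin),
        }
        # summarize() stands in for the LLM call; it runs once, not per page load.
        assessment["summary"] = summarize(assessment)
        conn.execute(
            "INSERT OR REPLACE INTO ai_cache (key, payload) VALUES (?, ?)",
            (f"project:{p['id']}", json.dumps(assessment)),
        )
    conn.commit()


def cached_assessment(conn: sqlite3.Connection, project_id: str) -> Optional[dict]:
    """Serve a page load from the cache without touching the model."""
    row = conn.execute(
        "SELECT payload FROM ai_cache WHERE key = ?", (f"project:{project_id}",)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because the model is only reached through the summarize callback, the same function can be exercised with a stub in tests and with a real client at startup.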

Tier 2 — Deep Dive Agent (live, on demand). When a CFO clicks "Deep Dive" on a flagged project, we spin up a live Claude agent with three tools: query_database (read-only SQL over the full 1.46M rows), web_search (Tavily, for labor rates / refrigerant prices / tariff news), and calculate (variance arithmetic). The agent runs up to 10 turns, cross-referencing labor logs, field notes, change orders, billing, and RFIs, and produces a markdown report with an executive summary, root-cause analysis, dollar-quantified recovery actions, and a forward projection — "if current trajectory continues, this project finishes at approximately $X with a realized margin of Y%." Every tool call streams to the frontend as a chat-style dialogue with a rolling 5-call scroll window.
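
Two of the three tools named above, query_database and calculate, could look something like the following. This is a sketch under assumptions rather than the project's implementation: the row cap, the SELECT/WITH prefix check, the use of SQLite's query_only pragma, and the AST-based arithmetic evaluator are illustrative choices, and the web_search tool and the agent loop itself are left out.

```python
import ast
import operator
import sqlite3

MAX_ROWS = 200  # hypothetical cap so one query cannot flood the agent's context


def query_database(conn: sqlite3.Connection, sql: str) -> list:
    """Run a single read-only statement and return rows as dicts."""
    if not sql.lstrip().lower().startswith(("select", "with")):
        raise ValueError("only SELECT / WITH queries are allowed")
    # Defence in depth: SQLite itself refuses writes on this connection,
    # which also covers forms like "WITH ... DELETE".
    conn.execute("PRAGMA query_only = ON")
    cur = conn.execute(sql)
    cols = [c[0] for c in cur.description or []]
    return [dict(zip(cols, row)) for row in cur.fetchmany(MAX_ROWS)]


_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculate(expression: str) -> float:
    """Evaluate plain arithmetic (+ - * / and parentheses) without eval()."""

    def ev(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression, mode="eval").body)
```

Keeping the tools as plain functions makes them easy to test separately from the model, and the read-only guard means a bad query from the agent can fail but cannot change the data.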

The interface

Three sections, one continuous snap-scroll page with URLs that update as you scroll (/overview, /visual, /projects).

Overview gives a CFO the 30-second read: stat cards, a 3D auto-rotating scatter of all 405 projects, the critical-project list, the AI executive briefing, cohort margins, and the margin distribution.

Visual is the forensics floor: US map colored by margin, change-order analysis, labor actual-vs-budget, RFI timeline, margin by cohort, material vs budget — each chart with selectable types and a one-sentence AI insight.

Projects / Deep Dive is where the agent lives. A sortable project list with thumbnails feeds into per-project detail pages with a 20-metric grid, three financial charts, clickable LaTeX formula popups (MathJax), the pre-generated AI summary, a change-order filter table, a floating notes FAB with CSV export, and the live agent investigation streaming underneath.

Under the hood

Backend — FastAPI + SQLite, deployed self-hosted behind a Vercel /api/* proxy at dsc-nyu-datathon.ethanpan.me. Dependency management via uv. Pre-aggregation is pure pandas.
AI — Anthropic Claude Haiku 4.5 for pre-generation, a full Claude Agent loop for deep dives. Every prompt is versioned in docs/05-prompt-design.md with justification.
Frontend — React + Vite on Vercel. Streamdown (by Vercel) handles streaming markdown, Geist is the typeface, MathJax renders the formula popups, and we iterated UI in v0.
Data — 1.46M rows in SQLite, with the 1.2M labor-log rows pre-aggregated to ~6K per-SOV summaries. Server queries return in under 100 ms.
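
The labor-log roll-up can be illustrated with a small sketch. The project does this step in pandas; the version below swaps in a SQL GROUP BY so it runs on the standard library alone, and the table and column names (labor_logs, labor_by_sov, project_id, sov_line, hours, cost) are assumptions, not the real schema.

```python
import sqlite3


def build_labor_summary(conn: sqlite3.Connection) -> int:
    """Collapse raw labor rows into one row per (project, SOV line).

    Returns the number of summary rows written.
    """
    conn.executescript(
        """
        DROP TABLE IF EXISTS labor_by_sov;
        CREATE TABLE labor_by_sov AS
        SELECT project_id,
               sov_line,
               COUNT(*)   AS entries,
               SUM(hours) AS total_hours,
               SUM(cost)  AS total_cost
        FROM labor_logs
        GROUP BY project_id, sov_line;
        -- Index the lookup key the per-project pages would filter on.
        CREATE INDEX idx_labor_by_sov_project ON labor_by_sov (project_id);
        """
    )
    return conn.execute("SELECT COUNT(*) FROM labor_by_sov").fetchone()[0]
```

Dashboard queries then read the small summary table instead of scanning the raw logs, which is the kind of reduction that makes sub-100 ms responses plausible.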

Full technical write-up lives in the repo's docs/ folder — architecture, agent design, prompt design, deployment, challenges, and a rubric self-assessment.

What I learned

The real lesson was about where to put the LLM. Our first instinct was to make everything live: agent on every page load, streaming everywhere. It was slow and flaky. Splitting the problem into two paths, pre-generated at startup (portfolio-wide, deterministic, cached) and live on demand (one project, transparent tool calls, user-initiated), made the system feel both instant and deep. That split (batch where you can, live where you must) is something I'll reach for every time I build agentic UX from now on.

The second lesson was that transparency is a feature. Showing every SQL query and web search the agent ran turned a black-box LLM into something a CFO could actually trust. That trace is the product.
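
One way to model that trace is sketched below: keep every tool call for the audit log, and keep only the most recent five in view for the rolling window. The class names, the field names, and the JSON event shape are hypothetical; the write-up does not describe the actual wire format.

```python
import json
from collections import deque
from dataclasses import asdict, dataclass


@dataclass
class ToolCall:
    turn: int
    tool: str
    tool_input: str


class ToolTrace:
    """Full history for auditing plus a bounded window for display."""

    def __init__(self, window: int = 5) -> None:
        self.calls = []                      # every call, never truncated
        self.visible = deque(maxlen=window)  # oldest entry drops off automatically

    def record(self, tool: str, tool_input: str) -> str:
        """Store one call and return it as a JSON event for the frontend."""
        call = ToolCall(turn=len(self.calls) + 1, tool=tool, tool_input=tool_input)
        self.calls.append(call)
        self.visible.append(call)
        return json.dumps(asdict(call))
```

The bounded deque keeps the on-screen dialogue short while the full list preserves everything the agent did for later review.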

From LinkedIn

https://www.linkedin.com/posts/ethan-pyy_vibe101-activity-7449209114989215744--FkL
