CCHMC Internal

ClinClaw Vision

An AI assistant that operates the EMR by sight: it sees the screen, grounds every click with local OCR, reasons through the clinical task, acts through the visible UI, and verifies what happened.

Synthetic, PHI-free demos from the development team

Clinical reasoning, visualized. Watch the agent read the chart while its reasoning builds on the right, flagging a critical potassium value, an acute kidney injury trend, and a drug-allergy conflict.

How It Works

See

The agent captures the EMR screen locally instead of depending on hidden back-end calls.

Ground

Local OCR supplies precise pixel coordinates for on-screen text, so the model can point at exactly the element it intends to act on.

Decide

A vision-capable model reads the clinical context and chooses the next step in the workflow.

Verify

Each click is checked against the screen state before the agent trusts the result.
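
The four steps above can be sketched as a single loop iteration. This is a minimal illustration, not ClinClaw's actual code: `OcrBox`, `run_step`, and `FakeScreen` are hypothetical names, the "model" is a stand-in callable, and the screen is a synthetic, PHI-free fake so the example runs without any OCR or vision dependency.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class OcrBox:
    """One piece of on-screen text with its pixel bounding box (hypothetical type)."""
    text: str
    x: int
    y: int
    w: int
    h: int

    def center(self) -> Tuple[int, int]:
        return (self.x + self.w // 2, self.y + self.h // 2)


def run_step(
    capture: Callable[[], str],
    ocr: Callable[[str], List[OcrBox]],
    decide: Callable[[List[OcrBox]], str],
    click: Callable[[int, int], None],
    expected_after: str,
) -> bool:
    """Run one see / ground / decide / act / verify iteration."""
    boxes = ocr(capture())                   # See + Ground: screen text with coordinates
    label = decide(boxes)                    # Decide: the model names the text to act on
    target = next((b for b in boxes if b.text == label), None)
    if target is None:
        return False                         # nothing visible to click; do not guess
    click(*target.center())                  # Act through the visible UI
    after = ocr(capture())                   # Verify: re-read the screen after the click
    return any(b.text == expected_after for b in after)


class FakeScreen:
    """Synthetic two-page UI standing in for an EMR (assumption for the demo)."""

    def __init__(self) -> None:
        self.page = "patient_list"
        self.clicks: List[Tuple[int, int]] = []

    def capture(self) -> str:
        return self.page

    def ocr(self, page: str) -> List[OcrBox]:
        if page == "patient_list":
            return [OcrBox("Open chart", 100, 200, 80, 20)]
        return [OcrBox("Lab results", 40, 60, 120, 24)]

    def click(self, x: int, y: int) -> None:
        self.clicks.append((x, y))
        if self.page == "patient_list" and 100 <= x <= 180 and 200 <= y <= 220:
            self.page = "chart"


screen = FakeScreen()
ok = run_step(screen.capture, screen.ocr, lambda boxes: "Open chart",
              screen.click, expected_after="Lab results")
print(ok, screen.clicks)
```

In this sketch the step only counts as successful when the expected text appears after the click, which mirrors the idea of not trusting an action until the screen confirms it.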

Development Team Demos

Proof

The model's verbatim reasoning

The same split-screen demo with the model's literal, unedited output from a real run.

Autonomy

Autonomous clinical safety review

The agent opens a flagged patient, reviews labs and medications by sight, and reports findings itself.

Cost

Same review on a budget model

Same clinical conclusions, roughly 13x cheaper; the smaller model takes more steps to get there.

Workflow

Guided clinical workflow

A TPN rate review presented as a step-by-step clinical workflow with on-screen annotations.

Technical

Why grounding matters

A vision-only model mis-clicks; OCR-assisted coordinates land reliably on the intended screen element.
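
One way to picture grounding is as snapping the model's approximate click to the OCR box whose text matches the intended label. The sketch below is illustrative only: `TextBox`, `ground_click`, the button labels, and all coordinates are hypothetical values chosen for the example, not taken from the demo.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TextBox:
    """OCR result: recognized text plus its pixel bounding box (hypothetical type)."""
    text: str
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)

    def center(self) -> Tuple[int, int]:
        return (self.left + self.width // 2, self.top + self.height // 2)


def ground_click(label: str, guess: Tuple[int, int],
                 boxes: List[TextBox]) -> Optional[Tuple[int, int]]:
    """Return the center of the matching OCR box nearest the model's guess, or None."""
    wanted = label.strip().lower()
    matches = [b for b in boxes if b.text.strip().lower() == wanted]
    if not matches:
        return None  # no visible text matches; the caller should not click

    def squared_distance(box: TextBox) -> int:
        cx, cy = box.center()
        return (cx - guess[0]) ** 2 + (cy - guess[1]) ** 2

    return min(matches, key=squared_distance).center()


# Synthetic layout: two adjacent buttons on one row.
boxes = [TextBox("Sign", 300, 400, 60, 24), TextBox("Cancel", 380, 400, 70, 24)]
raw_guess = (372, 430)  # a vision-only estimate that falls between and below the buttons

grounded = ground_click("Sign", raw_guess, boxes)
print(any(b.contains(*raw_guess) for b in boxes))  # does the raw guess hit any button?
print(grounded, boxes[0].contains(*grounded))      # the grounded point and whether it hits "Sign"
```

Matching on text first and distance second is a design choice for this sketch; it keeps a nearby but wrong element (here, "Cancel") from winning just because the raw guess drifted toward it.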

Visual Abstract

The idea

ClinClaw operates an electronic medical record the way a clinician does: by looking at the screen, deciding what to do, moving the cursor, and clicking. The design keeps protected health information local and makes every action traceable to something visible on screen.

EMR-agnostic

Because it works by sight, ClinClaw can operate across EMRs without waiting for vendor-specific interface projects.

Auditable by design

The agent acts through the visible UI and verifies state changes, creating a reviewable link between screen evidence and action.
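
A reviewable link between screen evidence and action could take the form of one record per step. The following is a sketch under stated assumptions, not ClinClaw's actual log format: `audit_record` and every field name are hypothetical, and storing only SHA-256 digests of the screenshots (so image content stays wherever the captures are kept) is a design choice made for this example.

```python
import hashlib
import json
from typing import Any, Dict, Tuple


def audit_record(step: int, before_png: bytes, after_png: bytes,
                 target_text: str, click_xy: Tuple[int, int],
                 verified: bool) -> Dict[str, Any]:
    """Build one audit entry tying a click to the screen state around it."""
    return {
        "step": step,
        # Digests identify the exact captures without embedding image content.
        "before_sha256": hashlib.sha256(before_png).hexdigest(),
        "after_sha256": hashlib.sha256(after_png).hexdigest(),
        "target_text": target_text,   # the OCR text the click was grounded on
        "click_xy": list(click_xy),   # where the cursor actually clicked
        "verified": verified,         # whether the expected state change was observed
    }


# Synthetic byte strings stand in for real screenshots.
record = audit_record(1, b"synthetic-before", b"synthetic-after",
                      "Open chart", (140, 210), True)
print(json.dumps(record, indent=2))
```

A reviewer holding the original captures can recompute the digests and confirm that the logged action corresponds to the screens the agent actually saw.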