CareVision is a real-time caregiver alert system that uses on-device computer vision and low-latency video streaming to give caregivers fast, trustworthy context when a resident needs attention.
Every feature in CareVision exists to serve one loop: something happens in a room, a caregiver finds out immediately, sees live context, and takes action. The entire cycle completes in under two seconds on a local network.
- Capture: camera runs at 30 fps; Vision samples at ~5 fps
- Detect: bed exit, upright pose, or motion spike
- Process: the backend validates, deduplicates, and broadcasts via SSE
- Notify: real-time push, no polling needed
- View: sub-second video with motion overlay
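Each pass through the loop moves one compact event from the sensor to the caregiver. A hypothetical shape for that payload in TypeScript (field names are illustrative, not the actual CareVision schema):

```typescript
// Hypothetical CV event that flows from the phone to the backend.
// eventId is client-generated so the backend can deduplicate retries.
export type CvEventKind = "bed_exit" | "upright_pose" | "motion_spike";

export interface CvEvent {
  eventId: string;    // client-generated, used for deduplication
  roomId: string;
  kind: CvEventKind;
  confidence: number; // 0..1 from the on-device detector
  observedAt: number; // epoch ms on the sensor device
}

// Minimal runtime check a backend might apply before accepting an event.
export function isCvEvent(value: unknown): value is CvEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.eventId === "string" &&
    typeof v.roomId === "string" &&
    (v.kind === "bed_exit" ||
      v.kind === "upright_pose" ||
      v.kind === "motion_spike") &&
    typeof v.confidence === "number" &&
    v.confidence >= 0 &&
    v.confidence <= 1 &&
    typeof v.observedAt === "number"
  );
}
```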
Each layer does one thing well and communicates through well-defined contracts. The iOS app handles capture and display. The backend handles orchestration. WebRTC handles media.
iOS app (SwiftUI + AVFoundation + Vision + LiveKit)
- Sensor role: camera capture, on-device Vision analysis, bed-exit state machine, LiveKit publish
- Caregiver role: SSE alert stream, live video subscriber, alert inbox, acknowledge actions, timeline
- Overlay: renders skeleton, bounding box, and motion trail on the live stream

Backend (Node.js + TypeScript + Fastify + Zod)
- Auth: demo sessions with bearer tokens, LiveKit JWT generation with role-based permissions
- Events: CV event processing with idempotency, alert state machine, timeline audit trail
- Push: real-time delivery to all connected caregivers when alerts are created or updated

Protocols (WebRTC + SSE + REST)
- WebRTC: low-latency video streaming between sensor and caregiver devices
- SSE: unidirectional push for alerts and timeline updates, no client polling
- REST: session creation, token requests, alert acknowledgment, timeline queries
The Vision pipeline runs entirely on the iPhone using Apple's Vision framework. No video frames leave the device, and no cloud ML is involved. The phone processes roughly 5 frames per second and feeds the results into a state machine that decides when something clinically relevant is happening.
1. Capture: AVFoundation grabs camera frames at 30 fps; the pipeline samples one every ~200 ms to keep CPU usage reasonable.
2. Detect: Vision finds people in the frame (bounding box) and estimates 19 body-pose landmarks per person.
3. Measure: the pipeline calculates upright score, leg extension, motion delta, and zone occupancy from the pose data.
4. Decide: the BedExitDetector state machine applies persistence and cooldown windows to avoid false positives.
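A persistence-plus-cooldown detector is simple to express. The real BedExitDetector is Swift; the sketch below is TypeScript, and its thresholds and names are illustrative assumptions, not CareVision's actual values:

```typescript
// Sketch of a bed-exit detector: the upright condition must persist for a
// while before an alert fires, and after firing a cooldown suppresses
// repeat alerts. Thresholds below are illustrative, not the real ones.
export class BedExitDetector {
  private candidateSince: number | null = null;
  private lastAlertAt = -Infinity;

  constructor(
    private readonly persistenceMs = 1500, // condition must hold this long
    private readonly cooldownMs = 30000,   // minimum gap between alerts
    private readonly threshold = 0.7,      // upright-score cutoff
  ) {}

  // Feed one sampled frame (~5 fps); returns true when an alert should fire.
  update(uprightScore: number, now: number): boolean {
    if (uprightScore < this.threshold) {
      this.candidateSince = null; // condition broken, reset persistence
      return false;
    }
    if (this.candidateSince === null) this.candidateSince = now;
    const heldLongEnough = now - this.candidateSince >= this.persistenceMs;
    const outOfCooldown = now - this.lastAlertAt >= this.cooldownMs;
    if (heldLongEnough && outOfCooldown) {
      this.lastAlertAt = now;
      this.candidateSince = null;
      return true;
    }
    return false;
  }
}
```

A brief upright pose that lasts less than the persistence window never fires, which is how transient poses (sitting up to adjust a pillow) avoid triggering false positives.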
Every alert follows a predictable lifecycle. There is no ambiguity about what state an alert is in or who acted on it. The timeline records every transition.
- New: just created
- Acknowledged: a caregiver saw it
- Escalated: needs more help
Alerts can also go directly from New to Escalated, skipping acknowledgment. There is no reopen or resolve in the MVP — once escalated, the alert is terminal.
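The lifecycle above amounts to a small transition table. A sketch in TypeScript (the lowercase state strings and the `canTransition` helper are illustrative, not the backend's actual identifiers):

```typescript
// Alert lifecycle as a transition table. Escalated is terminal in the
// MVP: no reopen, no resolve.
export type AlertState = "new" | "acknowledged" | "escalated";

const ALLOWED: Record<AlertState, AlertState[]> = {
  new: ["acknowledged", "escalated"], // escalation may skip acknowledgment
  acknowledged: ["escalated"],
  escalated: [], // terminal state
};

export function canTransition(from: AlertState, to: AlertState): boolean {
  return ALLOWED[from].includes(to);
}
```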
We do not use one protocol for everything. Each communication path uses the transport that matches its latency, reliability, and direction requirements.
WebRTC (live video + data channel): sub-second-latency video from sensor to caregiver. Also carries the motion-overlay data via an unreliable data channel, so the stream is never blocked by overlay failures.
SSE (alerts + timeline updates): persistent HTTP connection from the backend to every connected caregiver. When an alert is created or updated, the backend pushes it instantly. No polling, no WebSocket complexity.
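On the wire, an SSE push is just a text frame ending in a blank line, written to an HTTP response the server holds open. A sketch of the frame format (the function name and event name are illustrative):

```typescript
// Formats one Server-Sent Events frame per the event-stream wire format:
// an `event:` line naming the message type, a `data:` line of JSON, and a
// blank line terminating the frame.
export function formatSseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

A browser or iOS client parses these frames as they arrive on the open connection, so new alerts appear without any polling.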
REST (JSON over HTTP): used for actions that need confirmation: creating sessions, requesting tokens, acknowledging alerts, querying the timeline. Standard request-response where you need a guaranteed result.
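An acknowledge call, for example, is validated before it mutates anything. The real backend uses Zod for this; the hand-rolled check below just illustrates the contract, and the field names (`alertId`, `caregiverId`) are assumptions:

```typescript
// Sketch of validating an acknowledge-alert request body. Returns the
// typed request on success or null on a malformed body, so the route
// handler can reply 400 before touching any state.
export interface AckRequest {
  alertId: string;
  caregiverId: string;
}

export function parseAckRequest(body: unknown): AckRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as Record<string, unknown>;
  if (typeof b.alertId !== "string" || typeof b.caregiverId !== "string") {
    return null;
  }
  return { alertId: b.alertId, caregiverId: b.caregiverId };
}
```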
These are the performance budgets for the local demo environment. In a care setting, every second between detection and response matters.
The codebase is organized by responsibility. Backend source is 5 files. The iOS app separates features, services, and shared components.