Optimize Video Event Processing with a Time-Window-Based Approach

Vincent

2025-01-24

Programming

The Challenge

Recently, our Video Surveillance as a Service system encountered a significant challenge with event processing and notification management. Via the Video Management System, the system receives multiple 10-second video recordings from security CCTV cameras when an intrusion is detected. These recordings, captured at the same location within short time intervals, were being processed independently. This led to notification flooding, where users received separate alerts for what was essentially a single continuous event.

Understanding the Problem

Our customers use Dahua security cameras, which have built-in intrusion detection. When a possible intrusion triggers a camera, a series of 10-second video clips is created and delivered to our system through the Video Management System. Consider a scenario where someone walks through a monitored area for 30 seconds. This single event generates three separate video clips, and under the original system architecture, each clip triggered its own analysis and notification. This approach created several issues:

1. It fragmented what should be treated as a single event into multiple separate notifications.
1. It increased processing overhead by analyzing each short clip independently.
1. It diminished the system’s ability to provide comprehensive event analysis, as each clip was evaluated in isolation rather than as part of a continuous sequence.

Solution Design

To address these challenges, I have implemented a time-window-based event grouping system. This approach treats multiple recordings as part of a single event if they come from the same camera and occur within a defined time window, typically 120 seconds (2 minutes). The system handles the recordings in a window in three stages:

1. First Recording: When the first recording arrives, it triggers the creation of a new time window and initiates immediate processing and notification.
1. Intermediate Recordings: Subsequent recordings are counted but not individually processed within the time window, reducing system overhead while maintaining event tracking.
1. Window Completion: The time window closes either when it reaches its duration limit or when activity ceases, triggering a comprehensive analysis of the entire event sequence.

Technical Implementation

Let me walk through the detailed implementation, highlighting how we handle different recording scenarios and state transitions.

Event Window Model

The EventWindow model encapsulates all necessary information for tracking and processing video events:

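from datetime import datetime

from pydantic import BaseModel

# EventState is an enum that includes at least COLLECTING and COMPLETED.
class EventWindow(BaseModel):
    device_id: str                 # camera that produced the recordings
    state: EventState              # current state of the window
    start_time: str                # bookmark start time of the first recording
    first_bookmark_id: str         # bookmark ID of the first recording
    recording_count: int           # number of 10-second clips seen so far
    site_name: str
    window_closing_time: datetime  # later recordings start a new window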
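The snippets in the next section call window.to_redis_dict(), which is not shown in this post. As a rough sketch of what such a helper might do, the example below flattens a window into string values and rebuilds it again. It uses a plain dataclass in place of the pydantic model so that it runs with the standard library only; the EventState values and the from_redis_dict helper are assumptions for illustration, not the production code.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from enum import Enum


class EventState(str, Enum):
    # Assumed values; the post only names the two states.
    COLLECTING = "COLLECTING"
    COMPLETED = "COMPLETED"


@dataclass
class EventWindow:
    device_id: str
    state: EventState
    start_time: str
    first_bookmark_id: str
    recording_count: int
    site_name: str
    window_closing_time: datetime

    def to_redis_dict(self) -> dict[str, str]:
        """Flatten the window into string values, e.g. for a Redis hash."""
        data = asdict(self)
        data["state"] = self.state.value
        data["recording_count"] = str(self.recording_count)
        data["window_closing_time"] = self.window_closing_time.isoformat()
        return data

    @classmethod
    def from_redis_dict(cls, data: dict[str, str]) -> "EventWindow":
        """Rebuild a window from the flat string mapping (hypothetical helper)."""
        return cls(
            device_id=data["device_id"],
            state=EventState(data["state"]),
            start_time=data["start_time"],
            first_bookmark_id=data["first_bookmark_id"],
            recording_count=int(data["recording_count"]),
            site_name=data["site_name"],
            window_closing_time=datetime.fromisoformat(data["window_closing_time"]),
        )
```

Storing the closing time as an ISO 8601 string keeps the stored value human-readable and lets datetime.fromisoformat restore it without a custom parser.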

Event Processing Logic

The process_recording method serves as the central component of the implementation, handling three distinct scenarios: new windows, continued recording within a window, and recordings that arrive after a window has closed.

For new recordings where no window exists, the system initializes a fresh window and marks it for immediate processing:

if window is None:
    window = EventWindow(
        device_id=device_id,
        state=EventState.COLLECTING,
        start_time=bookmark_start_time,
        first_bookmark_id=bookmark_id,
        site_name=site_name,
        window_closing_time=event_time + timedelta(seconds=self.window_duration),
        recording_count=1
    )
    return {"action": "PROCESS_FIRST", "window": window.to_redis_dict()}

When a recording arrives after an existing window’s closing time, the system creates a new window, ensuring continuous but separate event tracking:

if event_time > window.window_closing_time:
    new_window = EventWindow(...)
    return {"action": "PROCESS_FIRST", "window": new_window.to_redis_dict()}

For recordings within an active window, the handling depends on how close the recording is to the window’s closing time:

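if window.state == EventState.COLLECTING:
    window.recording_count += 1
    # Each clip is 10 seconds long, so the count gives the total event duration.
    total_duration = window.recording_count * 10
    time_to_close = (window.window_closing_time - current_time).total_seconds()

    # A recording within 5 seconds of the closing time is treated as the last one.
    if time_to_close <= 5:
        window.state = EventState.COMPLETED
        return {
            "action": "PROCESS_LAST",
            "window": window.model_dump(),
            "total_duration": total_duration
        }

    # Otherwise the recording is only counted; no analysis or notification runs.
    return {
        "action": "COUNT_ONLY",
        "window": window.to_redis_dict(),
        "total_duration": total_duration
    }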

State Management

The implementation manages event processing through clearly defined states and transitions:

The COLLECTING state indicates an active window that’s accumulating recordings. During this state, the system tracks the recording count and calculates the total duration of the event sequence.
When a recording arrives within five seconds of the window’s closing time, the system transitions to a COMPLETED state and triggers final processing. This transition ensures proper closure of event sequences and initiates comprehensive analysis.
For recordings that arrive after a window has closed or when no window exists, the system creates a new window and begins fresh event tracking.
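To make the transitions concrete, here is a small self-contained simulation of the rules above. It is a sketch rather than the production code: it keeps windows in an in-memory dict instead of Redis, uses the recording time as the current time, and assumes that a recording arriving after a window has been marked completed starts a new window. The Window class and this simplified process_recording signature are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOW_SECONDS = 120  # window length from the post
CLOSE_THRESHOLD = 5   # "within five seconds of closing" rule from the post


@dataclass
class Window:
    closing_time: datetime
    recording_count: int = 1
    completed: bool = False


def process_recording(windows: dict[str, Window], device_id: str, event_time: datetime) -> str:
    """Return the action for one recording, updating the in-memory windows."""
    window = windows.get(device_id)

    # No window, an expired one, or one already completed: start a fresh window.
    if window is None or window.completed or event_time > window.closing_time:
        windows[device_id] = Window(closing_time=event_time + timedelta(seconds=WINDOW_SECONDS))
        return "PROCESS_FIRST"

    window.recording_count += 1
    remaining = (window.closing_time - event_time).total_seconds()
    if remaining <= CLOSE_THRESHOLD:
        window.completed = True
        return "PROCESS_LAST"
    return "COUNT_ONLY"


# Recordings from one camera at 0 s, 10 s, 20 s, 116 s and 130 s.
start = datetime(2025, 1, 24, 10, 0, 0)
windows: dict[str, Window] = {}
actions = [
    process_recording(windows, "cam-1", start + timedelta(seconds=offset))
    for offset in (0, 10, 20, 116, 130)
]
print(actions)
# ['PROCESS_FIRST', 'COUNT_ONLY', 'COUNT_ONLY', 'PROCESS_LAST', 'PROCESS_FIRST']
```

The recording at 116 seconds falls within five seconds of the closing time and completes the window, so the recording at 130 seconds opens a new one.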

Results

The solution has significantly improved how the system handles security camera recordings.

Enhanced Video Recording: When our system identifies a “PROCESS_LAST” event, we can now request a new video recording that spans the entire event duration. Instead of analyzing a standard 10-second clip, we generate a recording with a duration calculated by multiplying the total number of recordings in the window by 10 seconds.
Improved Analysis Quality: The extended duration of our final recording enables more accurate and comprehensive event analysis. Security personnel can now observe complete event sequences rather than fragmented clips, leading to better incident assessment.
Enhanced User Experience: Security personnel now receive more meaningful notifications that include complete event context. Instead of receiving multiple alerts for what is essentially one event, they get an initial notification followed by a comprehensive summary based on the complete sequence of activities.
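As an illustration of the duration calculation behind the enhanced recording, the sketch below derives the time range to request when a PROCESS_LAST action is returned. The full_event_range function is a hypothetical name, and the example assumes start_time is stored as an ISO 8601 string; the actual request to the Video Management System is not shown here.

```python
from datetime import datetime, timedelta

CLIP_SECONDS = 10  # clip length stated in the post


def full_event_range(start_time: str, recording_count: int) -> tuple[datetime, datetime]:
    """Return (start, end) of a recording that covers every clip in the window."""
    start = datetime.fromisoformat(start_time)
    total_duration = recording_count * CLIP_SECONDS
    return start, start + timedelta(seconds=total_duration)


# Three clips starting at 10:00:00 give a 30-second range ending at 10:00:30.
begin, end = full_event_range("2025-01-24T10:00:00", 3)
```

Because the duration is derived from the clip count, it matches the real elapsed time only when the clips are back to back.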

Future Enhancements

The current implementation of the time-window-based event processing system has proven effective, but there are two significant opportunities for enhancement that would further improve its capabilities and intelligence.

Automated Window Processing through Scheduled Checks:
While the current system successfully processes events when recordings arrive within the last five seconds of a window, this approach has limitations. We can enhance this by implementing a scheduled check mechanism that automatically processes windows regardless of final event timing.

This scheduled processor would run regularly, scanning for active windows that have reached their closing time. This approach ensures we capture and analyze all event sequences, even when the final recording occurs well before the window closes. The automated system would maintain the same processing logic but remove the dependency on event timing for window closure.
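One possible shape for the scheduled check is sketched below. It is only an outline under stated assumptions: windows are represented as plain dicts keyed by device ID, the function names are hypothetical, and how the check is triggered (cron job, background task, and so on) is left open.

```python
from datetime import datetime


def find_expired_windows(windows: dict[str, dict], now: datetime) -> list[str]:
    """Return device IDs whose window is still COLLECTING but past its closing time."""
    return [
        device_id
        for device_id, window in windows.items()
        if window["state"] == "COLLECTING" and window["window_closing_time"] <= now
    ]


def close_expired_windows(windows: dict[str, dict], now: datetime) -> list[dict]:
    """Mark expired windows COMPLETED and return the PROCESS_LAST actions to run."""
    actions = []
    for device_id in find_expired_windows(windows, now):
        window = windows[device_id]
        window["state"] = "COMPLETED"
        actions.append({
            "action": "PROCESS_LAST",
            "device_id": device_id,
            # Same duration rule as the event-driven path: 10 seconds per clip.
            "total_duration": window["recording_count"] * 10,
        })
    return actions
```

Marking the window COMPLETED before returning the action means a second run of the check does not process the same window twice.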

Dynamic Window Extension for High-Risk Events:
When the initial recording in a window is classified as high-risk, the system would automatically extend the monitoring period beyond the standard two-minute window. This extension would ensure we capture more context around potentially serious security events, leading to a more comprehensive threat assessment.
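A minimal sketch of how the extension could be expressed is shown below. The extra 180 seconds and the closing_time_for function are placeholders chosen for illustration; the actual extension length and the way a recording is classified as high-risk have not been decided.

```python
from datetime import datetime, timedelta

STANDARD_WINDOW_SECONDS = 120  # two-minute window from the post
HIGH_RISK_EXTRA_SECONDS = 180  # hypothetical extension; not a decided value


def closing_time_for(event_time: datetime, high_risk: bool) -> datetime:
    """Return the window closing time, extended when the first recording is high-risk."""
    duration = STANDARD_WINDOW_SECONDS
    if high_risk:
        duration += HIGH_RISK_EXTRA_SECONDS
    return event_time + timedelta(seconds=duration)
```

The result would replace the fixed event_time + window_duration calculation used when a new window is created.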