
AI-Powered CCTV and Public Safety — Seoul's Integrated Surveillance and Crime Prevention Network

How Seoul's 6,800 AI-enhanced CCTV cameras integrate with TOPIS for crime prevention, disaster response, traffic enforcement, and real-time incident detection, including edge AI processing and privacy safeguards.


The Scale of Seoul’s Surveillance Infrastructure

Seoul operates one of the densest urban CCTV networks in the developed world. As of early 2026, 6,800 cameras feed live video into the TOPIS transport and city management hub, covering arterial intersections, expressway segments, bus-only lanes, subway station entrances, public parks, school zones, and high-pedestrian-traffic commercial corridors. These 6,800 cameras represent only the city-government-managed layer; when private security cameras in apartment complexes, commercial buildings, and retail establishments are included, the total number of cameras in Seoul exceeds 1.1 million — roughly one camera for every nine residents.

The distinction between municipal and private cameras matters operationally. Municipal cameras are networked into TOPIS and accessible to the Seoul Metropolitan Police; private cameras are not, and accessing their footage requires a court order or the property owner’s consent under South Korea’s Personal Information Protection Act (PIPA). The AI-powered analytics discussed in this article apply exclusively to the 6,800 municipal cameras integrated into the city’s smart-city infrastructure.

From Passive Monitoring to AI-Driven Detection

First-generation CCTV in Seoul was purely reactive. Operators in the TOPIS control room watched banks of monitors and manually flagged incidents — a model that depended entirely on human attention and became less effective as the camera count grew. A control-room operator monitoring 20 feeds simultaneously has an average incident-detection rate of 45 percent within the first five minutes; beyond 30 feeds, detection rates fall below 20 percent. Scaling human monitoring to 6,800 cameras would require hundreds of operators per shift — economically impractical and operationally unreliable.

AI video analytics changed the equation. Starting in 2018, the Seoul Metropolitan Government began deploying edge AI accelerators (initially Intel Movidius and later NVIDIA Jetson-class modules) at camera locations, enabling real-time computer-vision inference directly on the camera hardware. Rather than streaming raw video to the TOPIS control room for human review, each camera now processes its own feed locally, detecting and classifying objects, tracking motion patterns, and flagging anomalous events. Only metadata and alert notifications — plus short video clips associated with flagged events — are transmitted to TOPIS, reducing bandwidth requirements by roughly three orders of magnitude compared to raw-video transmission.
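The metadata-only transmission described above can be sketched as follows. This is a hypothetical illustration, not the TOPIS wire format: the `Detection` fields, alert classes, and clip-naming scheme are all illustrative. The point is the reduction — a frame's raw pixels stay on the device, and only counts plus an optional alert record travel upstream.

```python
from dataclasses import dataclass
import json

# Hypothetical sketch of edge-side filtering: raw frames stay on the device;
# only structured metadata (and, for flagged events, a clip reference) is
# forwarded to the control room.

@dataclass
class Detection:
    frame_id: int
    obj_class: str      # e.g. "car", "bus", "pedestrian", "fallen"
    confidence: float
    bbox: tuple         # (x, y, w, h) in pixels

def summarize_frame(frame_id, detections, alert_classes=("fallen", "smoke", "fire")):
    """Reduce one frame's detections to transmit-ready metadata."""
    counts = {}
    for d in detections:
        counts[d.obj_class] = counts.get(d.obj_class, 0) + 1
    alerts = [d.obj_class for d in detections if d.obj_class in alert_classes]
    record = {"frame": frame_id, "counts": counts}
    if alerts:
        # Only alert frames carry a short-clip reference back to the hub.
        record["alerts"] = alerts
        record["clip"] = f"clip_{frame_id}.mp4"
    return record

dets = [Detection(101, "car", 0.91, (10, 20, 80, 40)),
        Detection(101, "car", 0.88, (120, 22, 78, 39)),
        Detection(101, "fallen", 0.76, (300, 200, 40, 90))]
payload = json.dumps(summarize_frame(101, dets))
```

A JSON record of this shape is a few hundred bytes, versus megabits per second for raw video — consistent with the roughly three-orders-of-magnitude bandwidth reduction cited above.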

The AI models running on each camera are multi-task: a single inference pass detects vehicles (classified by type), pedestrians (classified by activity — walking, running, standing, fallen), and environmental anomalies (smoke, fire, flooding). Specialized models handle additional tasks at specific camera locations: ANPR (automatic number-plate recognition) at enforcement cameras, crowd-density estimation at public-gathering sites, and perimeter-breach detection at restricted-access facilities.

| AI Detection Capability | Model Architecture | Edge Hardware | Latency |
| --- | --- | --- | --- |
| Vehicle detection & classification | YOLOv8 fine-tuned | NVIDIA Jetson Orin | < 50 ms |
| Pedestrian detection & activity | YOLOv8 + pose estimation | NVIDIA Jetson Orin | < 80 ms |
| ANPR (plate recognition) | OCR pipeline (CRNN) | NVIDIA Jetson Orin | < 200 ms per plate |
| Crowd density estimation | CSRNet (dilated CNN) | NVIDIA Jetson AGX | < 100 ms |
| Smoke / fire detection | Custom CNN binary classifier | NVIDIA Jetson Nano | < 150 ms |
| Anomalous motion (loitering, collapse) | Trajectory analysis (LSTM) | NVIDIA Jetson Orin | < 300 ms |

Crime Prevention and Deterrence

The primary public-safety justification for Seoul’s CCTV network is crime prevention. South Korea’s overall crime rate is low by international standards — the violent-crime rate is roughly one-fifth that of the United States — but specific crime categories, particularly sexual offenses, stalking, and violent assaults in entertainment districts, have driven public demand for more visible surveillance.

The AI-powered system supports crime prevention through three mechanisms.

Deterrence through visibility. Cameras are deliberately made visible rather than concealed, with illuminated housing and signage notifying the public of recording. Studies conducted by the Seoul Institute of Technology found that camera installation in previously uncovered areas correlated with a 15–25 percent reduction in reported street crime within the first 12 months, an effect attributed primarily to deterrence (potential offenders modifying behavior) rather than detection and arrest.

Real-time alerting for high-risk behaviors. The trajectory-analysis model flags specific behavioral patterns associated with elevated risk. A person following another person at a consistent distance for more than 200 meters triggers a “following alert.” A person loitering near a school entrance outside of school hours triggers a “school zone alert.” An individual lying motionless on the ground for more than 60 seconds triggers a “medical emergency alert.” These alerts appear on the TOPIS operator’s console with the associated video clip; the operator assesses the situation and dispatches police or emergency medical services as appropriate. False-positive rates for behavioral alerts are higher than for vehicle-detection tasks — approximately 30 percent of following alerts and 40 percent of loitering alerts are dismissed by operators as benign — but the system is calibrated for high recall (catching genuine incidents) at the expense of precision (accepting more false positives).
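The rule layer on top of the trajectory model can be sketched as below. This is a minimal illustration under stated assumptions — track positions arriving at 1 Hz in a local metre grid — with thresholds taken from the article (200 m of following, 60 s motionless); the function names and the 2–15 m "consistent distance" band are hypothetical.

```python
import math

# Hypothetical sketch of the behavioral-alert rules described above.
# Positions are (x, y) tuples in metres, sampled at 1 Hz.

def is_motionless_alert(positions, window_s=60, max_drift_m=0.5):
    """Medical-emergency alert: track barely moves for window_s samples."""
    if len(positions) < window_s:
        return False
    recent = positions[-window_s:]
    x0, y0 = recent[0]
    return all(math.dist((x0, y0), p) <= max_drift_m for p in recent)

def is_following_alert(track_a, track_b, min_path_m=200, gap_m=(2, 15)):
    """Following alert: B keeps a consistent gap while A covers min_path_m."""
    path = sum(math.dist(track_a[i], track_a[i + 1])
               for i in range(len(track_a) - 1))
    if path < min_path_m:
        return False
    gaps = [math.dist(a, b) for a, b in zip(track_a, track_b)]
    return all(gap_m[0] <= g <= gap_m[1] for g in gaps)

still = [(3.0, 4.0)] * 60                          # someone lying motionless
a = [(2.0 * i, 0.0) for i in range(151)]           # A walks 300 m
b = [(2.0 * i - 5.0, 0.0) for i in range(151)]     # B stays 5 m behind
```

Calibrating for high recall, as the article notes, means loosening thresholds like `max_drift_m` and widening `gap_m` — which is exactly what drives the 30–40 percent false-positive rates operators then filter out.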

Post-incident investigation support. When a crime is reported, investigators can query the TOPIS system for all camera footage within a specified radius and time window. AI-assisted search allows investigators to filter footage by vehicle type and color, pedestrian clothing color, and direction of travel — dramatically reducing the time required to trace a suspect’s movements compared to manual frame-by-frame review. A task that previously took 8–12 hours of analyst time can now be completed in under 30 minutes.
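An investigator-facing query of this kind can be sketched against the stored metadata index. All names here are illustrative, not the TOPIS schema: the sketch just shows the three filters the article describes — spatial radius, time window, and attribute matching — applied to indexed detections.

```python
from dataclasses import dataclass
from math import dist

# Hypothetical sketch of AI-assisted footage search: filter indexed
# detections by radius, time window, and attributes.

@dataclass
class IndexedDetection:
    camera_xy: tuple    # camera location, metres in a local grid
    t: int              # unix timestamp
    obj_class: str      # "car", "truck", "pedestrian", ...
    color: str
    heading: str        # direction of travel: "N", "S", "E", "W"

def search(records, center, radius_m, t0, t1, **attrs):
    hits = []
    for r in records:
        if dist(r.camera_xy, center) > radius_m or not (t0 <= r.t <= t1):
            continue
        if all(getattr(r, k) == v for k, v in attrs.items()):
            hits.append(r)
    return hits

recs = [
    IndexedDetection((100, 100), 1000, "car", "white", "N"),
    IndexedDetection((120, 90), 1030, "car", "black", "N"),
    IndexedDetection((5000, 5000), 1010, "car", "white", "N"),  # out of radius
]
white_cars = search(recs, center=(110, 95), radius_m=500, t0=990, t1=1100,
                    obj_class="car", color="white")
```

Because the query runs over pre-extracted metadata rather than raw video, it touches only the frames the edge models already classified — which is what collapses an 8–12-hour manual review into minutes.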

Disaster Response and Emergency Management

Public safety in Seoul extends beyond crime to natural disasters, infrastructure failures, and mass-casualty events. The CCTV network serves as the visual-intelligence layer for the city’s emergency response framework.

Flood monitoring. Cameras at low-lying underpasses, riverbank areas, and known flood-prone zones monitor water levels visually. The AI system detects rising water by tracking the boundary between wet and dry surfaces in the camera frame, triggering flood warnings before water-level gauges (which may be spaced hundreds of meters apart) register the rise. During the August 2022 Seoul flooding event — which killed 14 people, primarily in semi-basement apartments — post-incident analysis identified gaps in camera coverage along the Dorimcheon stream that have since been closed.
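The visual water-level logic can be sketched as below. Assumptions are labeled: the segmentation model is taken to output the pixel row of the wet/dry boundary each frame, and each camera carries a linear row-to-centimetres calibration; the 5 cm/min alert threshold is illustrative, not a published SMG figure.

```python
# Hypothetical sketch of camera-based flood warning. A per-camera
# calibration maps the detected wet/dry boundary row to water depth.

def water_level_cm(boundary_row, calib):
    """Linear per-camera calibration: pixel row -> water depth in cm."""
    return calib["cm_per_row"] * (calib["dry_row"] - boundary_row)

def flood_alert(rows, calib, rise_cm_per_min=5.0, interval_s=60):
    """Alert if the level rises faster than rise_cm_per_min between samples."""
    levels = [water_level_cm(r, calib) for r in rows]
    per_min = 60.0 / interval_s
    return any((b - a) * per_min > rise_cm_per_min
               for a, b in zip(levels, levels[1:]))

calib = {"dry_row": 400, "cm_per_row": 0.5}   # illustrative calibration
rising = [400, 380, 360]   # boundary climbing 20 rows/min = 10 cm/min
steady = [400, 399, 400]
```

The advantage over physical gauges, as noted above, is spatial: every calibrated camera is effectively a gauge, so the rise is caught wherever the water first appears in frame.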

Fire detection. Smoke and flame detection models running on cameras in parks, forested hillsides (Seoul is approximately 70 percent mountainous terrain), and commercial districts provide early-warning capability that supplements the fire department’s conventional alarm-based detection. A camera detecting smoke in a forested area on Bukhansan (the mountain straddling Seoul’s northern border) can alert the fire dispatch center 3–5 minutes faster than a civilian phone call, particularly during overnight hours when foot traffic is minimal.

Crowd management. The 2022 Itaewon Halloween crowd crush, which killed 159 people, fundamentally changed Seoul’s approach to crowd monitoring. AI crowd-density estimation models now run on cameras covering all major gathering sites: Hongdae, Itaewon, Gangnam Station, Myeongdong, Gwanghwamun Plaza, and Yeouido during festival periods. When estimated density exceeds a configurable threshold (currently set at 5 persons per square meter, the level associated with dangerous crowd compression), the system triggers alerts to both the TOPIS control room and the Seoul Metropolitan Police’s crowd-management division. Police can then activate crowd-control measures — closing subway exits, deploying barriers, broadcasting dispersal announcements — before conditions deteriorate to life-threatening levels.
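The density-threshold check itself is simple once the model output exists. CSRNet-style models produce a per-pixel density map whose sum approximates the head count; dividing by the camera's calibrated ground area gives persons per square metre. The sketch below assumes that output shape — the tiny 2×2 "map" stands in for a real density map and is purely illustrative.

```python
# Hypothetical sketch of the crowd-density alert: sum the density map to a
# head count, normalise by calibrated ground area, compare to threshold.

def crowd_alert(density_map, ground_area_m2, threshold=5.0):
    """Return (persons_per_m2, alert) for one frame's density map."""
    head_count = sum(sum(row) for row in density_map)
    density = head_count / ground_area_m2
    return density, density > threshold

# Illustrative stand-in for a CSRNet output: ~240 people on 40 m2.
density, alert = crowd_alert([[100.0, 50.0], [60.0, 30.0]],
                             ground_area_m2=40.0)
```

The 5 persons/m² default mirrors the threshold cited above; in practice the useful signal is the trend toward the threshold, which gives police the lead time to close exits before compression becomes dangerous.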

| Emergency Application | Detection Method | Response Trigger | Lead Time vs. Manual |
| --- | --- | --- | --- |
| Flood warning | Visual water-level tracking | Automated alert to TOPIS + emergency broadcast | 5–10 minutes earlier |
| Wildfire detection | Smoke/flame CNN classifier | Alert to fire dispatch | 3–5 minutes earlier |
| Crowd crush prevention | Density estimation (CSRNet) | Alert at > 5 persons/m² | Real-time (no manual equivalent) |
| Building collapse | Structural anomaly + dust plume | Alert to TOPIS + fire department | 1–2 minutes earlier |

Traffic Enforcement — The Sub-10-Second Pipeline

As documented in the TOPIS article, the CCTV network is the enforcement backbone for Seoul’s traffic laws. The complete pipeline from violation detection to fine issuance operates in under ten seconds.

The workflow proceeds as follows: (1) a camera with ANPR detects a violation — bus-lane incursion, illegal parking in a clearway, speed exceedance, red-light running; (2) the edge AI captures a timestamped image sequence showing the vehicle, its plate, and the violation context; (3) the ANPR model extracts the plate number via OCR; (4) the plate is cross-referenced against the vehicle registration database; (5) a violation notice is generated with the registered owner’s details; (6) the notice is queued for issuance. Steps 1 through 6 complete in less than ten seconds. The notice is reviewed by a human auditor within 48 hours (a regulatory requirement) before mailing or electronic delivery.
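Steps 3 through 6 can be sketched end to end as below. This is an illustration under loud assumptions: the registry lookup, plate normalisation, and notice fields are all stand-ins, since the real pipeline's interfaces are not public. What the sketch shows is why the sub-ten-second budget is easy to meet once OCR has run — everything downstream is a key lookup and record construction.

```python
import time

# Hypothetical stand-in for the vehicle registration database (step 4).
REGISTRY = {"12GA3456": {"owner": "REDACTED", "address": "REDACTED"}}

def issue_violation(ocr_text, violation_type, camera_id):
    t0 = time.monotonic()
    plate = ocr_text.replace(" ", "").upper()   # step 3: normalise OCR output
    record = REGISTRY.get(plate)                # step 4: registration lookup
    if record is None:
        # Unreadable or unregistered plates fall through to manual review.
        return {"status": "manual_review", "plate": plate}
    return {                                    # step 5: notice generation
        "status": "queued",                     # step 6: queued; human audit <48h
        "plate": plate,
        "violation": violation_type,
        "camera": camera_id,
        "elapsed_s": time.monotonic() - t0,
    }

notice = issue_violation("12 ga 3456", "bus_lane", "CAM-0417")
```

Note that the human auditor sits after the queue, as the regulatory requirement above dictates — automation handles detection through queuing, not final issuance.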

Annual enforcement throughput across all categories reached 7.9 million automated violations in 2024, spanning bus-lane incursions (2.1 million), illegal parking (1.4 million), speed violations (3.8 million), and red-light running (620,000). The revenue generated — while significant — is secondary to the behavioral effect: bus-lane compliance on camera-monitored corridors exceeds 96 percent, compared to 72 percent on unmonitored corridors.

Integration With the Smart-City Ecosystem

The CCTV network is not a standalone system. Its feeds and analytics outputs integrate with multiple components of Seoul’s smart-city infrastructure.

  • TOPIS. CCTV is TOPIS’s primary visual-intelligence source. The 6,800 feeds provide the situational awareness that TOPIS operators use for traffic management, incident response, and coordination with police and fire services.
  • AI traffic management. Vehicle counts, turning-movement data, and queue-length measurements derived from CCTV feeds are primary inputs to the AI signal-optimization system. Without camera-based perception, the AI would rely solely on inductive loop detectors, which provide counts but not classifications, speeds, or queue extents.
  • S-DoT sensors. Acoustic anomaly detection at S-DoT nodes supplements camera-based detection. A gunshot, vehicle collision, or breaking glass detected by a noise sensor triggers the nearest camera to pan (if PTZ-capable) toward the sound source, providing visual confirmation of the audio alert.
  • S-Map digital twin. Camera locations and fields of view are mapped in S-Map, enabling gap analysis — identifying areas where building geometry, vegetation, or terrain creates blind spots in camera coverage. The Itaewon coverage gaps identified after the 2022 crowd crush were discovered through S-Map analysis.
  • Smart parking. Cameras in parking-enforcement zones detect vehicles that overstay paid parking periods, triggering violation notices through the same ANPR pipeline used for moving violations.
  • Digital government. Citizen complaints about safety concerns (poorly lit alleys, suspicious activity, damaged infrastructure) are routed to the public-safety division and cross-referenced with CCTV coverage maps to assess whether the reported location is already monitored.
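The acoustic-to-camera handoff in the S-DoT bullet above can be sketched as a nearest-PTZ selection plus a pan bearing toward the sound source. Coordinates here are assumed to be a flat local grid in metres, and the camera records are illustrative; real dispatch logic would also account for occlusion and field-of-view limits.

```python
import math

# Hypothetical sketch: on an audio alert, pick the nearest PTZ-capable
# camera and compute the compass bearing it should pan to.

def nearest_ptz(cameras, source_xy):
    """cameras: list of dicts with 'id', 'xy', 'ptz' (bool)."""
    ptz = [c for c in cameras if c["ptz"]]
    if not ptz:
        return None
    return min(ptz, key=lambda c: math.dist(c["xy"], source_xy))

def pan_bearing_deg(camera_xy, source_xy):
    """Compass-style bearing (0 = north, clockwise) from camera to source."""
    dx = source_xy[0] - camera_xy[0]
    dy = source_xy[1] - camera_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360

cams = [{"id": "C1", "xy": (0, 0), "ptz": True},
        {"id": "C2", "xy": (50, 0), "ptz": True},
        {"id": "C3", "xy": (10, 10), "ptz": False}]   # fixed camera: skipped
cam = nearest_ptz(cams, (40, 30))
bearing = pan_bearing_deg(cam["xy"], (40, 30))
```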

Privacy Safeguards and Legal Framework

Operating 6,800 AI-enhanced cameras in a democracy requires a legal and ethical framework that balances public safety with individual privacy. South Korea’s Personal Information Protection Act (PIPA), enacted in 2011 and substantially amended in 2020, provides the statutory foundation.

Purpose limitation. Municipal CCTV may be operated only for specified purposes: crime prevention, facility safety, traffic enforcement, and disaster response. Using camera footage for purposes beyond these — tracking political protesters, monitoring personal relationships, commercial surveillance — is prohibited and subject to criminal penalties.

Edge processing for video analytics. Raw video from AI-equipped cameras is processed on the edge device at the camera location. Only structured metadata (vehicle counts, speeds, classifications, behavioral alerts) and short alert-associated video clips are transmitted to TOPIS. Raw video is stored on local solid-state drives at the camera site for 72 hours and then automatically overwritten unless flagged for retention in connection with a reported incident or enforcement action. This architecture ensures that the vast majority of video — showing ordinary people going about their daily lives — never leaves the camera pole.
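The 72-hour retention rule can be sketched as a purge pass over the camera's local clip store. The record fields are hypothetical; the logic mirrors the policy above — a clip survives past 72 hours only if it has been flagged in connection with an incident or enforcement action.

```python
# Hypothetical sketch of the on-camera retention policy: clips older than
# 72 hours are purged unless incident-flagged. Times are unix seconds.

RETENTION_S = 72 * 3600

def purge(clips, now):
    """clips: list of dicts with 'path', 'recorded_at', 'flagged'.
    Returns the clips that survive this purge pass."""
    return [c for c in clips
            if c["flagged"] or now - c["recorded_at"] <= RETENTION_S]

now = 1_000_000
clips = [
    {"path": "a.mp4", "recorded_at": now - 80 * 3600, "flagged": False},  # expires
    {"path": "b.mp4", "recorded_at": now - 80 * 3600, "flagged": True},   # retained
    {"path": "c.mp4", "recorded_at": now - 10 * 3600, "flagged": False},  # fresh
]
kept = [c["path"] for c in purge(clips, now)]
```

Running this as a local scheduled job, with the flag settable only via an audited request from TOPIS, is one plausible way to enforce the "video never leaves the camera pole" property the article describes.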

No facial recognition in public spaces. The Seoul Metropolitan Government has not deployed facial-recognition technology on its municipal CCTV network. While the AI models detect and track pedestrians, they classify by activity and body characteristics (height range, clothing color) rather than by facial identity. This policy reflects both PIPA’s strict biometric-data provisions and public sentiment: a 2023 Seoul Institute survey found that 68 percent of residents supported CCTV for crime prevention but 71 percent opposed facial recognition in public spaces.

Independent oversight. The Seoul CCTV Operations Committee, a body composed of city council members, privacy advocates, legal scholars, and technology experts, reviews camera installation requests, audit logs showing who accessed which footage, and AI model specifications on a quarterly basis. The committee has rejection authority: in 2024, it denied 14 of 87 proposed new camera installations on grounds that the proposed locations lacked sufficient justification or overlapped with existing coverage.

| Privacy Safeguard | Implementation |
| --- | --- |
| Purpose limitation | PIPA Article 25 — specified uses only |
| Edge processing | Raw video stays on camera; only metadata transmitted |
| 72-hour retention | Auto-overwrite unless incident-flagged |
| No facial recognition | Policy commitment; body-only tracking |
| Independent oversight | CCTV Operations Committee, quarterly review |
| Public notification | Illuminated camera housing, signage at all sites |

Challenges and Evolving Threats

AI model bias. Object-detection models trained primarily on daytime, clear-weather footage perform less reliably at night, in fog, and during heavy rain — precisely the conditions when crime risk and disaster risk are elevated. The SMG addresses this through continuous model retraining with adverse-condition datasets, but nighttime pedestrian detection accuracy still lags daytime performance by approximately 12 percentage points.

Adversarial attacks. Research has demonstrated that simple physical modifications — specific sticker patterns on clothing or vehicles — can cause YOLO-family detectors to misclassify or miss objects entirely. While no adversarial attack on Seoul’s CCTV network has been publicly reported, the theoretical vulnerability is acknowledged by the SMG’s cybersecurity division, which sponsors annual adversarial-robustness evaluations at KAIST and Seoul National University.

Scope creep. As AI capabilities improve, the temptation to expand surveillance applications grows. Gait recognition, emotion detection, and social-network inference from co-location patterns are all technically feasible with existing camera hardware and increasingly capable models. The CCTV Operations Committee’s role in reviewing and approving model specifications is intended to prevent scope creep, but the committee’s technical capacity to evaluate complex AI models is limited, and several privacy advocates have called for independent algorithmic audits by external technical organizations.

Maintenance at scale. The 6,800-camera network requires continuous maintenance — lens cleaning, housing repair, edge-device firmware updates, and network-connection troubleshooting. The SMG contracts maintenance to private vendors under service-level agreements requiring 95 percent camera uptime. Actual uptime averaged 92.3 percent in 2024, with the gap attributable primarily to construction-related fiber cuts and vandalism. The smart parking system’s in-ground sensors face similar maintenance challenges, suggesting that scaling IoT infrastructure reliably is a systemic issue across Seoul’s smart-city programs.

The Road to 2030

Seoul’s public-safety CCTV roadmap through 2030 has three priorities. First, closing the coverage gaps identified after the 2022 Itaewon tragedy by installing approximately 800 additional cameras at major gathering sites and entertainment districts, bringing the municipal total to approximately 7,600. Second, deploying mobile CCTV platforms — drones and vehicle-mounted camera systems streaming over 5G — for temporary events (festivals, protests, sporting events) where permanent camera installation is not justified. Third, advancing AI model capabilities toward multi-camera tracking, where a person or vehicle can be followed across non-overlapping camera views using appearance re-identification rather than facial recognition — preserving the no-facial-recognition policy while improving investigative efficiency for serious crimes.
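The appearance re-identification approach in the third priority can be sketched as embedding matching: each detection is reduced to an appearance vector (clothing, build, gait features — no facial features), and a query track is matched against candidate tracks from other cameras by cosine similarity. The embeddings and threshold below are illustrative.

```python
import math

# Hypothetical sketch of cross-camera re-identification by appearance
# embedding, with no facial features involved.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def reidentify(query_emb, gallery, threshold=0.8):
    """gallery: list of (track_id, embedding).
    Return the best-matching track id above threshold, else None."""
    best_id, best_sim = None, threshold
    for track_id, emb in gallery:
        sim = cosine(query_emb, emb)
        if sim > best_sim:
            best_id, best_sim = track_id, sim
    return best_id

gallery = [("cam2-t7", [0.9, 0.1, 0.4]),   # illustrative embeddings
           ("cam2-t8", [0.1, 0.9, 0.2])]
match = reidentify([0.88, 0.12, 0.41], gallery)
```

The threshold is the policy lever: set high, the system links tracks only for near-certain matches, which suits the stated use of investigating serious crimes rather than continuous population tracking.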

The ultimate goal is a public-safety system that prevents harm rather than merely documenting it — a system where AI detects rising risk (a crowd approaching dangerous density, a vehicle driving erratically, a flood approaching an underpass) fast enough for human responders to intervene before consequences become irreversible. Seoul is not there yet, but the combination of dense camera coverage, edge AI, sensor networks, and integrated command-and-control through TOPIS puts the technical foundation in place. The remaining challenges are institutional, ethical, and human — ensuring that the technology serves the public without becoming the kind of surveillance apparatus that the public rightly fears.

