Panoramic 3D perception is crucial for safety monitoring and autonomous operation in industrial scenarios, which demand full-field coverage together with geometric consistency and real-time performance.
S3KF is a unified panoramic 3D multi-object tracking framework that fuses a motorized rotating LiDAR with a four-camera array for synchronized full-surround geometry and appearance perception.
Unlike traditional 2D trackers, which suffer from projection singularities in panoramic images, and 3D trackers, which rely on redundant Euclidean parameterizations, S3KF introduces a geometrically consistent state representation on the unit sphere S².
Experiments in controlled and real-world industrial scenes show that the method achieves decimeter-level tracking accuracy, significantly reduces identity switches, and maintains real-time performance on-board, providing a scalable, infrastructure-free solution for industrial safety monitoring and panoramic multi-object tracking.
Research Highlights
360° Panoramic Perception: Novel integration of rotating LiDAR and quad-camera array for full-surround synchronized sensing without blind spots.
Geometrically Consistent Representation: Spherical state model on unit sphere S² eliminates projection singularities and over-parameterization inherent in traditional 2D/3D methods.
Multi-Modal Sensor Fusion: Extended spherical Kalman filter seamlessly fuses visual detection, LiDAR depth, and motion information for robust tracking.
Infrastructure-Free & Real-Time: Achieves decimeter-level accuracy with minimal deployment requirements, suitable for UAVs and ground robots in dynamic environments.
Open-Source Dataset: High-precision ground truth captured with a wearable LiDAR system, enabling reproducible research on panoramic 3D tracking.
Key Features & Contributions
🔧 Panoramic Sensing Hardware
A novel integrated system combining a rotating LiDAR and a quad-camera rig, providing full panoramic 3D coverage and synchronized multimodal sensing.
🌐 S² Geometric State Representation
A 2-DOF tangent-plane parameterization on the unit sphere that avoids redundant constraints and enables a unified 2D/3D tracking representation; see the sketch below.
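To make this concrete, the following is a minimal numerical sketch of a 2-DOF tangent-plane parameterization on S², using exponential and log maps to move between the sphere and local tangent coordinates. It is an illustration under our own conventions (function names such as `exp_map` are hypothetical), not code from the S3KF implementation.

```python
import numpy as np

def tangent_basis(mu):
    """Build an orthonormal basis (b1, b2) of the tangent plane at mu on S^2.

    Any 2-DOF perturbation delta = (d1, d2) is mapped back onto the sphere
    via the exponential map, so the state never leaves S^2 and needs no
    unit-norm constraint.
    """
    # Pick a helper axis that is not parallel to mu to avoid degeneracy.
    helper = np.array([1.0, 0.0, 0.0]) if abs(mu[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    b1 = np.cross(mu, helper)
    b1 /= np.linalg.norm(b1)
    b2 = np.cross(mu, b1)  # already unit length, since mu is orthogonal to b1
    return b1, b2

def exp_map(mu, delta):
    """Exponential map: move from mu along the 2-D tangent coordinate delta."""
    b1, b2 = tangent_basis(mu)
    v = delta[0] * b1 + delta[1] * b2      # tangent vector in R^3
    theta = np.linalg.norm(v)              # geodesic distance to travel
    if theta < 1e-12:
        return mu
    return np.cos(theta) * mu + np.sin(theta) * (v / theta)

def log_map(mu, x):
    """Log map: express unit vector x as a 2-D tangent coordinate at mu.

    The antipodal case (x = -mu, theta = pi) is degenerate and not handled here.
    """
    b1, b2 = tangent_basis(mu)
    theta = np.arccos(np.clip(mu @ x, -1.0, 1.0))  # geodesic angle between mu and x
    if theta < 1e-12:
        return np.zeros(2)
    v = x - np.cos(theta) * mu                     # tangent component of x
    v *= theta / np.linalg.norm(v)                 # rescale to geodesic length
    return np.array([v @ b1, v @ b2])
```

Because the perturbation lives in a 2-D tangent plane rather than in a norm-constrained 3-vector, the representation carries exactly the sphere's two degrees of freedom: covariances stay full-rank and no renormalization step is needed.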
📡 Extended Spherical Kalman Filter
An augmented state space that adds scale/depth and their velocities, enabling principled camera-LiDAR data fusion via an extended spherical Kalman filter; a sketch follows below.
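The sketch below shows one plausible way such a filter could be organized, reusing `exp_map`/`log_map` from the sketch above: an error-state filter whose nominal bearing lives on S² and whose 6-D state stacks the tangent-plane bearing error, angular velocity, depth, and depth rate. The state layout, noise values, and class interface are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
# Reuses exp_map / log_map from the previous sketch.

class SphericalEKF:
    """Hypothetical error-state EKF sketch with the augmented state
    [bearing error (2), angular velocity (2), depth, depth velocity].

    The nominal bearing mu lives on S^2; the covariance is kept over the
    6-D error state, so no unit-norm constraint enters the filter.
    """
    def __init__(self, mu0, depth0):
        self.mu = mu0 / np.linalg.norm(mu0)  # nominal bearing on S^2
        self.x = np.zeros(6)                 # [dth1, dth2, w1, w2, d, d_dot]; dth reset to 0 after updates
        self.x[4] = depth0
        self.P = np.eye(6) * 0.1             # error-state covariance
        self.Q = np.eye(6) * 1e-3            # process noise (tuning placeholder)

    def predict(self, dt):
        # Constant-velocity model: the bearing drifts along its angular
        # velocity on the sphere, the depth drifts along its rate.
        self.mu = exp_map(self.mu, self.x[2:4] * dt)
        self.x[4] += self.x[5] * dt
        F = np.eye(6)
        F[0, 2] = F[1, 3] = F[4, 5] = dt
        self.P = F @ self.P @ F.T + self.Q * dt

    def update_camera(self, z_bearing, R2=np.eye(2) * 1e-3):
        # Camera gives a bearing-only measurement: innovate in the tangent plane.
        y = log_map(self.mu, z_bearing / np.linalg.norm(z_bearing))
        H = np.zeros((2, 6)); H[0, 0] = H[1, 1] = 1.0
        self._kf_update(y, H, R2)

    def update_lidar(self, z_depth, R1=np.array([[1e-2]])):
        # LiDAR contributes metric depth along the bearing ray.
        y = np.array([z_depth - self.x[4]])
        H = np.zeros((1, 6)); H[0, 4] = 1.0
        self._kf_update(y, H, R1)

    def _kf_update(self, y, H, R):
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x += K @ y
        self.P = (np.eye(6) - K @ H) @ self.P
        # Reset: fold the bearing error back into the nominal state on S^2.
        self.mu = exp_map(self.mu, self.x[:2])
        self.x[:2] = 0.0
```

In use, `predict()` runs at the filter rate, while `update_camera()` and `update_lidar()` are applied asynchronously as each modality arrives; this is what lets bearing-only camera detections and metric LiDAR depth be fused in a single principled filter.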
📊 Infrastructure-Free GT Acquisition
A novel 3D trajectory ground-truth method based on wearable LiDAR, together with an open-source dataset of synchronized multimodal data and high-precision ground truth.
System Overview
Hardware Setup: Rotating Livox Mid-360 LiDAR + 4-channel fisheye camera array + embedded computing unit.
The system supports 360° panoramic sensing and can be mounted on UAVs, quadruped robots, and mobile platforms for infrastructure-free multi-object tracking.
Ground Truth Generation for 3D Tracking
A. Hardware and System Configuration
Our infrastructure-free mobile localization system integrates head-mounted LiDAR sensors and computing units. All devices connect via WiFi to share a synchronized time base and spatial reference, enabling rapid deployment across multiple platforms without pre-deployed base stations.
B. 3D Tracking Ground-Truth Generation
We construct a unified global coordinate frame from a high-quality LiDAR point-cloud map built with LiDAR-inertial odometry. Each wearable device localizes by registering its real-time LiDAR scans against this map, achieving centimeter-level accuracy (within 3 cm). This removes the need for inter-device calibration and avoids cumulative drift during experiments, enabling scalable multi-person trajectory acquisition.
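This description maps naturally onto scan-to-map registration. As a hedged illustration (the actual registration backend is not specified here), the sketch below uses point-to-plane ICP from Open3D to localize one wearable-LiDAR scan against the prebuilt global map; the function name and parameters are our own placeholders.

```python
import open3d as o3d

def localize_scan(scan_pcd, map_pcd, T_init, voxel=0.1):
    """Register one wearable-LiDAR scan against the prebuilt global map.

    Returns the 4x4 device pose in the map frame. T_init is the previous
    pose (or an odometry prediction), which keeps ICP inside its
    convergence basin.
    """
    scan = scan_pcd.voxel_down_sample(voxel)
    # Point-to-plane ICP needs normals on the target map
    # (in practice, precompute these once instead of per call).
    map_pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 3, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        scan, map_pcd,
        max_correspondence_distance=voxel * 5,
        init=T_init,
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation  # device pose in the global map frame
```

Since every device registers against the same static map, all returned poses share one global frame by construction, which is what removes inter-device calibration and keeps drift from accumulating.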
Real-world Experiments
Panoramic 3D Tracking: S3KF is validated in indoor, outdoor, dynamic, and complex environments.
It achieves centimeter-to-decimeter localization accuracy and substantially reduces identity switches compared with 2D trackers such as ByteTrack.
Field Testing Results
Indoor Environment with Quadruped (Dog) Platform
Indoor Environment with Drone Platform
Outdoor Environment with Quadruped (Dog) Platform
Outdoor Environment with Drone Platform
Conclusion
S3KF provides a unified framework for panoramic 3D multi-object tracking by introducing spherical geometry and multi-modal fusion.
It effectively addresses projection distortion, state redundancy, and unstable filtering in traditional methods, enabling robust, real-time, infrastructure-free panoramic perception for robotics and industrial applications.