
Ground Truth


Jun 2, 2025

3D-as-a-Service: Ultra-High-Fidelity Ground Truth for AI Training and Validation

Brad Rosen

COO & Cofounder


As artificial intelligence drives progress in autonomous mobility, robotics, and mapping, one foundational ingredient remains critical: high-quality training data. For AI models that interpret the physical world, success depends on precise ground truth: accurate 3D information about the scene and every object in it, including distance measurements to every object, feature, and point.

But achieving that kind of accuracy, scale, and alignment, especially in real-world outdoor environments, remains one of the hardest problems in the machine learning workflow.

LiDAR, the traditional tool for 3D data collection, is expensive, low resolution, and difficult to integrate with camera imagery. Synthetic datasets are helpful, but they often fail to generalize. And while 2D image annotation has matured, these solutions do not offer high-resolution, per-pixel depth labeling aligned with RGB video, which is what many AI perception systems require.

This is where NODAR Ground Truth comes in. NODAR Ground Truth is a depth annotation service that uses stereo vision to produce dense, pixel-aligned 3D point clouds from 2D video. It transforms stereo camera footage into accurate 3D Ground Truth that enables better training of monocular depth networks, stereo vision systems, localization models, end-to-end perception algorithms, and full-stack AV pipelines. It can also be used to validate the performance of existing AI systems.

By leveraging passive, off-the-shelf cameras and patented stereo processing algorithms, NODAR Ground Truth offers 10–100× better point cloud density than LiDAR, all from raw video data.

What's NODAR Ground Truth?

NODAR Ground Truth is a cloud-based software platform that turns synchronized video from two cameras into high-fidelity 3D datasets. At its core, it is a labeling solution purpose-built for depth annotation, assigning an accurate 3D value to every pixel in an image.

Here’s how it works:

  • The system receives raw video from two RGB cameras with overlapping Fields Of View (FOV)

  • Each image pair is processed as a stereo vision frame

  • Using sub-pixel accurate calibration and stereo matching, NODAR Ground Truth generates full-resolution depth maps and dense 6D point clouds (xyzrgb)

If a customer doesn’t have cameras with overlapping fields of view, NODAR offers a factory-calibrated stereo vision logger, the Hammerhead Development Kit (HDK), that makes it easy to capture stereo data in the field.

What sets NODAR Ground Truth apart from conventional labeling services is its automated approach to creating 3D from 2D using NODAR’s patented stereo vision algorithms. Each image pair is automatically calibrated, rectified (row-aligned), and passed through NODAR’s proprietary stereo matching algorithm, which compares the images geometrically to measure the depth of each pixel.


Input: RGB (upper left); Outputs: Depth Map (upper right), 3D Pointcloud (bottom)
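NODAR’s matcher itself is proprietary, but the geometry it relies on is standard rectified stereo: depth at each pixel follows z = f·B/d, where f is the focal length in pixels, B is the baseline, and d is the measured disparity. A minimal sketch of that conversion (the focal length and baseline values below are illustrative, not NODAR’s):

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (in pixels) from a rectified stereo pair
    into a metric depth map via z = f * B / d."""
    depth = np.full(disparity.shape, np.inf)  # zero-disparity pixels map to infinity
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Illustrative numbers: a 2000 px focal length, 1 m baseline, and 10 px
# disparity correspond to a point 200 m away.
d = np.array([[10.0, 0.0]])
z = disparity_to_depth(d, focal_px=2000.0, baseline_m=1.0)
```

Sub-pixel calibration matters here because any error in the measured disparity d propagates directly into the recovered depth.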

Key Features of the Platform

NODAR Ground Truth offers four primary capabilities:

Cloud-Based Processing
Ground Truth processing runs on a secure, cloud-hosted platform optimized for scale and throughput. Large datasets can be processed asynchronously and securely, with results returned in industry-standard formats.

Sub-Pixel Calibration
The platform uses patented algorithms to calibrate and rectify every image pair, enabling exceptionally crisp, long-range, and accurate point clouds, even in low-visibility and high-vibration environments.

ML-Ready Data
NODAR Ground Truth produces dense 6D (xyzrgb) point clouds and full-resolution depth maps, ready to be used directly in AI/ML pipelines for training and evaluation.
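As an illustration of how a full-resolution depth map relates to an xyzrgb point cloud, here is a generic pinhole back-projection sketch; the intrinsics and array layout are assumptions for illustration, not NODAR’s actual export format:

```python
import numpy as np

def depth_to_pointcloud(depth, rgb, focal_px, cx, cy):
    """Back-project a dense depth map into a 6D (xyzrgb) point cloud
    using a simple pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / focal_px          # metric x per pixel
    y = (v - cy) * depth / focal_px          # metric y per pixel
    xyz = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)              # per-point color from the RGB frame
    return np.hstack([xyz, colors])          # (N, 6) xyzrgb array

# Toy example: a flat plane 10 m away, 2x2 pixels.
depth = np.full((2, 2), 10.0)
rgb = np.zeros((2, 2, 3))
cloud = depth_to_pointcloud(depth, rgb, focal_px=1000.0, cx=1.0, cy=1.0)
```

Because every 3D point originates from a pixel, the cloud stays perfectly registered to the RGB image, which is the property that makes this data convenient for ML pipelines.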

Ultra-Long-Range Depth from Wide-Baseline Stereo
NODAR Ground Truth supports all stereo baseline configurations—from 10 cm to 3 m—but is uniquely capable of enabling wide-baseline stereo, which allows accurate long-range depth perception.

Why Baseline Matters in Stereo Vision

In stereo vision, the baseline—the distance between the two cameras—has a direct impact on depth precision. A wider baseline creates greater parallax, which increases the ability to distinguish depth at long distances. In essence, the range and accuracy of a stereo vision system are proportional to the distance between the cameras. Said differently, a system with cameras placed 1 m apart will have 10× the accuracy of a system with cameras separated by 10 cm.
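This proportionality can be illustrated with the standard stereo uncertainty approximation dz ≈ z²·dd / (f·B), where dd is the disparity matching error in pixels. A small sketch under assumed, illustrative camera parameters:

```python
import numpy as np

def depth_error(z_m, baseline_m, focal_px, disparity_err_px=0.1):
    """Approximate depth uncertainty for rectified stereo:
    dz ~= z^2 * dd / (f * B), with dd the matching error in pixels."""
    return (z_m ** 2) * disparity_err_px / (focal_px * baseline_m)

# At 100 m range, with an assumed 2000 px focal length and 0.1 px matching error:
narrow = depth_error(100.0, baseline_m=0.1, focal_px=2000.0)  # 10 cm baseline
wide = depth_error(100.0, baseline_m=1.0, focal_px=2000.0)    # 1 m baseline
```

With these numbers, the 1 m baseline yields a depth error ten times smaller than the 10 cm baseline at the same range, matching the proportionality described above. The quadratic z² term is also why wide baselines matter most at long range.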

NODAR Ground Truth is the only Ground Truth service that supports wide-baseline stereo vision at scale, and it does so without requiring expensive hardware or multi-sensor fusion.

  • Wider baseline = better range and resolution.

  • Detect objects over 200 meters away.

  • Achieve sub-percent depth accuracy at ranges exceeding 200 m.

  • Maintain perfect alignment with RGB imagery.

  • Avoid the common sparsity and temporal misalignment issues of LiDAR.

Key Use Cases for NODAR Ground Truth

The platform supports a wide variety of machine learning and computer vision workflows:

Train Monocular Depth Networks
Use stereo-derived Ground Truth to train or fine-tune monocular depth models. Dense, pixel-aligned data enables higher accuracy in diverse scenes.
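One common way to consume this kind of supervision is a loss restricted to pixels where ground truth is valid, since even dense depth maps can contain invalid regions (sky, occlusions). A generic sketch, not NODAR-specific:

```python
import numpy as np

def masked_l1_depth_loss(pred, gt):
    """L1 depth supervision restricted to pixels with finite, positive
    ground truth -- a common pattern when training monocular depth nets."""
    mask = np.isfinite(gt) & (gt > 0)
    return np.abs(pred[mask] - gt[mask]).mean()

# Two of the four pixels carry valid ground truth; only they contribute.
gt = np.array([[10.0, np.inf], [20.0, 0.0]])
pred = np.array([[11.0, 5.0], [18.0, 7.0]])
loss = masked_l1_depth_loss(pred, gt)
```

In a real training loop this would be expressed in a deep-learning framework, but the masking logic is the same.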

Ground Truth for Localization and Mapping
Generate precise, real-world 3D maps and spatial references for evaluating SLAM, odometry, and localization models, especially in unstructured or GNSS-denied environments.
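For example, a trajectory estimated by a SLAM or odometry system can be scored against a 3D reference with a simple RMSE metric (alignment of the two trajectories into a common frame is assumed and omitted here):

```python
import numpy as np

def trajectory_rmse(estimated, ground_truth):
    """Root-mean-square error between per-pose 3D positions of an
    estimated trajectory and an aligned ground-truth reference."""
    err = np.linalg.norm(estimated - ground_truth, axis=1)  # per-pose distance
    return float(np.sqrt((err ** 2).mean()))

gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
est = np.array([[0.0, 0.0, 0.0], [1.0, 0.3, 0.0], [2.0, 0.0, 0.4]])
rmse = trajectory_rmse(est, gt)
```

The value of a dense 3D reference is that such metrics can be computed per point or per pose without relying on GNSS.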

Train Stereo Vision Networks
Refine stereo disparity models with high-quality Ground Truth matched to your exact baseline and camera configuration.

End-to-End Autonomous Driving Stack Training
Enable full AV/ADAS pipelines with accurate RGB + depth data for object detection, semantic segmentation, lane estimation, and path prediction.

Data Capture and Workflow

NODAR Ground Truth can be used in two ways:

1. Use Your Own Cameras

If you have a stereo camera setup with overlapping fields of view, you can upload time-synchronized footage to the cloud or process it on-premises.

2. Use the Hammerhead Development Kit (HDK)

NODAR’s stereo vision logger includes:

  • Pre-calibrated stereo cameras

  • Synchronization cables

  • Mounting options

  • Upload and processing software

Processing Pipeline

The Ground Truth workflow is built for simplicity and scalability:

  1. Capture: record synchronized stereo video

  2. Upload: transfer files to the NODAR cloud or on-prem instance

  3. Process: the Hammerhead engine generates point clouds, depth maps, and confidence maps

Deliverables include:

  • Full-resolution depth maps

  • 3D point clouds (xyzrgb)

Ideal for Non-Real-Time Ground Truth Generation

While NODAR Ground Truth is not a real-time perception system, it is ideal for offline data generation in research, model training, or simulation validation:

  • AV / ADAS R&D

  • Simulation and synthetic data benchmarking

  • Infrastructure monitoring

  • Construction and mining analysis

  • Safety system prototyping

Because NODAR Ground Truth uses only passive RGB cameras, it is inherently cost-effective, robust, and easy to deploy across varied environments.

Summary: Scalable Depth Annotation for Modern AI

High-fidelity 3D Ground Truth is essential for developing safe, generalizable AI systems, but producing that data at scale has been an ongoing bottleneck.

NODAR Ground Truth solves this by providing a software-defined, stereo-based labeling solution that is:

  • Accurate at long range

  • Dense and pixel-aligned

  • Scalable across environments

  • Compatible with any stereo baseline

  • Format-ready for machine learning workflows

By transforming raw stereo video into ML-ready 3D data, NODAR Ground Truth unlocks better training for perception systems, faster validation for autonomous platforms, and more cost-effective data capture workflows for the real world.

Learn more: https://www.nodarsensor.com/products/groundtruth