In imaging, mapping the 3D world into a 2D representation relies on mathematical models, transformations, and calibration processes. These are driven by concepts such as camera projection, intrinsic and extrinsic parameters, and techniques for accurate calibration. Understanding these principles is critical for camera calibration and for applications like autonomous vehicles, robotics, and more.
In this blog, you’ll get expert insights into these concepts as we break down the mathematical foundation and practical applications of camera projection.
From 3D to 2D: The Concept of Camera Projection
At its core, camera projection involves mapping 3D points in the real world onto a 2D image plane. The transformation is often modeled using the pinhole camera model, which treats the camera as a simple device with a single aperture through which light enters.
The process can be summarized as:
x = P · X
Here:
- x: 2D pixel coordinates on the image plane
- P: The camera projection matrix (encapsulating intrinsic and extrinsic parameters)
- X: 3D world coordinates in homogeneous form
This transformation involves:
- Extrinsic transformation: Positions the camera relative to the scene by rotating and translating the 3D points.
- Intrinsic transformation: Maps these transformed points onto the 2D image plane using the camera’s internal properties.
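As a minimal sketch of these two stages (in NumPy, with made-up example values for the rotation, translation, and intrinsics), the pipeline looks like this:

```python
import numpy as np

# --- Hypothetical example values (for illustration only) ---
X_world = np.array([2.0, 1.0, 10.0])   # a 3D point in world coordinates
R = np.eye(3)                          # camera orientation (identity: axes aligned)
t = np.array([0.0, 0.0, 2.0])          # camera displacement
fx, fy = 800.0, 800.0                  # focal lengths in pixels
cx, cy = 320.0, 240.0                  # principal point

# Extrinsic transformation: world -> camera coordinates
X_cam = R @ X_world + t

# Intrinsic transformation: camera coordinates -> pixel coordinates
# (perspective divide by depth Z, then scale by focal length, shift to principal point)
u = fx * X_cam[0] / X_cam[2] + cx
v = fy * X_cam[1] / X_cam[2] + cy

print(u, v)  # pixel coordinates of the projected point
```

The perspective divide by the depth `X_cam[2]` is what makes distant objects appear smaller.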
Coordinate Systems in Camera Modeling
The journey of a 3D point to its 2D representation spans multiple coordinate systems:
- World coordinate system: Defines the 3D scene using a global reference frame.
- Camera coordinate system: Positions the camera as the origin, where the world is transformed relative to it.
- Image plane: The 2D plane onto which the camera projects the 3D points, capturing the scene.
- Pixel coordinate system: Represents the digital image in discrete pixel values, linking image plane coordinates to actual pixels.
Each transformation between these systems is guided by intrinsic and extrinsic parameters.
The Building Blocks of Camera Projection
Intrinsic Parameters: Defining the Camera’s Internal Properties
Intrinsic parameters describe the internal characteristics of a camera, encapsulating how it transforms 3D coordinates in the camera space to 2D coordinates on the image plane. These include:
- Focal length (fx, fy): The distance from the optical center to the image plane, expressed in pixel units along each axis; it dictates the scale of projection.
- Principal point (cx, cy): The point where the optical axis intersects the image plane, usually near the image center.
- Skew (s): Accounts for any deviation from orthogonality between the image axes (typically zero for modern sensors).
- Distortion parameters: Address lens distortions, such as barrel distortion (curving outward) or pincushion distortion (curving inward).
The intrinsic matrix K combines these parameters:

K = [ fx   s   cx ]
    [  0   fy  cy ]
    [  0   0    1 ]
Extrinsic Parameters: Locating the Camera in Space
These parameters define the relative position and orientation of the camera. They include:
- Rotation (R): A 3×3 orthogonal matrix representing the camera’s orientation.
- Translation (t): A 3×1 column vector that specifies the camera’s displacement from the origin.
Combined as:
[R|t]
These parameters enable seamless integration of the camera into a 3D environment, completing the mapping of the 3D world onto a 2D image.
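Putting the pieces together, a minimal NumPy sketch (all parameter values below are illustrative assumptions) composes P = K[R|t] and applies it to a homogeneous world point:

```python
import numpy as np

# --- Illustrative intrinsic matrix K (assumed values) ---
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# --- Illustrative extrinsics: identity rotation, small translation ---
R = np.eye(3)
t = np.array([[0.0], [0.0], [2.0]])

# Projection matrix P = K [R | t]  (a 3x4 matrix)
P = K @ np.hstack([R, t])

# A 3D world point in homogeneous form: X = (2, 1, 10, 1)
X = np.array([2.0, 1.0, 10.0, 1.0])

# x = P . X, then divide by the last (homogeneous) coordinate to get pixels
x = P @ X
u, v = x[0] / x[2], x[1] / x[2]
print(u, v)
```

Dividing by the third coordinate of x performs the perspective divide; this is why x, P, and X are written in homogeneous form.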
Figure 2: Checkerboard – Points Detection [Image Source: pho1-22-Zhang-calibration.pptx]
Camera Calibration: The Projection Matrix
Calibration helps determine the intrinsic and extrinsic parameters to ensure accurate mapping from the 3D world to the 2D image.
Intrinsic calibration
This step focuses on determining parameters like focal length, principal point, and lens distortion. Calibration involves capturing images of a known pattern (e.g., a checkerboard) and applying algorithms like Zhang’s method to compute the camera matrix.
Extrinsic calibration
Here, the goal is to estimate the camera’s position and orientation in the scene. By matching known 3D world points with their 2D projections, the extrinsic parameters can be derived.
Direct Linear Transform (DLT)
The DLT method is a foundational calibration technique that leverages correspondences between world points and image points. Each correspondence contributes two linear equations in the entries of the projection matrix, and the stacked system is solved using linear algebra techniques such as singular value decomposition (SVD); the recovered matrix can then be factored into intrinsic and extrinsic parameters (e.g., via RQ decomposition).
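A bare-bones sketch of DLT follows, using synthetic correspondences generated from a known projection matrix purely for illustration (all numeric values are assumptions): each 3D–2D pair yields two rows of a homogeneous linear system A·p = 0, which is solved with SVD.

```python
import numpy as np

# --- Synthetic ground truth (assumed values, for illustration) ---
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [0.2], [2.0]])])
P_true = K @ Rt

# Six or more 3D points in general (non-coplanar) position
Xs = np.array([[0, 0, 5], [1, 0, 6], [0, 1, 7], [1, 1, 5],
               [2, 1, 8], [1, 2, 9], [2, 2, 6]], dtype=float)

A = []
for X in Xs:
    Xh = np.append(X, 1.0)               # homogeneous 3D point
    x = P_true @ Xh
    u, v = x[0] / x[2], x[1] / x[2]      # "observed" pixel coordinates
    # Two equations per correspondence in the 12 unknown entries of P
    A.append(np.concatenate([Xh, np.zeros(4), -u * Xh]))
    A.append(np.concatenate([np.zeros(4), Xh, -v * Xh]))

# Least-squares solution: right singular vector of the smallest singular value
_, _, Vt = np.linalg.svd(np.array(A))
P_est = Vt[-1].reshape(3, 4)

# P is only defined up to scale; normalize both before comparing
P_est /= P_est[2, 3]
P_ref = P_true / P_true[2, 3]
print(np.allclose(P_est, P_ref, atol=1e-6))  # noise-free data recovers P exactly
```

With real, noisy measurements the SVD solution minimizes an algebraic error, so it is usually followed by a nonlinear refinement step.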
The Role of the Pinhole Camera Model
The pinhole camera model simplifies the projection process by ignoring lens distortion and assuming perfect optics. While idealized, it forms the basis for understanding the mathematics of projection.
In this model:
- Light rays converge at a single point (the pinhole) and pass through to the image plane.
- The resulting image is inverted and scaled, with the scaling determined by the focal length.
Real-world applications often extend this model by introducing distortion corrections.
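A quick numeric illustration of the pinhole scaling (the focal length and point values below are assumed for the example): moving an object twice as far from the pinhole halves its projected size.

```python
# Ideal pinhole projection: u = f * X / Z (principal point offset ignored)
f = 800.0  # assumed focal length in pixels

def project(X, Z, f=f):
    """Project a horizontal offset X at depth Z through an ideal pinhole."""
    return f * X / Z

# An object edge 1 m off-axis, seen at 5 m and then at 10 m
near = project(1.0, 5.0)    # offset in pixels from the optical axis
far = project(1.0, 10.0)    # doubling Z halves the projected offset
print(near, far)
```

This inverse relationship between depth and projected size is the geometric heart of the pinhole model.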
Lens Distortion and Its Correction
No lens is perfect, and distortions can significantly affect image quality. Common types include:
- Radial distortion: Straight lines appear curved. Two common forms:
  - Barrel distortion: Lines curve outward from the center
  - Pincushion distortion: Lines curve inward toward the center
Figure 4: Types of Lens Distortion and Its Correction
- Tangential distortion: Occurs due to lens misalignment, shifting the image asymmetrically
Figure 5: Tangential Distortion [Image Source: pho1-22-Zhang-calibration.pptx]
Mathematical models for correction typically add terms to the intrinsic calibration process to account for these distortions.
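One widely used correction model (the Brown–Conrady polynomial model found in many calibration toolkits) can be sketched as follows; the coefficient and coordinate values below are made-up examples:

```python
def distort(x, y, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion to
    normalized image coordinates (x, y) = (X/Z, Y/Z)."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2                      # radial polynomial
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)  # + tangential terms
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d

# Illustrative coefficients: mild barrel distortion (negative k1)
x_d, y_d = distort(0.5, 0.25, k1=-0.2, k2=0.05, p1=0.001, p2=-0.0005)
print(x_d, y_d)
```

During calibration these coefficients are estimated alongside the intrinsic matrix, and undistortion inverts the model (typically by iterative solving) before the pixel mapping is applied.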
Applications of Camera Projection
- 3D reconstruction: Recovering the 3D geometry of a scene from 2D images.
- Robotic vision: Enabling robots to perceive, navigate, and interact with their environment.
- Augmented reality: Overlaying digital content onto the physical world.
- Autonomous vehicles: Guiding vehicles using accurate camera-based 3D-to-2D perception and mapping for obstacle detection and navigation.
e-con Systems: A Pioneer in Delivering OEM Camera Solutions
e-con Systems has been designing, developing, and manufacturing OEM cameras since 2003. We offer state-of-the-art camera solutions, including MIPI camera modules, GMSL cameras, USB 3.1 Gen 1 cameras, stereo cameras, ToF cameras, and more. Over time, we have helped transform how industries approach embedded vision, including retail, medical and life sciences, industrial, agriculture, smart cities, and more.
Use our Camera Selector to check out our full product portfolio.
As always, if you require help integrating camera solutions into your embedded vision systems, please write to camerasolutions@e-consystems.com.
Prabu is the Chief Technology Officer and Head of Camera Products at e-con Systems, with more than 15 years of experience in the embedded vision space. He brings deep knowledge of USB cameras, embedded vision cameras, vision algorithms, and FPGAs. He has built 50+ camera solutions spanning domains such as medical, industrial, agriculture, retail, biometrics, and more, and also has expertise in device driver and BSP development. Currently, Prabu’s focus is on building smart camera solutions that power new-age AI-based applications.