Direct Linear Transformation (DLT)
Recording images using a camera is equivalent to mapping object point O in the object space to image point I' in the film plane (Fig. 1a). For digitization, this recorded image will be projected again to image I in the projection plane (Fig. 1b).
For simplicity, however, the projected image can be related directly to the object (Fig. 2): object O is mapped directly to the projected image I. The projection plane is called the image plane. Point N is the new node, or projection center.
Two reference frames are defined in Fig. 2: the object-space reference frame (the XYZ-system) and the image-plane reference frame (the UV-system). The optical system of the camera/projector maps point O in the object space to image I in the image plane. [x, y, z] are the object-space coordinates of point O, while [u, v] are the image-plane coordinates of the image point I. Points I, N & O are thus collinear. This is the so-called collinearity condition, the basis of the DLT method.
Now, assume the position of the projection center (N) in the object-space reference frame to be [xo, yo, zo] (Fig. 3). Vector A drawn from N to O then becomes [x - xo, y - yo, z - zo].
Add axis W to the image plane reference frame as the third axis to make the image-plane reference frame 3-dimensional (Fig. 4). The W-coordinates of the points on the image plane are always 0, and the 3-dimensional position of point I becomes [u, v, 0].
A new point P, the principal point, is introduced in Fig. 4. The line drawn from the projection center N to the image plane, parallel to axis W and perpendicular to the image plane, is called the principal axis, and the principal point is the intersection of the principal axis with the image plane. The principal distance d is the distance between points P and N. Assuming the image-plane coordinates of the principal point to be [uo, vo, 0], the position of point N in the image-plane reference frame becomes [uo, vo, d]. Vector B drawn from point N to I becomes [u - uo, v - vo, -d].
Since points O, I, and N are collinear, vectors A (Fig. 3) and B (Fig. 4) lie on a single straight line. The collinearity condition is equivalent to the vector expression

$$\mathbf{B} = c\,\mathbf{A}$$

where c = a scaling scalar. Note here that vectors A and B were originally described in the object-space reference frame and the image-plane reference frame, respectively. In order to directly relate the coordinates, it is necessary to describe them in a common reference frame. One good way to do this is to transform vector A to the image-plane reference frame:

$$\mathbf{A}^{(I)} = \mathbf{T}_{I/O}\,\mathbf{A}^{(O)}$$
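where A(I) = vector A described in the image-plane reference frame, A(O) = vector A described in the object-space reference frame, and TI/O = the transformation matrix from the object-space reference frame to the image-plane reference frame. Writing the elements of TI/O as r_ij and applying this transformation to the collinearity condition B = cA gives

$$\begin{bmatrix} u-u_o \\ v-v_o \\ -d \end{bmatrix} = c \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} x-x_o \\ y-y_o \\ z-z_o \end{bmatrix}$$

or, row by row,

$$u-u_o = c\,[r_{11}(x-x_o)+r_{12}(y-y_o)+r_{13}(z-z_o)]$$
$$v-v_o = c\,[r_{21}(x-x_o)+r_{22}(y-y_o)+r_{23}(z-z_o)]$$
$$-d = c\,[r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)]$$

From the third row, obtain

$$c = \frac{-d}{r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)}$$

Substitute this expression for c in the first two rows:

$$u-u_o = -d\,\frac{r_{11}(x-x_o)+r_{12}(y-y_o)+r_{13}(z-z_o)}{r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)}$$
$$v-v_o = -d\,\frac{r_{21}(x-x_o)+r_{22}(y-y_o)+r_{23}(z-z_o)}{r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)}$$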
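Note that u, v, uo & vo in the collinearity equations are image-plane coordinates in real-life length units, such as cm. In reality, however, the digitization system may use different length units, such as pixels, and the equations must accommodate this:

$$\lambda_u(u-u_o) = -d\,\frac{r_{11}(x-x_o)+r_{12}(y-y_o)+r_{13}(z-z_o)}{r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)}$$
$$\lambda_v(v-v_o) = -d\,\frac{r_{21}(x-x_o)+r_{22}(y-y_o)+r_{23}(z-z_o)}{r_{31}(x-x_o)+r_{32}(y-y_o)+r_{33}(z-z_o)}$$

where λu and λv = the unit conversion factors for the U and V axes, respectively, and r_ij = the elements of TI/O. As a result, u, v, uo & vo can be in any units. Also note that the two unit conversion factors can differ from each other.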
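Now, rearrange the collinearity equations, collecting the terms in x, y, and z:

$$u = \frac{L_1x+L_2y+L_3z+L_4}{L_9x+L_{10}y+L_{11}z+1}, \qquad v = \frac{L_5x+L_6y+L_7z+L_8}{L_9x+L_{10}y+L_{11}z+1}$$

where, with r_ij the elements of TI/O and λu, λv the unit conversion factors,

$$L_1 = \frac{u_or_{31}-d_ur_{11}}{D}, \quad L_2 = \frac{u_or_{32}-d_ur_{12}}{D}, \quad L_3 = \frac{u_or_{33}-d_ur_{13}}{D}$$
$$L_4 = \frac{(d_ur_{11}-u_or_{31})x_o+(d_ur_{12}-u_or_{32})y_o+(d_ur_{13}-u_or_{33})z_o}{D}$$
$$L_5 = \frac{v_or_{31}-d_vr_{21}}{D}, \quad L_6 = \frac{v_or_{32}-d_vr_{22}}{D}, \quad L_7 = \frac{v_or_{33}-d_vr_{23}}{D}$$
$$L_8 = \frac{(d_vr_{21}-v_or_{31})x_o+(d_vr_{22}-v_or_{32})y_o+(d_vr_{23}-v_or_{33})z_o}{D}$$
$$L_9 = \frac{r_{31}}{D}, \quad L_{10} = \frac{r_{32}}{D}, \quad L_{11} = \frac{r_{33}}{D}$$
$$[d_u, d_v] = \left[\frac{d}{\lambda_u}, \frac{d}{\lambda_v}\right], \qquad D = -(x_or_{31}+y_or_{32}+z_or_{33})$$

Coefficients L1 to L11 are the DLT parameters that reflect the relationships between the object-space reference frame and the image-plane reference frame.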
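As a numerical check of how the camera parameters map to the DLT parameters, the following sketch builds L1 to L11 from an orthogonal matrix, a projection center, a principal point, and the scaled principal distances, then confirms that the DLT form gives the same image coordinates as the collinearity condition. The function names and all sample values are illustrative assumptions, not part of the original method description.

```python
import numpy as np

def rotation_matrix(ax, ay, az):
    """Orthogonal 3x3 matrix built from three rotation angles (radians)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def dlt_parameters(T, center, uo, vo, du, dv):
    """L1..L11 (as L[0]..L[10]) from T, projection center, principal point, du, dv."""
    r1, r2, r3 = T                      # rows of the transformation matrix
    D = -(r3 @ center)
    L = np.empty(11)
    L[0:3] = (uo * r3 - du * r1) / D
    L[3] = (du * r1 - uo * r3) @ center / D
    L[4:7] = (vo * r3 - dv * r2) / D
    L[7] = (dv * r2 - vo * r3) @ center / D
    L[8:11] = r3 / D
    return L

def project_collinearity(T, center, uo, vo, du, dv, p):
    """Image coordinates straight from the collinearity condition."""
    a = T @ (p - center)                # vector A in the image-plane frame
    return np.array([uo - du * a[0] / a[2], vo - dv * a[1] / a[2]])

def project_dlt(L, p):
    """Image coordinates from the 11-parameter DLT equation."""
    R = L[8:11] @ p + 1.0
    return np.array([(L[0:3] @ p + L[3]) / R, (L[4:7] @ p + L[7]) / R])

# Made-up camera and object point, for illustration only
T = rotation_matrix(0.1, -0.2, 0.3)
center = np.array([5.0, -4.0, 3.0])
uo, vo, du, dv = 320.0, 240.0, 800.0, 780.0
p = np.array([0.5, 0.2, 1.0])

L = dlt_parameters(T, center, uo, vo, du, dv)
uv_dlt = project_dlt(L, p)
uv_direct = project_collinearity(T, center, uo, vo, du, dv, p)
```

Both routes should agree to rounding error, which is a quick way to catch a sign slip in any one of the eleven expressions.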
3-D DLT Method
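The standard 3-D DLT equation expresses u and v as ratios of linear functions of x, y, and z with coefficients L1 to L11, but one may include in it the optical errors from the lens:

$$u-\Delta u = \frac{L_1x+L_2y+L_3z+L_4}{L_9x+L_{10}y+L_{11}z+1}, \qquad v-\Delta v = \frac{L_5x+L_6y+L_7z+L_8}{L_9x+L_{10}y+L_{11}z+1}$$

where Δu, Δv = the optical errors. Optical errors can be expressed as

$$\Delta u = \xi\,(L_{12}r^2+L_{13}r^4+L_{14}r^6)+L_{15}(r^2+2\xi^2)+L_{16}\,\xi\eta$$
$$\Delta v = \eta\,(L_{12}r^2+L_{13}r^4+L_{14}r^6)+L_{15}\,\eta\xi+L_{16}(r^2+2\eta^2)$$

where

$$[\xi, \eta] = [u-u_o,\; v-v_o], \qquad r^2 = \xi^2+\eta^2$$

Among the five additional parameters, L12 - L14 are related to the optical distortion while L15 & L16 are for the de-centering distortion (Walton, 1981).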
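A minimal sketch of the optical error terms follows. It assumes the commonly used form with three radial terms (L12 - L14) and two de-centering terms (L15, L16), evaluated relative to the principal point; the function name and the sample values are illustrative assumptions.

```python
def optical_errors(u, v, uo, vo, L12=0.0, L13=0.0, L14=0.0, L15=0.0, L16=0.0):
    """Return (delta_u, delta_v) for image point (u, v) and principal point (uo, vo)."""
    xi, eta = u - uo, v - vo
    r2 = xi * xi + eta * eta                      # r squared
    radial = L12 * r2 + L13 * r2**2 + L14 * r2**3
    delta_u = xi * radial + L15 * (r2 + 2 * xi * xi) + L16 * xi * eta
    delta_v = eta * radial + L15 * xi * eta + L16 * (r2 + 2 * eta * eta)
    return delta_u, delta_v

# Made-up values: a point 3 units right of and 4 units above the principal point
radial_only = optical_errors(3.0, 4.0, 0.0, 0.0, L12=1e-3)
decentering_only = optical_errors(3.0, 4.0, 0.0, 0.0, L15=1e-3)
at_principal_point = optical_errors(0.0, 0.0, 0.0, 0.0, L12=1e-3, L15=1e-3, L16=1e-3)
```

With only L12 set, the error vector points along [ξ, η], as expected for a purely radial distortion; at the principal point every term vanishes.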
There are two different ways to use the DLT equation in the 3-D DLT method: camera calibration and raw coordinate computation.
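Rearrange the DLT equation with the optical errors to obtain

$$\frac{1}{R}\left(L_1x+L_2y+L_3z+L_4-L_9ux-L_{10}uy-L_{11}uz\right)+\Delta u = \frac{u}{R}$$
$$\frac{1}{R}\left(L_5x+L_6y+L_7z+L_8-L_9vx-L_{10}vy-L_{11}vz\right)+\Delta v = \frac{v}{R}$$

where R = L9x + L10y + L11z + 1. With the optical errors written out in terms of L12 to L16, and with [ξ, η] = [u - uo, v - vo] and r² = ξ² + η², this is equivalent to

$$\begin{bmatrix}
\frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & 0 & 0 & 0 & 0 & \frac{-ux}{R} & \frac{-uy}{R} & \frac{-uz}{R} & \xi r^2 & \xi r^4 & \xi r^6 & r^2+2\xi^2 & \xi\eta \\
0 & 0 & 0 & 0 & \frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & \frac{-vx}{R} & \frac{-vy}{R} & \frac{-vz}{R} & \eta r^2 & \eta r^4 & \eta r^6 & \xi\eta & r^2+2\eta^2
\end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix}
=
\begin{bmatrix} \frac{u}{R} \\ \frac{v}{R} \end{bmatrix}$$

Expand this for n control points by stacking the two rows of each control point i (i = 1, ..., n), evaluated at [xi, yi, zi], [ui, vi], and Ri:

$$\begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \\ \vdots \\ \mathbf{X}_n \end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix}
=
\begin{bmatrix} \mathbf{Y}_1 \\ \mathbf{Y}_2 \\ \vdots \\ \mathbf{Y}_n \end{bmatrix}$$

where X_i = the 2 x 16 coefficient matrix of control point i and Y_i = [ui/Ri, vi/Ri]'.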
In this system of equations, the object-space coordinates, [xi, yi, zi], are assumed to be known. A group of control points whose x, y & z coordinates are already known must be employed for this. The control points must not be co-planar. In other words, the control points must form a volume, the control volume. The control points are typically fixed to a calibration frame or control object.
If fewer than 16 parameters are used, discard the corresponding columns of the coefficient matrix and rows of the parameter vector. Feasible choices are 11, 12, 14, and 16.
Note also that the coefficient matrix requires Ri, which is a function of L9 - L11. It is impossible to solve this system directly, and an iterative approach must be used: L9 - L11 obtained from the previous iteration can be used in computing Ri in the current iteration.
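The expanded system is basically in the form of

$$\mathbf{X}\,\mathbf{L} = \mathbf{Y}$$

The DLT parameters can be obtained using the least square method:

$$\mathbf{L} = \left(\mathbf{X}^{t}\mathbf{X}\right)^{-1}\mathbf{X}^{t}\,\mathbf{Y}$$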
See the Least Square Method page for details of the least-square approach.
To obtain the DLT parameters and the additional parameters using the least square method, the system must be over-determined (number of equations > number of unknowns). Since each control point provides 2 equations, the minimum number of control points required is 6, 7, 8, and 9 for 11, 12, 14, and 16 parameters, respectively.
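The calibration step can be sketched for the 11-parameter case as follows. This is an illustration under stated assumptions rather than a reference implementation: optical errors are ignored, Ri is set to 1 on the first pass and refreshed from L9 - L11 on later passes, and `np.linalg.lstsq` is used in place of the explicit normal equations (it solves the same least-squares problem). The function name, the parameter values, and the random control points are made up.

```python
import numpy as np

def calibrate_dlt11(xyz, uv, n_iter=5):
    """Estimate L1..L11 from control points xyz (n x 3) and image points uv (n x 2)."""
    xyz = np.asarray(xyz, dtype=float)
    uv = np.asarray(uv, dtype=float)
    n = len(xyz)
    R = np.ones(n)                       # first pass: no weighting
    for _ in range(n_iter):
        X = np.zeros((2 * n, 11))
        Y = np.zeros(2 * n)
        for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
            X[2 * i] = np.array([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]) / R[i]
            X[2 * i + 1] = np.array([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]) / R[i]
            Y[2 * i], Y[2 * i + 1] = u / R[i], v / R[i]
        L, *_ = np.linalg.lstsq(X, Y, rcond=None)
        R = xyz @ L[8:11] + 1.0          # R_i for the next iteration
    return L

# Synthetic, noise-free check with made-up DLT parameters and control points
L_true = np.array([-160.0, 0.0, 64.0, 480.0, 0.0, -160.0, 48.0, 560.0, 0.0, 0.0, 0.2])
rng = np.random.default_rng(0)
xyz = rng.uniform(-1.0, 1.0, size=(10, 3))          # 10 non-coplanar control points
R_true = xyz @ L_true[8:11] + 1.0
uv = np.column_stack([(xyz @ L_true[0:3] + L_true[3]) / R_true,
                      (xyz @ L_true[4:7] + L_true[7]) / R_true])
L_est = calibrate_dlt11(xyz, uv)
```

With noise-free data the weighting has no effect on the solution; with real digitized coordinates the weights change how residuals are balanced across control points, which is why Ri is updated between passes.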
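Rearrange the DLT equation with the optical errors for x, y & z:

$$(L_1-L_9\alpha)\,x+(L_2-L_{10}\alpha)\,y+(L_3-L_{11}\alpha)\,z = \alpha-L_4$$
$$(L_5-L_9\beta)\,x+(L_6-L_{10}\beta)\,y+(L_7-L_{11}\beta)\,z = \beta-L_8$$

where α = u - Δu and β = v - Δv. This is equivalent to the matrix expression

$$\begin{bmatrix} L_1-L_9\alpha & L_2-L_{10}\alpha & L_3-L_{11}\alpha \\ L_5-L_9\beta & L_6-L_{10}\beta & L_7-L_{11}\beta \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} \alpha-L_4 \\ \beta-L_8 \end{bmatrix}$$

Expand this for m cameras:

$$\begin{bmatrix}
L_1^{(1)}-L_9^{(1)}\alpha^{(1)} & L_2^{(1)}-L_{10}^{(1)}\alpha^{(1)} & L_3^{(1)}-L_{11}^{(1)}\alpha^{(1)} \\
L_5^{(1)}-L_9^{(1)}\beta^{(1)} & L_6^{(1)}-L_{10}^{(1)}\beta^{(1)} & L_7^{(1)}-L_{11}^{(1)}\beta^{(1)} \\
\vdots \\
L_1^{(m)}-L_9^{(m)}\alpha^{(m)} & L_2^{(m)}-L_{10}^{(m)}\alpha^{(m)} & L_3^{(m)}-L_{11}^{(m)}\alpha^{(m)} \\
L_5^{(m)}-L_9^{(m)}\beta^{(m)} & L_6^{(m)}-L_{10}^{(m)}\beta^{(m)} & L_7^{(m)}-L_{11}^{(m)}\beta^{(m)}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix} \alpha^{(1)}-L_4^{(1)} \\ \beta^{(1)}-L_8^{(1)} \\ \vdots \\ \alpha^{(m)}-L_4^{(m)} \\ \beta^{(m)}-L_8^{(m)} \end{bmatrix}$$

where the superscript (j) denotes camera j (j = 1, ..., m). At least two cameras are needed, since each camera provides only 2 equations for the 3 unknowns. Again, the least square method used for camera calibration can be used in computing the 3-D coordinates of the markers on the subject's body.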
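Reconstruction from several calibrated cameras can be sketched as below. Each camera contributes two linear equations in x, y, and z, and the stacked system is solved by least squares. The sketch assumes the optical errors are negligible (so the raw image coordinates are used directly); the function names and the two sets of DLT parameters are made-up illustrative values.

```python
import numpy as np

def project_dlt(L, p):
    """Image coordinates of object point p from the 11-parameter DLT equation."""
    R = L[8:11] @ p + 1.0
    return np.array([(L[0:3] @ p + L[3]) / R, (L[4:7] @ p + L[7]) / R])

def reconstruct_3d(Ls, uvs):
    """Least-squares object-space point from m >= 2 cameras.

    Ls  : sequence of 11-element DLT parameter arrays, one per camera
    uvs : sequence of (u, v) image coordinates, one per camera
    """
    rows, rhs = [], []
    for L, (u, v) in zip(Ls, uvs):
        rows.append(L[0:3] - L[8:11] * u)    # coefficients of x, y, z (u equation)
        rhs.append(u - L[3])
        rows.append(L[4:7] - L[8:11] * v)    # coefficients of x, y, z (v equation)
        rhs.append(v - L[7])
    xyz, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return xyz

# Two made-up cameras with different projection centers
L_a = np.array([-160.0, 0.0, 64.0, 480.0, 0.0, -160.0, 48.0, 560.0, 0.0, 0.0, 0.2])
L_b = np.array([160.0, 0.0, -64.0, 160.0, 0.0, 160.0, -48.0, -80.0, 0.0, 0.0, -0.2])
point = np.array([0.3, -0.4, 0.8])
uvs = [project_dlt(L, point) for L in (L_a, L_b)]
point_est = reconstruct_3d([L_a, L_b], uvs)
```

Adding more cameras only appends rows to the same system, so the function works unchanged for m > 2.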
Camera Position and the Principal Point
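From the definitions of the DLT parameters L1 to L11, the position of the projection center, [xo, yo, zo], satisfies

$$\begin{bmatrix} L_1 & L_2 & L_3 \\ L_5 & L_6 & L_7 \\ L_9 & L_{10} & L_{11} \end{bmatrix}
\begin{bmatrix} x_o \\ y_o \\ z_o \end{bmatrix}
=
\begin{bmatrix} -L_4 \\ -L_8 \\ -1 \end{bmatrix}$$

which can be solved for [xo, yo, zo]. The u-coordinate of the principal point follows from the same definitions:

$$u_o = \frac{L_1L_9+L_2L_{10}+L_3L_{11}}{L_9^2+L_{10}^2+L_{11}^2}$$

Similarly,

$$v_o = \frac{L_5L_9+L_6L_{10}+L_7L_{11}}{L_9^2+L_{10}^2+L_{11}^2}$$

Both expressions for the principal point are based on the orthogonality of the transformation matrix TI/O:

$$r_{31}^2+r_{32}^2+r_{33}^2 = 1, \qquad r_{11}r_{31}+r_{12}r_{32}+r_{13}r_{33} = 0, \qquad r_{21}r_{31}+r_{22}r_{32}+r_{23}r_{33} = 0$$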
See the Transformation Matrix page for details of the orthogonality.
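A short sketch of recovering the camera position and the principal point from the DLT parameters follows. The function names are illustrative, and the sample parameters were worked out by hand for a hypothetical camera with an identity transformation matrix, projection center [1, 2, -5], principal point [320, 240], and du = dv = 800.

```python
import numpy as np

def camera_position(L):
    """Projection center [xo, yo, zo] from L1..L11 (stored as L[0]..L[10])."""
    A = np.array([L[0:3], L[4:7], L[8:11]])
    b = np.array([-L[3], -L[7], -1.0])
    return np.linalg.solve(A, b)

def principal_point(L):
    """Principal point (uo, vo), relying on orthogonality of the transformation matrix."""
    denom = L[8:11] @ L[8:11]            # L9^2 + L10^2 + L11^2
    uo = (L[0:3] @ L[8:11]) / denom
    vo = (L[4:7] @ L[8:11]) / denom
    return uo, vo

# Hand-computed parameters for the hypothetical camera described above
L = np.array([-160.0, 0.0, 64.0, 480.0, 0.0, -160.0, 48.0, 560.0, 0.0, 0.0, 0.2])
position = camera_position(L)
uo, vo = principal_point(L)
```

The position needs only a 3 x 3 linear solve and makes no use of orthogonality, whereas the principal point is exact only to the extent that the estimated parameters correspond to an orthogonal matrix.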
Scale Factors and the Transformation Matrix
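The transformation matrix follows from the definitions of the DLT parameters:

$$\mathbf{T}_{I/O} = D \begin{bmatrix}
\frac{u_oL_9-L_1}{d_u} & \frac{u_oL_{10}-L_2}{d_u} & \frac{u_oL_{11}-L_3}{d_u} \\
\frac{v_oL_9-L_5}{d_v} & \frac{v_oL_{10}-L_6}{d_v} & \frac{v_oL_{11}-L_7}{d_v} \\
L_9 & L_{10} & L_{11}
\end{bmatrix}$$

To obtain the transformation matrix, du and dv must be computed. From the definitions of the DLT parameters and the orthogonality of the transformation matrix:

$$D^2 = \frac{1}{L_9^2+L_{10}^2+L_{11}^2}$$
$$d_u^2 = D^2\left[(u_oL_9-L_1)^2+(u_oL_{10}-L_2)^2+(u_oL_{11}-L_3)^2\right]$$
$$d_v^2 = D^2\left[(v_oL_9-L_5)^2+(v_oL_{10}-L_6)^2+(v_oL_{11}-L_7)^2\right]$$

It is safe to assume that du and dv are positive. However, D can be either positive or negative, since only D² is determined. Use the positive value of D first and compute the determinant of the transformation matrix obtained. If the determinant is positive (right-handed system), D must be positive and the current matrix is correct. If the determinant is negative (left-handed), D must be negative: multiply the matrix obtained previously by -1. Three Eulerian angles may be computed from the nine elements of the transformation matrix. See the Eulerian Angles page for details.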
Note here that the DLT parameters computed using the least square method do not automatically guarantee an orthogonal transformation matrix because of experimental errors. This is an intrinsic problem of the DLT method. The Modified DLT method proposed by Hatze (1988) addresses this problem. See the Modified DLT page for details.
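The sign test on D can be sketched as follows. The code assumes du and dv are positive, tries the positive root of D first, and flips the sign of both D and the matrix when the determinant comes out negative. The function name is illustrative, and the two parameter sets were worked out by hand for hypothetical cameras with an identity transformation matrix, principal point [320, 240], du = dv = 800, and projection centers [1, 2, -5] (D = 5) and [1, 2, 5] (D = -5).

```python
import numpy as np

def transformation_matrix(L):
    """Return (T, D, du, dv) recovered from DLT parameters L1..L11."""
    L = np.asarray(L, dtype=float)
    l3 = L[8:11]
    s = l3 @ l3                          # L9^2 + L10^2 + L11^2
    uo = (L[0:3] @ l3) / s
    vo = (L[4:7] @ l3) / s
    D = 1.0 / np.sqrt(s)                 # try the positive root first
    row_u = uo * l3 - L[0:3]
    row_v = vo * l3 - L[4:7]
    du = D * np.linalg.norm(row_u)       # positive by construction
    dv = D * np.linalg.norm(row_v)
    T = D * np.vstack([row_u / du, row_v / dv, l3])
    if np.linalg.det(T) < 0:             # left-handed: D was actually negative
        T, D = -T, -D
    return T, D, du, dv

L_pos = np.array([-160.0, 0.0, 64.0, 480.0, 0.0, -160.0, 48.0, 560.0, 0.0, 0.0, 0.2])
L_neg = np.array([160.0, 0.0, -64.0, 160.0, 0.0, 160.0, -48.0, -80.0, 0.0, 0.0, -0.2])
T_pos, D_pos, du_pos, dv_pos = transformation_matrix(L_pos)
T_neg, D_neg, du_neg, dv_neg = transformation_matrix(L_neg)
```

With parameters estimated from noisy data the returned matrix is generally only approximately orthogonal, so checking `T @ T.T` against the identity gives a rough indication of calibration quality.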
2-D DLT Method
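In the case of 2-D analysis, the z-coordinate is always 0 and the mapping from the object-plane reference frame into the image-plane reference frame reduces to

$$u = \frac{L_1x+L_2y+L_3}{L_7x+L_8y+1}, \qquad v = \frac{L_4x+L_5y+L_6}{L_7x+L_8y+1}$$

with the eight remaining parameters numbered L1 to L8. Apply this to n (n >= 4) control points for camera calibration,

$$\begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -u_1x_1 & -u_1y_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -v_1x_1 & -v_1y_1 \\
\vdots \\
x_n & y_n & 1 & 0 & 0 & 0 & -u_nx_n & -u_ny_n \\
0 & 0 & 0 & x_n & y_n & 1 & -v_nx_n & -v_ny_n
\end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_8 \end{bmatrix}
=
\begin{bmatrix} u_1 \\ v_1 \\ \vdots \\ u_n \\ v_n \end{bmatrix}$$

and to m (m >= 1) cameras for reconstruction, where the superscript (j) denotes camera j:

$$\begin{bmatrix}
L_1^{(1)}-L_7^{(1)}u^{(1)} & L_2^{(1)}-L_8^{(1)}u^{(1)} \\
L_4^{(1)}-L_7^{(1)}v^{(1)} & L_5^{(1)}-L_8^{(1)}v^{(1)} \\
\vdots \\
L_1^{(m)}-L_7^{(m)}u^{(m)} & L_2^{(m)}-L_8^{(m)}u^{(m)} \\
L_4^{(m)}-L_7^{(m)}v^{(m)} & L_5^{(m)}-L_8^{(m)}v^{(m)}
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} u^{(1)}-L_3^{(1)} \\ v^{(1)}-L_6^{(1)} \\ \vdots \\ u^{(m)}-L_3^{(m)} \\ v^{(m)}-L_6^{(m)} \end{bmatrix}$$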
The object plane and the image plane do not have to be parallel. 2-D DLT guarantees accurate plane-to-plane mapping regardless of the orientation of the planes. The control points must not be collinear and must form a plane.
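A single-camera 2-D DLT can be sketched as below: eight parameters are estimated from four or more control points, and an image point is then mapped back to the object plane. The sketch assumes the eight-parameter form u = (L1x + L2y + L3)/(L7x + L8y + 1), v = (L4x + L5y + L6)/(L7x + L8y + 1) and uses unweighted least squares; the function names and all numerical values are illustrative assumptions.

```python
import numpy as np

def calibrate_dlt2d(xy, uv):
    """Estimate L1..L8 from n >= 4 control points on the object plane."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(xy, uv):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    L, *_ = np.linalg.lstsq(np.array(rows, dtype=float),
                            np.array(rhs, dtype=float), rcond=None)
    return L

def project_2d(L, xy):
    """Image coordinates of object-plane point xy."""
    x, y = xy
    R = L[6] * x + L[7] * y + 1.0
    return np.array([(L[0] * x + L[1] * y + L[2]) / R,
                     (L[3] * x + L[4] * y + L[5]) / R])

def reconstruct_2d(L, uv):
    """Object-plane coordinates of image point uv from one camera."""
    u, v = uv
    A = np.array([[L[0] - L[6] * u, L[1] - L[7] * u],
                  [L[3] - L[6] * v, L[4] - L[7] * v]])
    b = np.array([u - L[2], v - L[5]])
    return np.linalg.solve(A, b)

# Made-up parameters; the control points are the corners of a unit square
L_true = np.array([200.0, 10.0, 320.0, -5.0, 180.0, 240.0, 0.01, 0.02])
control_xy = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
control_uv = [project_2d(L_true, p) for p in control_xy]
L_est = calibrate_dlt2d(control_xy, control_uv)
xy_est = reconstruct_2d(L_est, project_2d(L_true, (0.25, 0.6)))
```

Because the object and image planes need not be parallel, L7 and L8 are generally non-zero; they carry the perspective part of the plane-to-plane mapping.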
The accuracy of camera calibration and reconstruction can be assessed by computing the calibration error and/or the reconstruction error. The calibration error of a given camera is defined as
The DLT and additional parameters obtained through the calibration can be applied back to the control points for the computation of their reconstructed coordinates. The reconstruction error is the deviation of the reconstructed coordinates from the measured:
where = reconstructed coordinates of the control point.
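As an illustration only, a reconstruction error could be computed as sketched below. The exact formula is not given in the text here, so the sketch assumes a root-mean-square distance between the reconstructed and the measured control point coordinates; the function name and sample coordinates are hypothetical.

```python
import numpy as np

def reconstruction_error(measured, reconstructed):
    """RMS distance between measured and reconstructed points (assumed definition)."""
    diff = np.asarray(reconstructed, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff**2, axis=1))))

# Made-up example: one of two control points is off by 1 unit along z
measured = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
reconstructed = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
error = reconstruction_error(measured, reconstructed)
```

The result is in the same length unit as the control point coordinates, so it can be compared directly with the size of the control volume.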
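A self-extrapolation scheme can be employed to improve the reliability of the calibration/reconstruction error (Kwon, 1989). Only half of the control points are used in the computation of the parameters while all points are used in the reconstruction. The reconstruction error computed in this way is more reliable and better reflects the actual object-space deformation error. The minimum number of control points required for self-extrapolation depends on the number of parameters. Each control point provides 2 equations and the calibration system must be over-determined, so with only half of the points entering the calibration, 12, 14, 16, and 18 control points are needed for 11, 12, 14, and 16 parameters, respectively.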
There are several important issues:
References & Related Literature
Abdel-Aziz, Y.I., & Karara, H.M. (1971). Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry. Proceedings of the Symposium on Close-Range Photogrammetry (pp. 1-18). Falls Church, VA: American Society of Photogrammetry.
Chen, L. (1985). A selection scheme for non-metric close-range photogrammetric systems. Unpublished Doctoral Dissertation, University of Illinois, Urbana-Champaign.
Chen, L., Armstrong, C.W., & Raftopoulos, D.D. (1994). An investigation on the accuracy of three-dimensional space reconstruction using the direct linear transformation technique. J. Biomech 27, 493-500.
Dapena, J., Harman, E.A., & Miller, J.A. (1982). Three-dimensional cinematography with control object of unknown shape. J. Biomech 15, 11-19.
Hatze, H. (1988). High-precision three-dimensional photogrammetric calibration and object space reconstruction using a modified DLT-approach. J. Biomech 21, 533-538.
Hinrichs, R.N., & McLean, S.P. (1995). NLT and extrapolated DLT: 3-D cinematography alternatives for enlarging the volume of calibration. J. Biomech 28, 1219-1224.
Kwon, Y.-H. (1989). The effects of different control point conditions on the DLT calibration accuracy. Unpublished class project report, Pennsylvania State University.
Kwon, Y.-H. (1994). KWON3D Motion Analysis Package 2.1 User's Reference Manual. Anyang, Korea: V-TEK Corporation.
Marzan, G.T. & Karara, H.M. (1975). A computer program for direct linear transformation solution of the collinearity condition, and some applications of it. Proceedings of the Symposium on Close-Range Photogrammetric Systems (pp. 420-476). Falls Church, VA: American Society of Photogrammetry.
Miller, N.R., Shapiro, R., & McLaughlin, T.M. (1980). A technique for obtaining spatial kinematic parameters of segments of biomechanical systems from cinematographic data. J. Biomech 13, 535-547.
Shapiro, R. (1978). Direct linear transformation method for three-dimensional cinematography. Res. Quart. 49, 197-205.
Walton, J.S. (1981). Close-range cine-photogrammetry: a generalized technique for quantifying gross human motion. Unpublished Ph.D. Dissertation, Pennsylvania State University, University Park.
© Young-Hoo Kwon, 1998-