
Direct Linear Transformation (DLT)

Recording images using a camera is equivalent to mapping object point O in the object space to image point I' in the film plane (Fig. 1a). For digitization, this recorded image is projected again to image I in the projection plane (Fig. 1b).
For simplicity, however, the projected image can be related directly to the object (Fig. 2): object O is mapped directly to the projected image I. The projection plane is then called the image plane, and point N is the new node, or projection center.
Two reference frames are defined in Fig. 2: the object-space reference frame (the XYZ-system) and the image-plane reference frame (the UV-system). The optical system of the camera/projector maps point O in the object space to image I in the image plane. [x, y, z] are the object-space coordinates of point O, while [u, v] are the image-plane coordinates of the image point I. Points I, N & O are thus collinear. This is the so-called collinearity condition, the basis of the DLT method. Now, assume the position of the projection center (N) in the object-space reference frame to be [x_{o}, y_{o}, z_{o}] (Fig. 3). Vector A drawn from N to O then becomes [x - x_{o}, y - y_{o}, z - z_{o}].
Add axis W to the image-plane reference frame as the third axis to make the image-plane reference frame 3-dimensional (Fig. 4). The W-coordinates of the points on the image plane are always 0, and the 3-dimensional position of point I becomes [u, v, 0].
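A new point P, the principal point, is introduced in Fig. 4. The line drawn from the projection center N to the image plane, parallel to axis W and perpendicular to the image plane, is called the principal axis, and the principal point is the intersection of the principal axis with the image plane. The principal distance d is the distance between points P and N. Assuming the image-plane coordinates of the principal point to be [u_{o}, v_{o}, 0], the position of point N in the image-plane reference frame becomes [u_{o}, v_{o}, d]. Vector B drawn from point N to I becomes [u - u_{o}, v - v_{o}, -d]. Since points O, I, and N are collinear, vectors A (Fig. 3) and B (Fig. 4) lie on a single straight line. The collinearity condition is equivalent to the vector expression

$$\mathbf{B} = c\,\mathbf{A}, \qquad [1]$$

where c = a scaling scalar. Note here that vectors A and B were originally described in the object-space reference frame and the image-plane reference frame, respectively. In order to directly relate the coordinates, it is necessary to describe them in a common reference frame. One good way to do this is to transform vector A to the image-plane reference frame:

$$\mathbf{A}^{(I)} = \mathbf{T}_{I/O}\,\mathbf{A}^{(O)}, \qquad [2]$$

where A^{(I)} = vector A described in the image-plane reference frame, A^{(O)} = vector A described in the object-space reference frame, and T_{I/O} = the transformation matrix from the object-space reference frame to the image-plane reference frame, with elements r_{ij}. Apply [2] to [1]:

$$\begin{bmatrix} u - u_o \\ v - v_o \\ -d \end{bmatrix} = c \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix} \begin{bmatrix} x - x_o \\ y - y_o \\ z - z_o \end{bmatrix}, \qquad [3]$$

or

$$\begin{aligned} u - u_o &= c\,[\,r_{11}(x - x_o) + r_{12}(y - y_o) + r_{13}(z - z_o)\,] \\ v - v_o &= c\,[\,r_{21}(x - x_o) + r_{22}(y - y_o) + r_{23}(z - z_o)\,] \\ -d &= c\,[\,r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)\,]. \end{aligned} \qquad [4]$$

From the third row of [4], obtain

$$c = \frac{-d}{r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)}. \qquad [5]$$

Substitute [5] for c in [4]:

$$\begin{aligned} u - u_o &= -d\,\frac{r_{11}(x - x_o) + r_{12}(y - y_o) + r_{13}(z - z_o)}{r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)} \\ v - v_o &= -d\,\frac{r_{21}(x - x_o) + r_{22}(y - y_o) + r_{23}(z - z_o)}{r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)}. \end{aligned} \qquad [6]$$

Note that u, v, u_{o} & v_{o} in [6] are the image-plane coordinates in a real-life length unit, such as cm. In reality, however, the digitization system may use a different length unit, such as pixels, and [6] must accommodate this:

$$\begin{aligned} u - u_o &= -\frac{d}{\lambda_u}\,\frac{r_{11}(x - x_o) + r_{12}(y - y_o) + r_{13}(z - z_o)}{r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)} \\ v - v_o &= -\frac{d}{\lambda_v}\,\frac{r_{21}(x - x_o) + r_{22}(y - y_o) + r_{23}(z - z_o)}{r_{31}(x - x_o) + r_{32}(y - y_o) + r_{33}(z - z_o)}, \end{aligned} \qquad [7]$$

where λ_{u}, λ_{v} = the unit conversion factors for the U and V axes, respectively. As a result, u, v, u_{o} & v_{o} in [7] can be in any unit. Also note that the two unit conversion factors in [7] can differ from each other. Now, rearrange [7] to collect the terms in x, y, and z:

$$u = \frac{L_1 x + L_2 y + L_3 z + L_4}{L_9 x + L_{10} y + L_{11} z + 1}, \qquad v = \frac{L_5 x + L_6 y + L_7 z + L_8}{L_9 x + L_{10} y + L_{11} z + 1}, \qquad [8]$$

where

$$\begin{aligned} L_1 &= \frac{u_o r_{31} - d_u r_{11}}{D}, \quad L_2 = \frac{u_o r_{32} - d_u r_{12}}{D}, \quad L_3 = \frac{u_o r_{33} - d_u r_{13}}{D}, \\ L_4 &= \frac{(d_u r_{11} - u_o r_{31})x_o + (d_u r_{12} - u_o r_{32})y_o + (d_u r_{13} - u_o r_{33})z_o}{D}, \\ L_5 &= \frac{v_o r_{31} - d_v r_{21}}{D}, \quad L_6 = \frac{v_o r_{32} - d_v r_{22}}{D}, \quad L_7 = \frac{v_o r_{33} - d_v r_{23}}{D}, \\ L_8 &= \frac{(d_v r_{21} - v_o r_{31})x_o + (d_v r_{22} - v_o r_{32})y_o + (d_v r_{23} - v_o r_{33})z_o}{D}, \\ L_9 &= \frac{r_{31}}{D}, \quad L_{10} = \frac{r_{32}}{D}, \quad L_{11} = \frac{r_{33}}{D}, \\ [d_u, d_v] &= \left[\frac{d}{\lambda_u}, \frac{d}{\lambda_v}\right], \qquad D = -(x_o r_{31} + y_o r_{32} + z_o r_{33}). \end{aligned} \qquad [9]$$

Coefficients L_{1} to L_{11} in [8] are the DLT parameters that reflect the relationships between the object-space reference frame and the image-plane reference frame.

3D DLT Method

[8] is the standard 3D DLT equation, but one may include in [8] the optical errors from the lens:

$$u - \Delta u = \frac{L_1 x + L_2 y + L_3 z + L_4}{L_9 x + L_{10} y + L_{11} z + 1}, \qquad v - \Delta v = \frac{L_5 x + L_6 y + L_7 z + L_8}{L_9 x + L_{10} y + L_{11} z + 1}, \qquad [10]$$

where Δu, Δv = the optical errors. Optical errors can be expressed as

$$\begin{aligned} \Delta u &= \xi\,(L_{12} r^2 + L_{13} r^4 + L_{14} r^6) + L_{15}(r^2 + 2\xi^2) + L_{16}\,\xi\eta \\ \Delta v &= \eta\,(L_{12} r^2 + L_{13} r^4 + L_{14} r^6) + L_{15}\,\eta\xi + L_{16}(r^2 + 2\eta^2), \end{aligned} \qquad [11]$$

where

$$[\xi, \eta] = [u - u_o,\; v - v_o], \qquad r^2 = \xi^2 + \eta^2. \qquad [12]$$

Among the five additional parameters shown in [11], L_{12} - L_{14} are related to the optical distortion while L_{15} & L_{16} are for the decentering distortion (Walton, 1981).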
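As a numerical check of [7] to [12], the following sketch builds L_{1} to L_{11} from an assumed camera geometry using [9], projects a point with [8], and compares the result with the collinearity form [7]. It assumes NumPy is available; the function names (`dlt_params`, `project`, `project_collinear`, `lens_error`) and all example values are hypothetical and chosen only for illustration.

```python
import numpy as np

def dlt_params(T, cam_pos, u0, v0, du, dv):
    """L1..L11 from camera geometry, following [9]."""
    T = np.asarray(T, dtype=float)
    cam_pos = np.asarray(cam_pos, dtype=float)
    r1, r2, r3 = T                       # rows of T_{I/O}
    D = -(r3 @ cam_pos)                  # D = -(xo r31 + yo r32 + zo r33)
    a = (u0 * r3 - du * r1) / D          # L1, L2, L3
    b = (v0 * r3 - dv * r2) / D          # L5, L6, L7
    return np.concatenate([a, [-(a @ cam_pos)], b, [-(b @ cam_pos)], r3 / D])

def project(L, point):
    """Image coordinates [u, v] from the 11-parameter DLT equation [8]."""
    x, y, z = point
    R = L[8] * x + L[9] * y + L[10] * z + 1.0
    u = (L[0] * x + L[1] * y + L[2] * z + L[3]) / R
    v = (L[4] * x + L[5] * y + L[6] * z + L[7]) / R
    return np.array([u, v])

def project_collinear(T, cam_pos, u0, v0, du, dv, point):
    """The same mapping taken directly from the collinearity form [7]."""
    a = np.asarray(T, dtype=float) @ (np.asarray(point, dtype=float)
                                      - np.asarray(cam_pos, dtype=float))
    return np.array([u0 - du * a[0] / a[2], v0 - dv * a[1] / a[2]])

def lens_error(u, v, u0, v0, L12, L13, L14, L15, L16):
    """Optical errors (delta_u, delta_v) from [11] and [12]."""
    xi, eta = u - u0, v - v0
    r2 = xi**2 + eta**2
    radial = L12 * r2 + L13 * r2**2 + L14 * r2**3
    delta_u = xi * radial + L15 * (r2 + 2 * xi**2) + L16 * xi * eta
    delta_v = eta * radial + L15 * eta * xi + L16 * (r2 + 2 * eta**2)
    return delta_u, delta_v

# Hypothetical geometry: rotation of 0.3 rad about Z, camera at [1, -2, 5].
c, s = np.cos(0.3), np.sin(0.3)
T_example = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
cam_example = np.array([1.0, -2.0, 5.0])
L_example = dlt_params(T_example, cam_example, 320.0, 240.0, 800.0, 820.0)
```

For any object point not lying in the plane through N parallel to the image plane, `project(L_example, p)` and `project_collinear(...)` should agree to rounding error, which is a quick way to confirm that [8] and [9] are consistent with [7].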
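There are two different ways to use [10] in the 3D DLT method: camera calibration and raw coordinate computation (reconstruction).

Camera Calibration

Rearrange [10] to obtain

$$\begin{aligned} \frac{u}{R} &= \frac{1}{R}\,(L_1 x + L_2 y + L_3 z + L_4 - L_9 u x - L_{10} u y - L_{11} u z) + \Delta u \\ \frac{v}{R} &= \frac{1}{R}\,(L_5 x + L_6 y + L_7 z + L_8 - L_9 v x - L_{10} v y - L_{11} v z) + \Delta v, \end{aligned} \qquad [13]$$

where

$$R = L_9 x + L_{10} y + L_{11} z + 1. \qquad [14]$$

[13] is equivalent to

$$\left[\begin{array}{cccccccccccccccc} \frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & 0 & 0 & 0 & 0 & -\frac{ux}{R} & -\frac{uy}{R} & -\frac{uz}{R} & \xi r^2 & \xi r^4 & \xi r^6 & r^2 + 2\xi^2 & \xi\eta \\ 0 & 0 & 0 & 0 & \frac{x}{R} & \frac{y}{R} & \frac{z}{R} & \frac{1}{R} & -\frac{vx}{R} & -\frac{vy}{R} & -\frac{vz}{R} & \eta r^2 & \eta r^4 & \eta r^6 & \eta\xi & r^2 + 2\eta^2 \end{array}\right] \begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix} = \begin{bmatrix} \frac{u}{R} \\ \frac{v}{R} \end{bmatrix}. \qquad [15]$$

Expand [15] for n control points:

$$\left[\begin{array}{cccccccccccccccc} \frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{z_1}{R_1} & \frac{1}{R_1} & 0 & 0 & 0 & 0 & -\frac{u_1 x_1}{R_1} & -\frac{u_1 y_1}{R_1} & -\frac{u_1 z_1}{R_1} & \xi_1 r_1^2 & \xi_1 r_1^4 & \xi_1 r_1^6 & r_1^2 + 2\xi_1^2 & \xi_1\eta_1 \\ 0 & 0 & 0 & 0 & \frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{z_1}{R_1} & \frac{1}{R_1} & -\frac{v_1 x_1}{R_1} & -\frac{v_1 y_1}{R_1} & -\frac{v_1 z_1}{R_1} & \eta_1 r_1^2 & \eta_1 r_1^4 & \eta_1 r_1^6 & \eta_1\xi_1 & r_1^2 + 2\eta_1^2 \\ \vdots & & & & & & & & & & & & & & & \vdots \\ \frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{z_n}{R_n} & \frac{1}{R_n} & 0 & 0 & 0 & 0 & -\frac{u_n x_n}{R_n} & -\frac{u_n y_n}{R_n} & -\frac{u_n z_n}{R_n} & \xi_n r_n^2 & \xi_n r_n^4 & \xi_n r_n^6 & r_n^2 + 2\xi_n^2 & \xi_n\eta_n \\ 0 & 0 & 0 & 0 & \frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{z_n}{R_n} & \frac{1}{R_n} & -\frac{v_n x_n}{R_n} & -\frac{v_n y_n}{R_n} & -\frac{v_n z_n}{R_n} & \eta_n r_n^2 & \eta_n r_n^4 & \eta_n r_n^6 & \eta_n\xi_n & r_n^2 + 2\eta_n^2 \end{array}\right] \begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_{16} \end{bmatrix} = \begin{bmatrix} \frac{u_1}{R_1} \\ \frac{v_1}{R_1} \\ \vdots \\ \frac{u_n}{R_n} \\ \frac{v_n}{R_n} \end{bmatrix}. \qquad [16]$$

In [16], it is assumed that the object-space coordinates, [x_{i}, y_{i}, z_{i}], are all known. A group of control points whose x, y & z coordinates are already known must be employed for this. The control points must not be coplanar. In other words, the control points must form a volume, the control volume. The control points are typically fixed to a calibration frame or control object. If fewer than 16 parameters are used, discard the unused columns of the coefficient matrix and the corresponding rows of the parameter vector in [16]. Feasible choices are 11, 12, 14, and 16 parameters. Note also that the coefficient matrix in [16] requires R_{i}, which is a function of L_{9} - L_{11}. This system cannot be solved directly, and an iterative approach must be used: L_{9} - L_{11} obtained from the previous iteration are used in computing R_{i} in the current iteration. [16] is basically in the form of

$$\mathbf{X}\,\mathbf{L} = \mathbf{Y}, \qquad [17]$$

where X = the coefficient matrix, L = the parameter vector, and Y = the right-hand-side vector of [16]. The DLT parameters can be obtained using the least square method:

$$\mathbf{L} = (\mathbf{X}^{T}\mathbf{X})^{-1}\,\mathbf{X}^{T}\,\mathbf{Y}. \qquad [18]$$

See the Least Square Method page for details of the least-square approach. To obtain the DLT parameters and the additional parameters using the least square method, [16] must be overdetermined (number of equations > number of unknowns). Since each control point provides 2 equations, the minimum number of control points required is the smallest n that satisfies 2n > number of parameters:

No. of parameters             11   12   14   16
Min. no. of control points     6    7    8    9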
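The iterative calibration described above can be sketched as follows for the 11-parameter case (columns 12 to 16 of [16] discarded). This is a minimal sketch, assuming NumPy; `calibrate_dlt11`, `min_control_points`, and the fixed iteration count `n_iter` are hypothetical names and choices. It starts with R_{i} = 1 and updates R_{i} from the previous estimate of L_{9} - L_{11}. Instead of forming the inverse in [18] explicitly, it calls `np.linalg.lstsq`, which solves the same least-squares problem.

```python
import numpy as np

def min_control_points(n_params):
    """Smallest n with 2n > n_params (overdetermined system)."""
    return n_params // 2 + 1

def calibrate_dlt11(obj_pts, img_pts, n_iter=5):
    """Estimate L1..L11 from control points via [16] to [18].

    obj_pts: (n, 3) known object-space coordinates [x, y, z]
    img_pts: (n, 2) digitized image-plane coordinates [u, v]
    """
    obj = np.asarray(obj_pts, dtype=float)
    img = np.asarray(img_pts, dtype=float)
    n = len(obj)
    if n < min_control_points(11):
        raise ValueError("at least 6 control points are needed for 11 parameters")

    R = np.ones(n)                       # first pass: R_i = 1
    for _ in range(n_iter):
        X = np.zeros((2 * n, 11))
        Y = np.zeros(2 * n)
        for i, ((x, y, z), (u, v)) in enumerate(zip(obj, img)):
            w = 1.0 / R[i]
            X[2 * i, 0:4] = w * np.array([x, y, z, 1.0])
            X[2 * i, 8:11] = -w * u * np.array([x, y, z])
            X[2 * i + 1, 4:8] = w * np.array([x, y, z, 1.0])
            X[2 * i + 1, 8:11] = -w * v * np.array([x, y, z])
            Y[2 * i], Y[2 * i + 1] = w * u, w * v
        # Least-squares solution of X L = Y, equivalent to [18].
        L, *_ = np.linalg.lstsq(X, Y, rcond=None)
        R = obj @ L[8:11] + 1.0          # R_i for the next iteration, from [14]
    return L
```

A fixed number of iterations keeps the sketch short; a convergence test on the change in L between iterations would be a reasonable alternative.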
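Reconstruction

Rearrange [10] for x, y & z:

$$\begin{aligned} (L_1 - L_9\alpha)\,x + (L_2 - L_{10}\alpha)\,y + (L_3 - L_{11}\alpha)\,z &= \alpha - L_4 \\ (L_5 - L_9\beta)\,x + (L_6 - L_{10}\beta)\,y + (L_7 - L_{11}\beta)\,z &= \beta - L_8, \end{aligned} \qquad [19]$$

where

$$[\alpha, \beta] = [u - \Delta u,\; v - \Delta v]. \qquad [20]$$

[19] is equivalent to the matrix expression

$$\begin{bmatrix} L_1 - L_9\alpha & L_2 - L_{10}\alpha & L_3 - L_{11}\alpha \\ L_5 - L_9\beta & L_6 - L_{10}\beta & L_7 - L_{11}\beta \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \alpha - L_4 \\ \beta - L_8 \end{bmatrix}. \qquad [21]$$

Expand [21] for m cameras:

$$\begin{bmatrix} L_1^{(1)} - L_9^{(1)}\alpha^{(1)} & L_2^{(1)} - L_{10}^{(1)}\alpha^{(1)} & L_3^{(1)} - L_{11}^{(1)}\alpha^{(1)} \\ L_5^{(1)} - L_9^{(1)}\beta^{(1)} & L_6^{(1)} - L_{10}^{(1)}\beta^{(1)} & L_7^{(1)} - L_{11}^{(1)}\beta^{(1)} \\ \vdots & \vdots & \vdots \\ L_1^{(m)} - L_9^{(m)}\alpha^{(m)} & L_2^{(m)} - L_{10}^{(m)}\alpha^{(m)} & L_3^{(m)} - L_{11}^{(m)}\alpha^{(m)} \\ L_5^{(m)} - L_9^{(m)}\beta^{(m)} & L_6^{(m)} - L_{10}^{(m)}\beta^{(m)} & L_7^{(m)} - L_{11}^{(m)}\beta^{(m)} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \alpha^{(1)} - L_4^{(1)} \\ \beta^{(1)} - L_8^{(1)} \\ \vdots \\ \alpha^{(m)} - L_4^{(m)} \\ \beta^{(m)} - L_8^{(m)} \end{bmatrix}. \qquad [22]$$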
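The stacked system for m cameras can be solved for [x, y, z] as in the sketch below. It assumes NumPy, already-calibrated 11-parameter vectors for each camera, and image coordinates that have already been corrected for optical errors (that is, α and β). The function name `reconstruct_3d` is hypothetical.

```python
import numpy as np

def reconstruct_3d(Ls, alphabetas):
    """Least-squares object-space point from m >= 2 cameras.

    Ls:         list of m parameter vectors (L1..L11 of each camera)
    alphabetas: list of m pairs (alpha, beta) = (u - delta_u, v - delta_v)
    """
    rows, rhs = [], []
    for L, (a, b) in zip(Ls, alphabetas):
        # Two rows per camera: the U-equation and the V-equation.
        rows.append([L[0] - L[8] * a, L[1] - L[9] * a, L[2] - L[10] * a])
        rhs.append(a - L[3])
        rows.append([L[4] - L[8] * b, L[5] - L[9] * b, L[6] - L[10] * b])
        rhs.append(b - L[7])
    xyz, *_ = np.linalg.lstsq(np.array(rows, dtype=float),
                              np.array(rhs, dtype=float), rcond=None)
    return xyz
```

With two cameras this gives 4 equations for 3 unknowns, so the system is already overdetermined; additional cameras simply add two more rows each.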
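In [22], the superscript (i) denotes camera i:

$$[\alpha^{(i)}, \beta^{(i)}] = [u^{(i)} - \Delta u^{(i)},\; v^{(i)} - \Delta v^{(i)}], \qquad [23]$$

and L_{1}^{(i)} to L_{11}^{(i)} are the DLT parameters of camera i (i = 1, ..., m). Again, the least square method described in [17] and [18] can be used in computing the 3D coordinates of the markers on the subject's body.

Camera Position and the Principal Point

From [9]:

$$\begin{aligned} L_1 x_o + L_2 y_o + L_3 z_o &= -L_4 \\ L_5 x_o + L_6 y_o + L_7 z_o &= -L_8 \\ L_9 x_o + L_{10} y_o + L_{11} z_o &= -1, \end{aligned} \qquad [24]$$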
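or

$$\begin{bmatrix} x_o \\ y_o \\ z_o \end{bmatrix} = \begin{bmatrix} L_1 & L_2 & L_3 \\ L_5 & L_6 & L_7 \\ L_9 & L_{10} & L_{11} \end{bmatrix}^{-1} \begin{bmatrix} -L_4 \\ -L_8 \\ -1 \end{bmatrix}. \qquad [25]$$

Similarly, from [9]:

$$L_9^2 + L_{10}^2 + L_{11}^2 = \frac{r_{31}^2 + r_{32}^2 + r_{33}^2}{D^2} = \frac{1}{D^2}, \qquad D^2 = \frac{1}{L_9^2 + L_{10}^2 + L_{11}^2}, \qquad [26]$$

and

$$u_o = D^2\,(L_1 L_9 + L_2 L_{10} + L_3 L_{11}), \qquad v_o = D^2\,(L_5 L_9 + L_6 L_{10} + L_7 L_{11}). \qquad [27]$$

Both [26] and [27] are based on the orthogonality of the transformation matrix T_{I/O}:

$$r_{i1} r_{j1} + r_{i2} r_{j2} + r_{i3} r_{j3} = \begin{cases} 1, & i = j \\ 0, & i \neq j. \end{cases} \qquad [28]$$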
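Equations [25] to [27] translate directly into a few lines of code. The sketch below assumes NumPy and an 11-parameter vector ordered L_{1} to L_{11}. The function names `camera_position` and `principal_point` are hypothetical.

```python
import numpy as np

def camera_position(L):
    """Projection center [xo, yo, zo] in the object-space frame, from [25]."""
    M = np.array([[L[0], L[1], L[2]],
                  [L[4], L[5], L[6]],
                  [L[8], L[9], L[10]]], dtype=float)
    # Solve M [xo, yo, zo]^T = [-L4, -L8, -1]^T rather than inverting M.
    return np.linalg.solve(M, -np.array([L[3], L[7], 1.0]))

def principal_point(L):
    """Principal point (uo, vo) in digitizer units, from [26] and [27]."""
    D2 = 1.0 / (L[8]**2 + L[9]**2 + L[10]**2)
    u0 = D2 * (L[0] * L[8] + L[1] * L[9] + L[2] * L[10])
    v0 = D2 * (L[4] * L[8] + L[5] * L[9] + L[6] * L[10])
    return u0, v0
```

Because [27] relies on orthogonality, parameters estimated from noisy data will give a principal point that is only approximately correct.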
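See the Transformation Matrix page for details of the orthogonality.

Scale Factors and the Transformation Matrix

From [9]:

$$\mathbf{T}_{I/O} = D \begin{bmatrix} \dfrac{u_o L_9 - L_1}{d_u} & \dfrac{u_o L_{10} - L_2}{d_u} & \dfrac{u_o L_{11} - L_3}{d_u} \\ \dfrac{v_o L_9 - L_5}{d_v} & \dfrac{v_o L_{10} - L_6}{d_v} & \dfrac{v_o L_{11} - L_7}{d_v} \\ L_9 & L_{10} & L_{11} \end{bmatrix}. \qquad [29]$$

To obtain the transformation matrix in [29], d_{u} and d_{v} must be computed. From [29] and [28]:

$$\begin{aligned} d_u^2 &= D^2\,[\,(u_o L_9 - L_1)^2 + (u_o L_{10} - L_2)^2 + (u_o L_{11} - L_3)^2\,] \\ d_v^2 &= D^2\,[\,(v_o L_9 - L_5)^2 + (v_o L_{10} - L_6)^2 + (v_o L_{11} - L_7)^2\,]. \end{aligned} \qquad [30]$$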
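A minimal sketch of [26], [27], [29], and [30] follows. It assumes NumPy, takes d_{u} and d_{v} as positive, starts from the positive root of D, and settles the sign of D by requiring a right-handed transformation matrix (positive determinant). The function name `transformation_matrix` and its return layout are hypothetical.

```python
import numpy as np

def transformation_matrix(L):
    """Recover T_{I/O}, du, dv and D from L1..L11.

    Returns (T, du, dv, D); du and dv are taken as positive.
    """
    L = np.asarray(L, dtype=float)
    l3 = L[8:11]                          # [L9, L10, L11]
    D = 1.0 / np.linalg.norm(l3)          # positive root of [26]
    u0 = D**2 * (L[0:3] @ l3)             # [27]
    v0 = D**2 * (L[4:7] @ l3)
    a = u0 * l3 - L[0:3]                  # [uo L9 - L1, uo L10 - L2, uo L11 - L3]
    b = v0 * l3 - L[4:7]
    du = D * np.linalg.norm(a)            # [30]
    dv = D * np.linalg.norm(b)
    T = D * np.vstack([a / du, b / dv, l3])   # [29] with D > 0
    if np.linalg.det(T) < 0:              # left-handed: D must be negative
        D, T = -D, -T
    return T, du, dv, D
```

With error-free parameters the returned matrix is orthogonal; with parameters estimated from real data it will generally be only approximately so.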
It is safe to assume d_{u} > 0 and d_{v} > 0 in [30]. However, D can be either positive or negative, as shown in [26]. Use the positive value in [29] first and compute the determinant of the transformation matrix obtained. If the determinant is positive (right-handed system), D must be positive and the current matrix is correct. If the determinant is negative (left-handed), D must be negative: multiply the matrix obtained previously by -1. Three Eulerian angles may be computed from the nine elements of the transformation matrix. See the Eulerian Angles page for details. Note that the DLT parameters computed using the least square method, [18], do not automatically guarantee an orthogonal transformation matrix because of experimental errors. This is an intrinsic problem of the DLT method. The Modified DLT method proposed by Hatze (1988) addresses this problem. See the Modified DLT page for details.

2D DLT Method

In the case of 2D analysis, the z-coordinate is always 0 and the mapping from the object-plane reference frame into the image-plane reference frame reduces to

$$u = \frac{L_1 x + L_2 y + L_3}{L_7 x + L_8 y + 1}, \qquad v = \frac{L_4 x + L_5 y + L_6}{L_7 x + L_8 y + 1}. \qquad [31]$$
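Apply [31] to n (n >= 4) control points and m (m >= 1) cameras:

$$\begin{bmatrix} \frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{1}{R_1} & 0 & 0 & 0 & -\frac{u_1 x_1}{R_1} & -\frac{u_1 y_1}{R_1} \\ 0 & 0 & 0 & \frac{x_1}{R_1} & \frac{y_1}{R_1} & \frac{1}{R_1} & -\frac{v_1 x_1}{R_1} & -\frac{v_1 y_1}{R_1} \\ \vdots & & & & & & & \vdots \\ \frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{1}{R_n} & 0 & 0 & 0 & -\frac{u_n x_n}{R_n} & -\frac{u_n y_n}{R_n} \\ 0 & 0 & 0 & \frac{x_n}{R_n} & \frac{y_n}{R_n} & \frac{1}{R_n} & -\frac{v_n x_n}{R_n} & -\frac{v_n y_n}{R_n} \end{bmatrix} \begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_8 \end{bmatrix} = \begin{bmatrix} \frac{u_1}{R_1} \\ \frac{v_1}{R_1} \\ \vdots \\ \frac{u_n}{R_n} \\ \frac{v_n}{R_n} \end{bmatrix} \qquad [32]$$

and

$$\begin{bmatrix} L_1^{(1)} - L_7^{(1)} u^{(1)} & L_2^{(1)} - L_8^{(1)} u^{(1)} \\ L_4^{(1)} - L_7^{(1)} v^{(1)} & L_5^{(1)} - L_8^{(1)} v^{(1)} \\ \vdots & \vdots \\ L_1^{(m)} - L_7^{(m)} u^{(m)} & L_2^{(m)} - L_8^{(m)} u^{(m)} \\ L_4^{(m)} - L_7^{(m)} v^{(m)} & L_5^{(m)} - L_8^{(m)} v^{(m)} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} u^{(1)} - L_3^{(1)} \\ v^{(1)} - L_6^{(1)} \\ \vdots \\ u^{(m)} - L_3^{(m)} \\ v^{(m)} - L_6^{(m)} \end{bmatrix}. \qquad [33]$$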
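The 2D calibration and reconstruction systems can be sketched in the same way as the 3D case. The example assumes NumPy and the 8-parameter mapping of [31], and uses the same iterative 1/R weighting as the 3D calibration (starting from R = 1). The function names `calibrate_dlt2d` and `reconstruct_2d` and the fixed iteration count are hypothetical.

```python
import numpy as np

def calibrate_dlt2d(obj_xy, img_uv, n_iter=5):
    """Estimate L1..L8 of the 2D DLT from n >= 4 control points."""
    obj = np.asarray(obj_xy, dtype=float)
    img = np.asarray(img_uv, dtype=float)
    n = len(obj)
    if n < 4:
        raise ValueError("at least 4 control points are needed")
    R = np.ones(n)                           # first pass: R_i = 1
    for _ in range(n_iter):
        X = np.zeros((2 * n, 8))
        Y = np.zeros(2 * n)
        for i, ((x, y), (u, v)) in enumerate(zip(obj, img)):
            w = 1.0 / R[i]
            X[2 * i] = w * np.array([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
            X[2 * i + 1] = w * np.array([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
            Y[2 * i], Y[2 * i + 1] = w * u, w * v
        L, *_ = np.linalg.lstsq(X, Y, rcond=None)
        R = obj @ L[6:8] + 1.0               # R_i = L7 x_i + L8 y_i + 1
    return L

def reconstruct_2d(Ls, uvs):
    """Object-plane point [x, y] from m >= 1 cameras."""
    rows, rhs = [], []
    for L, (u, v) in zip(Ls, uvs):
        rows.append([L[0] - L[6] * u, L[1] - L[7] * u])
        rhs.append(u - L[2])
        rows.append([L[3] - L[6] * v, L[4] - L[7] * v])
        rhs.append(v - L[5])
    xy, *_ = np.linalg.lstsq(np.array(rows, dtype=float),
                             np.array(rhs, dtype=float), rcond=None)
    return xy
```

A single camera already provides 2 equations for the 2 unknowns, which is why m >= 1 suffices in the 2D case.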
In [32],

$$R_i = L_7 x_i + L_8 y_i + 1. \qquad [34]$$

The object plane and the image plane do not have to be parallel. 2D DLT guarantees accurate plane-to-plane mapping regardless of the orientation of the planes. The control points must not be collinear and must form a plane.

Calibration Error

The accuracy of camera calibration and reconstruction can be assessed by computing the calibration error and/or the reconstruction error. The calibration error of a given camera is defined as [35] (equation not recovered). The DLT and additional parameters obtained through the calibration can be applied back to the control points for the computation of their reconstructed coordinates. The reconstruction error is the deviation of the reconstructed coordinates of the control points from the measured coordinates, [36] (equation not recovered). A self-extrapolation scheme can be employed to improve the reliability of the calibration/reconstruction error (Kwon, 1989). Only half of the control points are used in the computation of the parameters, while all points are used in the reconstruction. The reconstruction error computed in this way is more reliable and better reflects the actual object-space deformation error. The minimum number of control points required for self-extrapolation depends on the number of parameters. Since only half of the points enter the calibration, it is twice the smallest n that satisfies 2n > number of parameters:

No. of parameters                                    11   12   14   16
Min. no. of control points (self-extrapolation)      12   14   16   18
Experimental Issues

There are several important issues:
References & Related Literature

Abdel-Aziz, Y.I., & Karara, H.M. (1971). Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry. Proceedings of the Symposium on Close-Range Photogrammetry (pp. 1-18). Falls Church, VA: American Society of Photogrammetry.
Chen, L. (1985). A selection scheme for non-metric close-range photogrammetric systems. Unpublished Doctoral Dissertation, University of Illinois, Urbana-Champaign.
Chen, L., Armstrong, C.W., & Raftopoulos, D.D. (1994). An investigation on the accuracy of three-dimensional space reconstruction using the direct linear transformation technique. J. Biomech 27, 493-500.
Dapena, J., Harman, E.A., & Miller, J.A. (1982). Three-dimensional cinematography with control object of unknown shape. J. Biomech 15, 11-19.
Hatze, H. (1988). High-precision three-dimensional photogrammetric calibration and object space reconstruction using a modified DLT-approach. J. Biomech 21, 533-538.
Hinrichs, R.N., & McLean, S.P. (1995). NLT and extrapolated DLT: 3-D cinematography alternatives for enlarging the volume of calibration. J. Biomech 28, 1219-1224.
Kwon, Y.H. (1994). KWON3D Motion Analysis Package 2.1 User's Reference Manual. Anyang, Korea: V-TEK Corporation.
Kwon, Y.H. (1989). The effects of different control point conditions on the DLT calibration accuracy. Unpublished class project report, Pennsylvania State University.
Marzan, G.T., & Karara, H.M. (1975). A computer program for direct linear transformation solution of the collinearity condition, and some applications of it. Proceedings of the Symposium on Close-Range Photogrammetric Systems (pp. 420-476). Falls Church, VA: American Society of Photogrammetry.
Miller, N.R., Shapiro, R., & McLaughlin, T.M. (1980). A technique for obtaining spatial kinematic parameters of segments of biomechanical systems from cinematographic data. J. Biomech 13, 535-547.
Shapiro, R. (1978). Direct linear transformation method for three-dimensional cinematography. Res. Quart. 49, 197-205.
Walton, J.S. (1981). Close-range cine-photogrammetry: a generalized technique for quantifying gross human motion. Unpublished Ph.D. Dissertation, Pennsylvania State University, University Park.

© Young-Hoo Kwon, 1998