Proc. of the 2017 IEEE International Conference on Signal and Image Processing Applications (IEEE ICSIPA 2017), Malaysia, September 12-14, 2017
Perspective Projection for Decoding of QR Codes Posted on Cylinders
Kuen-Tsair Lay and Ming-Hao Zhou
Dept. of Electronic and Computer Engineering
National Taiwan Univ. of Science and Technology
Taipei, Taiwan, ROC
Email: laykt@mail.ntust.edu.tw
Abstract—Nowadays, QR (quick-response) codes have become part of our daily life. In many applications, QR codes are posted (i.e. pasted or printed) on cylinders, in which case the QR image captured by a camera is distorted. In this paper, we tackle the decoding of QR codes in such a situation. The proposed scheme is based on perspective projection (PP), which is specified by a camera matrix (CM), with the assistance of cross ratio (CR). The mathematics involved is neat, and the computation is fast. Experimental results show that the proposed scheme is effective, in the sense that many decoding attempts that failed without it became successful with its aid.
I. INTRODUCTION
Quick-Response (abbreviated as QR) codes have found pervasive applications in our daily life. The standards regarding QR codes are defined in [1]. When a QR code is posted on a plane, the scanned QR image becomes a quadrilateral and can be easily rectified back to the original square shape by a projective transformation, thanks to the fundamental theorem of projective geometry [2], [3]. When it is posted on a cylinder, however, the rectification of the associated QR image becomes more complicated. To deal with such a situation, a scheme called conic segmentation is adopted in [4], [5], and a scheme based on perspective projection is adopted in [6].
Similar to [6], our work in this paper adopts perspective projection (PP), which maps between the points of the QR image on the 2D (2-dimensional) plane and the points of the QR code on the 3D (3-dimensional) cylindrical surface. One constraint for the scheme in [6] to work, however, is that the midpoint of the top contour and the midpoint of the bottom contour of the QR image must be available. In practice, this constraint can be a severe problem because the QR image is distorted and there is no easy way to locate those two midpoints (in fact, [6] does not explicitly describe how to obtain them). In our work here, we utilize the corner points of the PDPs (position detection patterns) instead. By using the PDP corner positions together with a projective-geometry quantity called cross ratio (CR), the size of the QR code can be identified. Then, by fitting the QR code to a square QR template of this size, the 2D coordinate of any point in the QR code/template can be identified. This 2D coordinate can then be mapped into a 3D coordinate when the QR template is posted onto a cylinder (possibly with some angle of slanting).
The perspective projection is specified by a so-called camera matrix (CM), which is obtained from a set of corresponding feature-point pairs between the QR image and the QR code on the cylinder. Then, the camera matrix can be applied to map any point (in our application here, the center of each module) on the QR code to some point on the QR image. The color of this point (either black or white) is then taken as the color of the module. In this way, each module in the QR template can be rendered as either black or white. The completely rendered QR template can then be decoded by any standard QR decoder.
The rest of the paper is organized as follows. In Sec. II, we explain how to find the size of the QR code with the cross ratio. In Sec. III, we describe how to find the camera matrix, a process also referred to as the calibration of the camera matrix. In Sec. IV, the rectification of the QR image via perspective projection, with the calibrated camera matrix, is presented. Then, some experimental results are shown in Sec. V. Finally, Sec. VI concludes the paper.
II. FINDING THE SIZE OF QR CODE WITH CROSS RATIO
In this section, we explain how to find the size of the QR code, given its corresponding QR image, with the cross ratio. Cross ratio is a quantity defined in the theory of projective geometry. Let us refer to Fig. 1. There are four pairs of corresponding points on two lines, as captured by a camera (here regarded as a viewing-eye of infinitesimal size).
Fig. 1. Four pairs of points on two lines, as captured by a camera
The cross ratio of points A, B, C, D is defined as

$$ (ABCD) = \frac{\overrightarrow{AC} \cdot \overrightarrow{BD}}{\overrightarrow{AD} \cdot \overrightarrow{BC}}, \tag{1} $$

where $\overrightarrow{AC}$ denotes the distance from point A to point C (counting polarity), and the distances of other pairs of points are similarly defined. The cross ratio of points A′, B′, C′, D′ is defined in exactly the same way, and is denoted as (A′B′C′D′), as can be expected. The CR-invariance theorem, which is an important theorem in projective geometry, says that

$$ (ABCD) = (A'B'C'D'). \tag{2} $$

978-1-5090-5559-3/17/$31.00 ©2017 IEEE
Shown in Fig. 2 are a typical QR image (Fig. 2(a)) and its original QR code (Fig. 2(b)). Due to the specific structure of the PDP, the feature points marked as A′, B′, C′, and D′ in the QR image can be easily located. Their coordinates are then obtained. Then, of course, the distances $\overrightarrow{A'C'}$, $\overrightarrow{B'D'}$, $\overrightarrow{A'D'}$, and $\overrightarrow{B'C'}$ can be calculated. The value of the cross ratio (A′B′C′D′) is then calculated. For notational brevity, let us denote this value as γ′.
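As a rough illustration of how γ′ might be measured and used, the sketch below computes the cross ratio of four collinear image points and then applies the inversion of Eqs. (3) and (4) of this section, which follow from setting AB = CD = 7 modules and BC = x modules in Eq. (1). The function names `cross_ratio` and `estimate_size` are hypothetical and not taken from the paper, and the points are assumed to be exactly collinear.

```python
import math


def cross_ratio(a, b, c, d):
    """Cross ratio (ABCD) = (AC * BD) / (AD * BC) of four collinear 2-D points."""
    dx, dy = d[0] - a[0], d[1] - a[1]
    length = math.hypot(dx, dy)
    # Signed position of each point along the line from A to D ("counting polarity").
    ta, tb, tc, td = [((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length
                      for p in (a, b, c, d)]
    return ((tc - ta) * (td - tb)) / ((td - ta) * (tc - tb))


def estimate_size(gamma):
    """Recover x, the version estimate and the side length from gamma.

    With AB = CD = 7 and BC = x, Eq. (1) gives gamma = (7 + x)^2 / (x * (14 + x)).
    """
    x = 7.0 * (math.sqrt(gamma / (gamma - 1.0)) - 1.0)   # Eq. (3), positive root
    version = round((x - 3.0) / 4.0)                     # Eq. (4): 14 + x = 17 + 4v
    return x, version, 17 + 4 * version
```

For a version-7 code (45 modules per side), the template points sit at positions 0, 7, 38 and 45 along the edge, so γ′ = (38 · 38)/(45 · 31), and the sketch should return x close to 31, version 7 and size 45 for any perspective view of those four points.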
Fig. 2. A pair of QR code and QR image, with 4 pairs of feature points

According to the standard of QR codes [1], the distance $\overrightarrow{AB}$ and the distance $\overrightarrow{CD}$ are always 7 modules long, as highlighted in Fig. 2(b). Let us assume that the distance $\overrightarrow{BC}$ is $x$ modules long. Then, of course, $\overrightarrow{AC} = \overrightarrow{BD} = 7 + x$, and $\overrightarrow{AD} = 14 + x$. Plugging those values into Eq. (1) and setting it to be equal to γ′, we obtain a second-order polynomial equation wherein $x$ is the unknown. Solving this equation for $x$ and then keeping the positive solution only, we have

$$ x = 7 \times \left( \sqrt{\frac{\gamma'}{\gamma' - 1}} - 1 \right). \tag{3} $$

The smallest version for QR codes is 1, which corresponds to a size of 21×21 modules. Then, each time the version number increases by 1, each side of the QR code increases by 4 modules. Based on this fact, the knowledge of $x$ can be converted into the (estimated) version number $\hat{v}$ by

$$ \hat{v} = \mathrm{round}\left( \frac{x - 3}{4} \right), \tag{4} $$

where round(·) denotes the rounding-to-integer operation. Finally, the estimated size of the QR code is $\hat{s} = 17 + 4 \cdot \hat{v}$.

III. CALIBRATION OF CAMERA MATRIX

A perspective projection (PP) is specified with an associated camera matrix (CM). In this paper, we find the camera matrix via the matching of feature-point pairs. We refer to this process as CM calibration. To explain how it is performed, let us refer to Fig. 3, which shows 13 feature-point pairs between the QR image and the QR code on the cylinder.

Fig. 3. Feature-point pairs for CM calibration

In the QR image (i.e. Fig. 3(a)), the coordinates $(u_i, v_i)$ of the 13 feature points are obtained. In the meantime, the angle of slanting (denoted as θ) of the QR code can be estimated from the angles (with reference to the horizontal direction) of the lines that pass through the centers of the PDPs in Fig. 3(a). We have already obtained the size of the QR code (i.e. $\hat{s}$) in Sec. II. Now, imagine that we create a QR template of size $\hat{s} \times \hat{s}$. Let us pick the center of the template as the origin for an (x, y) coordinate system. The coordinates of the 13 feature points (denoted as $(x_i, y_i)$) on the template are then specified (e.g. point 13 would be $(-\hat{s}/2, -\hat{s}/2)$, point 2 would be $(-\hat{s}/2 + 7, \hat{s}/2)$, and so on). Next imagine that we rotate the QR template by an angle of θ. After the rotation, every feature point has a new coordinate (denoted as $(x'_i, y'_i)$), which is related to the old coordinate $(x_i, y_i)$ via

$$ \begin{bmatrix} x'_i \\ y'_i \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x_i \\ y_i \end{bmatrix}. \tag{5} $$

Next let us try to paste this rotated QR template onto an imaginary cylinder, whose radius is denoted as $R$. Our goal is to find the 3-D coordinates $(X_i, Y_i, Z_i)$ (also referred to as world coordinates) of the 13 feature points. To make this feasible, we actually need the value of $R$, for which no estimation method has been proposed yet. Fortunately, as verified through our experiments, the value of $R$ is not a critical issue with regard to the eventual rate of decoding success. In practice, therefore, we simply set $R$ to an arbitrary value and then the conversion from the 2-D QR template coordinate $(x'_i, y'_i)$ to the world coordinate $(X_i, Y_i, Z_i)$ can be carried out. The conversion formula is

$$ \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} = \begin{bmatrix} R \cdot \sin(x'_i / R) \\ y'_i \\ -R \cdot \cos(x'_i / R) \end{bmatrix}. \tag{6} $$

By adopting the homogeneous coordinates from projective geometry [2], [7], a perspective projection is expressed in the form of Eq. (7), wherein the 3 × 4 matrix is called the camera matrix.

$$ c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}. \tag{7} $$
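As a minimal sketch of how Eqs. (5) to (7) chain together, the snippet below rotates a template point by the slanting angle, wraps it onto a cylinder whose axis is the Y axis, and projects the result with a 3 × 4 camera matrix. The helper names `template_to_world` and `project`, and the representation of the camera matrix as a list of three rows, are assumptions made for illustration only.

```python
import math


def template_to_world(x, y, theta, radius):
    """Eq. (5) followed by Eq. (6): rotate a template point, then wrap it onto the cylinder."""
    xr = math.cos(theta) * x - math.sin(theta) * y
    yr = math.sin(theta) * x + math.cos(theta) * y
    # The arc length xr along the surface corresponds to the angle xr / radius.
    return (radius * math.sin(xr / radius), yr, -radius * math.cos(xr / radius))


def project(cam, world):
    """Eq. (7): apply the camera matrix and divide by the homogeneous scale c."""
    X, Y, Z = world
    h = [row[0] * X + row[1] * Y + row[2] * Z + row[3] for row in cam]
    return (h[0] / h[2], h[1] / h[2])
```

Note that the template origin always maps to the world point (0, 0, −R), so the QR code faces the −Z direction in this coordinate system.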
Since a camera matrix can be scaled up or down by any nonzero factor, which is a fact from projective geometry [2], [7], we just arbitrarily set m34 to 1, and therefore there are 11 parameters remaining to be determined. The factor c in Eq. (7) is the scale of the homogeneous image coordinates; from the third row of Eq. (7), c = m31 X + m32 Y + m33 Z + 1. After c is eliminated, each (u, v) to (X, Y, Z) correspondence described by Eq. (7) can be rearranged into two linear equations. Stacking those equations for n pairs of (u, v) to (X, Y, Z) correspondences gives Eq. (8).
$$
\begin{bmatrix}
X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 X_1 & -u_1 Y_1 & -u_1 Z_1 \\
0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1 X_1 & -v_1 Y_1 & -v_1 Z_1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_n X_n & -u_n Y_n & -u_n Z_n \\
0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_n X_n & -v_n Y_n & -v_n Z_n
\end{bmatrix}
\begin{bmatrix} m_{11} \\ m_{12} \\ m_{13} \\ \vdots \\ m_{32} \\ m_{33} \end{bmatrix}
=
\begin{bmatrix} u_1 \\ v_1 \\ \vdots \\ u_n \\ v_n \end{bmatrix}. \tag{8}
$$

Recall that there are 13 pairs of corresponding feature points, as shown in Fig. 3. By plugging those 13 pairs of $(u_i, v_i)$ to $(X_i, Y_i, Z_i)$ correspondences into Eq. (8), we obtain an over-determined system of linear equations, wherein there are 11 unknowns and 26 equations. Those unknown parameters (i.e. $m_{pq}$) are then obtained as the least squared error (LSE) solution. With those parameters, the camera matrix is completely specified. In other words, the CM calibration is completed.
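One possible way to carry out this least-squares step is sketched below with NumPy. It assembles the rows of Eq. (8) from the correspondences and solves for the 11 unknowns with `numpy.linalg.lstsq`; the paper does not state which solver was used, so this choice, along with the function names `calibrate_camera_matrix` and `project`, is an assumption.

```python
import numpy as np


def calibrate_camera_matrix(world_pts, image_pts):
    """Least-squares solution of Eq. (8), with m34 fixed to 1."""
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        # Two linear equations per correspondence (one for u, one for v).
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        rhs.extend([u, v])
    A = np.asarray(rows, dtype=float)
    b = np.asarray(rhs, dtype=float)
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append m34 = 1 and reshape the 12 entries into the 3 x 4 camera matrix.
    return np.append(m, 1.0).reshape(3, 4)


def project(cam, world_pts):
    """Eq. (7) for many points: returns an (n, 2) array of (u, v) image coordinates."""
    pts = np.asarray(world_pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ cam.T
    return homog[:, :2] / homog[:, 2:3]
```

With noise-free correspondences in a non-degenerate configuration, the recovered matrix should reproduce the true one up to numerical precision; with measured feature points, the residual of the fit gives a rough indication of how well the cylinder model matches the image.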
IV. RECTIFICATION BY PERSPECTIVE PROJECTION
The process of QR image rectification is depicted in Fig. 4. Shown in the bottom-right area is the QR template. For each module of the template, the 2-D coordinate of its center is mapped to a 3-D world coordinate on the cylinder, in exactly the same way as the feature points are handled in Sec. III. Then, according to Eq. (7) and with the camera matrix found in Sec. III, this 3-D world coordinate is mapped to a point in the QR image. This point, which resides in some pixel, is either black or white. This color is then taken as the color of the corresponding module in the QR template. In other words, the module is rendered according to this color.
Fig. 4. Rendering the QR template, module by module
When this color-rendering process is completed for each module, we have the rectified QR image. Then, any standard QR decoder can be utilized to decode its content.
V. EXPERIMENTAL RESULTS
Simulated QR images that represent QR codes of various versions posted on cylinders with various slanting angles were generated. Two typical sets of such QR images are shown in Fig. 5 (version 7) and Fig. 6 (version 9). When they were directly scanned by standard QR decoders, most decoding results were failures. With the aid of the proposed scheme, however, most decoding attempts became successful (denoted as ✓ in Fig. 5 and Fig. 6).
Comparing Fig. 5 and Fig. 6, we see that the higher-version QR codes are more difficult to decode, which can be expected because of the increase in the number of modules and the decrease in the size of each individual module. Also notice that the failed decodings occur only in the scenario wherein the slanting angles are big, which can be easily avoided in practical applications, simply by adjusting the position and tilting of the scanning camera when the QR image is being taken.
Fig. 5. Typical experimental results, version 7 QR codes
Fig. 6. Typical experimental results, version 9 QR codes
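The module-by-module rendering of Sec. IV, which produces the rectified templates that are passed to the decoder, can be sketched roughly as follows. Several details here are assumptions rather than statements from the paper: the image is a grayscale list of pixel rows, a pixel darker than a threshold of 128 counts as black, the pixel containing the projected point is found by flooring its coordinates, and the template y axis points upward. The function name `render_template` is hypothetical.

```python
import math


def render_template(image, cam, size, theta, radius, threshold=128):
    """Render a size x size template: 1 marks a black module, 0 a white one."""
    height, width = len(image), len(image[0])
    template = [[0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            # Module center in template coordinates (origin at the template center).
            x = col + 0.5 - size / 2.0
            y = size / 2.0 - (row + 0.5)
            # Eq. (5): rotation by the slanting angle.
            xr = math.cos(theta) * x - math.sin(theta) * y
            yr = math.sin(theta) * x + math.cos(theta) * y
            # Eq. (6): wrap onto the cylinder (homogeneous world coordinate).
            world = (radius * math.sin(xr / radius), yr,
                     -radius * math.cos(xr / radius), 1.0)
            # Eq. (7): project and divide by the scale c.
            h = [sum(m * w for m, w in zip(cam_row, world)) for cam_row in cam]
            u, v = h[0] / h[2], h[1] / h[2]
            # Look up the pixel containing (u, v); points outside the image stay white.
            px, py = math.floor(u), math.floor(v)
            if 0 <= px < width and 0 <= py < height:
                template[row][col] = 1 if image[py][px] < threshold else 0
    return template
```

The returned grid can then be drawn as a flat black-and-white image and handed to any standard QR decoder.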
VI. CONCLUSION
In this paper, we investigate the decoding of QR codes posted on cylinders. The decoding scheme is based on perspective projection (PP) with the assistance of cross ratio (CR). The cross ratio enables us to find the size of the QR code. The camera matrix is calibrated via matching 13 pairs of feature points. Then, the calibrated camera matrix enables us to map any point/cell in the QR template to a point/pixel in the QR image. The cell in the QR template is then rendered according to the color of this QR pixel. The mathematics involved in this PP-based rectification scheme is neat, and the computation is fast. Experimental results show that the proposed scheme is effective, as long as the camera’s slanting angle is not very big.
In this paper, 13 pairs of corresponding feature points are adopted. In the future, the utilization of more feature-point pairs, especially some points from the middle area of the QR code, will be investigated. Moreover, the extension of the proposed perspective projection scheme from QR codes posted on cylinders to QR codes posted on other types of curved surfaces will also be investigated.
ACKNOWLEDGMENT
This work is partially supported by MOST (Ministry of Science and Technology) of ROC, under Grant No. MOST 104-2221-E-011-068
REFERENCES
[1] “Information technology - Automatic identification and data capture techniques - Bar code symbology - QR Code,” ISO/IEC 18004 International Standard, June 2000.
[2] D. A. Brannan, M. F. Esplen, and J. J. Gray, Geometry, Chaps. 3-4, Cambridge Univ. Press, Cambridge, UK, 1999.
[3] J. A. Lin and C. S. Fuh, “2D Barcode Image Decoding,” Math. Problems in Eng., Vol 2013, Article ID 848276, Hindawi Publishing Corporation, http://dx.doi.org/10.1155/2013/848276
[4] K. T. Lay, L. J. Wang, P. L. Han, and Y. S. Lin, “Rectification of Images of QR Codes Posted on Cylinders by Conic Segmentation,” Proc. 2015 IEEE Int. Conf. on Signal and Image Processing Applications (ICSIPA 2015), Kuala Lumpur, Malaysia, Oct. 2015.
[5] K. T. Lay, Y. J. Chen, H. C. Hsueh, and S. G. Karungaru, “Visually Comprehensible QR Codes via Embedding of Big Logos”, Proc. 2016 IEEE Int. Conf. Signal and Image Processing (ICSIP 2016), pp. 225-232, Aug., 2016.
[6] X. Li, Z. Shi, D. Guo, and S. He, “Reconstruct Algorithm of 2D Barcode for Reading the QR Code on Cylindrical Surface,” Proc. 2013 IEEE Int. Conf. Anti-Counterfeiting, Security and Identification (ASID 2013), pp. 1-5, Oct. 2013.
[7] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis, and Machine Vision, 2nd ed., PWS Publishing, 1999.