2 Fundamentals
Preview
As mentioned in the previous chapter, the power that MATLAB brings to digital image processing is an extensive set of functions for processing multidimensional arrays, of which images (two-dimensional numerical arrays) are a special case. The Image Processing Toolbox is a collection of functions that extend the capability of the MATLAB numeric computing environment. These functions, and the expressiveness of the MATLAB language, make image-processing operations easy to write in a compact, clear manner, thus providing an ideal software prototyping environment for the solution of image processing problems. In this chapter we introduce the basics of MATLAB notation, discuss a number of fundamental toolbox properties and functions, and begin a discussion of programming concepts. Thus, the material in this chapter is the foundation for most of the software-related discussions in the remainder of the book.
2.1 Digital Image Representation
An image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates is called the intensity of the image at that point. The term gray level is used often to refer to the intensity of monochrome images. Color images are formed by a combination of individual images. For example, in the RGB color system a color image consists of three individual monochrome images, referred to as the red (R), green (G), and blue (B) primary (or component) images. For this reason, many of the techniques developed for monochrome images can be extended to color images by processing the three component images individually. Color image processing is the topic of Chapter 7. An image may be continuous
with respect to the x- and y-coordinates, and also in amplitude. Converting such an image to digital form requires that the coordinates, as well as the amplitude, be digitized. Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization. Thus, when x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image.
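The two digitizing steps just described can be sketched in a few lines of MATLAB. This is a minimal illustration only: the function handle fc, the grid size, and the choice of 256 gray levels are assumptions made for the example, not values taken from the text.

```matlab
% Sketch: sample a continuous intensity function, then quantize it.
% fc, M, N, and the 8-bit depth are illustrative assumptions.
fc = @(x, y) 0.5 + 0.5 * cos(2*pi*x) .* sin(2*pi*y);  % "continuous" image, values in [0, 1]

M = 4;  N = 6;                      % samples along x (rows) and y (columns)
x = linspace(0, 1, M);              % sampling: discrete x-coordinates
y = linspace(0, 1, N);              % sampling: discrete y-coordinates
[Y, X] = meshgrid(y, x);            % X and Y are M-by-N coordinate grids

fs = fc(X, Y);                      % sampled; amplitudes are still real-valued
fq = uint8(round(255 * fs));        % quantization: 256 discrete gray levels

disp(size(fq))                      % row and column counts of the digital image
disp(class(fq))                     % data class of the quantized result
```

After both steps, fq holds a finite number of samples, each taking one of a finite number of amplitude values, which is what the text calls a digital image.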
2.1.1 Coordinate Conventions
The result of sampling and quantization is a matrix of real numbers. We use two principal ways in this book to represent digital images. Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns. We say that the image is of size M × N. The values of the coordinates are discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates. In many image processing books, the image origin is defined to be at (x, y) = (0, 0). The next coordinate values along the first row of the image are (x, y) = (0, 1). The notation (0, 1) is used to signify the second sample along the first row. It does not mean that these are the actual values of physical coordinates when the image was sampled. Figure 2.1(a) shows this coordinate convention. Note that x ranges from 0 to M - 1 and y from 0 to N - 1 in integer increments.
The coordinate convention used in the Image Processing Toolbox to denote arrays differs from that of the preceding paragraph in two minor ways. First, instead of using (x, y), the toolbox uses the notation (r, c) to indicate rows and columns. Note, however, that the order of coordinates is the same as the order discussed in the previous paragraph, in the sense that the first element of a coordinate tuple, (a, b), refers to a row and the second to a column. The other difference is that the origin of the coordinate system is at (r, c) = (1, 1); thus, r ranges from 1 to M, and c from 1 to N, in integer increments. Figure 2.1(b) illustrates this coordinate convention.
Image Processing Toolbox documentation refers to the coordinates in Fig. 2.1(b) as pixel coordinates. Less frequently, the toolbox also employs another coordinate convention, called spatial coordinates, that uses x to refer to columns and y to refer to rows. This is the opposite of our use of variables x and y. With a few exceptions, we do not use the toolbox’s spatial coordinate convention in this book, but many MATLAB functions do, and you will definitely encounter it in toolbox and MATLAB documentation.
FIGURE 2.1 Coordinate conventions used (a) in many image processing books, and (b) in the Image Processing Toolbox. [Figure: two pixel grids, each marking the origin and one pixel; (a) axes x and y with ticks 0, 1, 2, ..., M - 1 and 0, 1, 2, ..., N - 1; (b) axes r and c with ticks 1, 2, 3, ..., M and 1, 2, 3, ..., N.]
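The shift between the zero-based (x, y) convention and the one-based (r, c) convention amounts to adding 1 to each coordinate. The following sketch shows this on a small test array; the array contents and the variable names are illustrative assumptions.

```matlab
% Sketch: map zero-based book coordinates (x, y) to one-based
% toolbox coordinates (r, c). The test array is arbitrary.
M = 3;  N = 4;
f = reshape(1:M*N, M, N);           % small M-by-N test "image"

x = 0;  y = 1;                      % book coordinates: second sample, first row
r = x + 1;                          % row index used by MATLAB
c = y + 1;                          % column index used by MATLAB
value = f(r, c);                    % element in row 1, column 2

% The largest book coordinates (M-1, N-1) map to the last element f(M, N).
last = f((M - 1) + 1, (N - 1) + 1);
```

In both conventions the first coordinate selects a row and the second a column; only the origin changes.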
2.1.2 Images as Matrices
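The coordinate system in Fig. 2.1(a) and the preceding discussion lead to the following representation for a digitized image:
\[
f(x, y) =
\begin{bmatrix}
f(0, 0) & f(0, 1) & \cdots & f(0, N-1) \\
f(1, 0) & f(1, 1) & \cdots & f(1, N-1) \\
\vdots & \vdots & & \vdots \\
f(M-1, 0) & f(M-1, 1) & \cdots & f(M-1, N-1)
\end{bmatrix}
\]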
The right side of this equation is a digital image by definition. Each element of this array is called an image element, picture element, pixel, or pel. The terms image and pixel are used throughout the rest of our discussions to denote a digital image and its elements.
A digital image can be represented as a MATLAB matrix:
\[
\mathtt{f} =
\begin{bmatrix}
\mathtt{f(1, 1)} & \mathtt{f(1, 2)} & \cdots & \mathtt{f(1, N)} \\
\mathtt{f(2, 1)} & \mathtt{f(2, 2)} & \cdots & \mathtt{f(2, N)} \\
\vdots & \vdots & & \vdots \\
\mathtt{f(M, 1)} & \mathtt{f(M, 2)} & \cdots & \mathtt{f(M, N)}
\end{bmatrix}
\]
where f(1, 1) = f(0, 0) (note the use of a monospace font to denote MATLAB quantities). Clearly, the two representations are identical, except for the shift in origin. The notation f(p, q) denotes the element located in row p and column q. For example, f(6, 2) is the element in the sixth row and second column of matrix f. Typically, we use the letters M and N, respectively, to denote the number of rows and columns in a matrix. A 1 × N matrix is called a row vector, whereas an M × 1 matrix is called a column vector. A 1 × 1 matrix is a scalar.
Note: MATLAB documentation uses the terms matrix and array interchangeably. However, keep in mind that a matrix is two dimensional, whereas an array can have any finite dimension.
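As a brief illustration of the f(p, q) notation and of the row vector, column vector, and scalar special cases, consider the sketch below. The matrix values are arbitrary and chosen only for the example.

```matlab
% Sketch: element access and the vector/scalar special cases of a matrix.
% The matrix values are arbitrary illustration data.
f = [10 20 30; 40 50 60];           % M = 2 rows, N = 3 columns
[M, N] = size(f);

p = f(2, 3);                        % element in row 2, column 3
first = f(1, 1);                    % corresponds to f(0, 0) in the book convention

rowvec = f(1, :);                   % 1-by-N row vector
colvec = f(:, 2);                   % M-by-1 column vector
s = f(2, 1);                        % 1-by-1 matrix, that is, a scalar

fprintf('isrow: %d, iscolumn: %d, isscalar: %d\n', ...
        isrow(rowvec), iscolumn(colvec), isscalar(s));
```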
Matrices in MATLAB are stored in variables with names such as A, a, RGB, real_array, and so on. Variables must begin with a letter and contain only letters, numerals, and underscores. As noted in the previous paragraph, all MATLAB quantities in this book are written using monospace characters. We use conventional Roman, italic notation, such as f(x, y), for mathematical expressions.
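The naming rule above can be checked with the MATLAB function isvarname, which returns true for a valid variable name. The last two candidate names in this sketch are hypothetical examples added to show rejected cases.

```matlab
% Sketch: test candidate variable names against MATLAB's naming rules.
% '2ndImage' and 'my-image' are hypothetical invalid names.
names = {'A', 'a', 'RGB', 'real_array', '2ndImage', 'my-image'};
valid = cellfun(@isvarname, names);

for k = 1:numel(names)
    fprintf('%-12s valid: %d\n', names{k}, valid(k));
end
```

The first four names come from the text and are accepted. The fifth starts with a numeral and the sixth contains a hyphen, so both are rejected.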
2.2 Reading Images
Images are read into the MATLAB environment using function imread, whose basic syntax is
imread('filename')
Here, filename is a string containing the complete name of the image file (including any applicable extension). For example, the statement
>> f = imread('chestxray.jpg');
reads the image from the JPEG file chestxray.jpg into image array f. Note the use of single quotes (') to delimit the string filename. The semicolon at the end of a statement is used by MATLAB for suppressing output. If a semicolon is not included, MATLAB displays on the screen the results of the operation(s) specified in that line. The prompt symbol (>>) designates the beginning of a command line, as it appears in the MATLAB Command Window (see Fig. 1.1).
Note: Recall from Section 1.6 that we use margin icons to highlight the first use of a MATLAB or toolbox function. In Windows, directories are called folders.
When, as in the preceding command line, no path information is included in filename, imread reads the file from the Current Directory and, if that fails, it tries to find the file in the MATLAB search path (see Section 1.7). The simplest way to read an image from a specified directory is to include a full or relative path to that directory in filename. For example,
>> f = imread('D:\myimages\chestxray.jpg');
reads the image from a directory called myimages in the D: drive, whereas
>> f = imread('.\myimages\chestxray.jpg');
reads the image from the myimages subdirectory of the current working directory. The MATLAB Desktop displays the path to the Current Directory on the toolbar, which provides an easy way to change it. Table 2.1 lists some of the most popular image/graphics formats supported by imread and imwrite (imwrite is discussed in Section 2.4).
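Reading from an explicit path can be tried without any of the image files named above by first writing a small synthetic image to disk. The sketch below is only an illustration: the file name ramp_demo.png and the use of the system temporary folder are assumptions, and fullfile is used to build the path in a platform-independent way.

```matlab
% Sketch: write a synthetic image to a known folder, then read it back
% using a full path. The file name and folder are illustrative assumptions.
g = uint8(repmat(0:255, 64, 1));               % 64-by-256 gray-level ramp

filename = fullfile(tempdir, 'ramp_demo.png'); % full path to a hypothetical file
imwrite(g, filename);                          % imwrite is discussed in Section 2.4

f = imread(filename);                          % read using the full path
same = isequal(f, g);                          % PNG is lossless, so the data should match

delete(filename);                              % remove the temporary file
```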
Typing size at the prompt gives the row and column dimensions of an image:
>> size(f)
ans =
1024 1024
More generally, for an array A having an arbitrary number of dimensions, a statement of the form
[D1, D2, ..., DK] = size(A)
returns the sizes of the first K dimensions of A. This function is particularly useful in programming to determine automatically the size of a 2-D image:
>> [M, N] = size(f);
This syntax returns the number of rows (M) and columns (N) in the image. Similarly, the command
>> M = size(f, 1);
gives the size of f along its first dimension, which is defined by MATLAB as the vertical dimension. That is, this command gives the number of rows in f. The second dimension of an array is in the horizontal direction, so the statement size(f, 2) gives the number of columns in f. A singleton dimension is any dimension, dim, for which size(A, dim) = 1.
TABLE 2.1 Some of the image/graphics formats supported by imread and imwrite, starting with MATLAB 7.6. Earlier versions support a subset of these formats. See the MATLAB documentation for a complete list of supported formats.
Format Name   Description                         Recognized Extensions
BMP           Windows Bitmap                      .bmp
CUR†          Windows Cursor Resources            .cur
FITS†         Flexible Image Transport System     .fts, .fits
GIF           Graphics Interchange Format         .gif
HDF           Hierarchical Data Format            .hdf
ICO†          Windows Icon Resources              .ico
JPEG          Joint Photographic Experts Group    .jpg, .jpeg
JPEG 2000†    Joint Photographic Experts Group    .jp2, .jpf, .jpx, .j2c, .j2k
PBM           Portable Bitmap                     .pbm
PGM           Portable Graymap                    .pgm
PNG           Portable Network Graphics           .png
PNM           Portable Any Map                    .pnm
RAS           Sun Raster                          .ras
TIFF          Tagged Image File Format            .tif, .tiff
XWD           X Window Dump                       .xwd
† Supported by imread, but not by imwrite.
The whos function displays additional information about an array. For instance, the statement
>> whos f
gives
  Name      Size           Bytes      Class    Attributes
  f         1024x1024      1048576    uint8
Although not applicable in this example, attributes that might appear under Attributes include terms such as global, complex, and sparse.
The Workspace Browser in the MATLAB Desktop displays similar information. The uint8 entry shown refers to one of several MATLAB data classes discussed in Section 2.5. A semicolon at the end of a whos line has no effect, so normally one is not used.
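The size and whos calls discussed in this section can be tried on synthetic arrays when no image file is at hand. In the sketch below, the array dimensions and variable names are assumptions chosen to mirror the examples above; whos is called in its function form, which returns the same information as a structure.

```matlab
% Sketch: query array dimensions and storage information.
% The arrays are synthetic stand-ins for images read with imread.
f = zeros(1024, 1024, 'uint8');     % stand-in for a 1024-by-1024 image

sz = size(f);                       % row vector [rows columns]
[M, N] = size(f);                   % rows and columns as separate outputs
rows = size(f, 1);                  % size along the first (vertical) dimension
cols = size(f, 2);                  % size along the second (horizontal) dimension

rgb = zeros(480, 640, 3, 'uint8');  % stand-in for a three-plane color image
[D1, D2, D3] = size(rgb);           % one output per dimension

v = zeros(1, 5);                    % the first dimension of v is singleton
isSingleton = (size(v, 1) == 1);

info = whos('f');                   % structure form of the whos display
fprintf('%s: %d bytes, class %s\n', info.name, info.bytes, info.class);
```

One point worth keeping in mind: when size is called with fewer outputs than the array has dimensions, the last output is the product of the remaining dimension sizes, so [M, N] = size(rgb) here would give N = 1920 rather than 640.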
2.3 Displaying Images
Images are displayed on the MATLAB desktop using function imshow, which has the basic syntax:
imshow(f)
where f is an image array. Using the syntax
imshow(f, [low high])
displays as black all values less than or equal to low, and as white all values greater than or equal to high. The values in between are displayed as intermediate intensity values. Finally, the syntax
imshow(f, [ ])
sets variable low to the minimum value of array f and high to its maximum value. This form of imshow is useful for displaying images that have a low dynamic range or that have positive and negative values.
Note: Function imshow has a number of other syntax forms for performing tasks such as controlling image magnification. Consult the help page for imshow for additional details.
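The display mapping described above can be sketched as a linear scaling followed by clipping, with 0 standing for black and 1 for white. This is an illustration of the behavior as described in the text, not the toolbox's actual implementation; the helper scale is hypothetical, and it assumes high is greater than low.

```matlab
% Sketch of the [low high] display mapping: values <= low map to 0 (black),
% values >= high map to 1 (white), and values in between scale linearly.
% scale is a hypothetical helper, not a toolbox function; it assumes high > low.
scale = @(f, low, high) min(max((double(f) - low) / (high - low), 0), 1);

f = [-20 0 10; 40 60 100];          % data with positive and negative values

g1 = scale(f, 0, 50);               % analogous to imshow(f, [0 50])
g2 = scale(f, min(f(:)), max(f(:)));% analogous to imshow(f, [ ])
```

With the full data range as limits, the smallest value of f maps to black and the largest to white, which is why the [ ] form suits images with a low dynamic range or with negative values.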
EXAMPLE 2.1: Reading and displaying images.
The following statements read from disk an image called rose_512.tif, extract information about the image, and display it using imshow:
>> f = imread('rose_512.tif');
>> whos f
  Name      Size         Bytes     Class          Attributes
  f         512x512      262144    uint8 array
>> imshow(f)
FIGURE 2.2 Screen capture showing how an image appears on the MATLAB desktop. Note the figure number on the top left of the window. In most of the examples throughout the book, only the images themselves are shown.
A semicolon at the end of an imshow line has no effect, so normally one is not used. Figure 2.2 shows what the output looks like on the screen. The figure