Dolly Zoom
COMP 776, Spring 2020. Due: Feb 19, 2020
Summary
The goal of this assignment is for you to apply your knowledge of the pinhole camera model by controlling both the internal and external parameters of a virtual camera in order to simulate the effect of a “dolly zoom.” Skeleton Python code (described below) has been provided for this assignment, along with a dataset that you will use to create synthetic scene renderings. Your task is to complete the code for projecting 3D points into the 2D image, and to manipulate the camera intrinsics/extrinsics to generate the images.
Dataset
The “data.mat” file contains a 3D point cloud (with RGB) from the Strecha dataset, consisting of a front view of a fountain with a gold fish statue (Fig. 1). We’ve taken the fish statue and moved it down the z-axis, closer to the origin, so that it is floating in the air. The statue will be our foreground object in the assignment, meaning that the fish should remain a constant size in the image as the camera is moved along the z-axis.
The world coordinates are in meters. For reference, the fish statue is approximately 0.4m wide and 0.65m tall, and it is located ~4.1m down the positive z-axis from the world origin.
Figure 1. The 3D point cloud for the dataset. The global coordinate frame is shown on the right, with the y-axis pointing into the page.
Dolly zoom
The dolly zoom (Fig. 2) is an optical effect used by cinematographers. The effect consists of adjusting the camera’s distance to a foreground object in the scene while simultaneously controlling the camera’s field of view (a function of the focal length), so that the foreground object retains a constant size in the image throughout the entire capture sequence.
Figure 2. Illustration of the dolly zoom effect. (Left) A foreground object – the box – is viewed by a camera. The projection of the corner of the box into the sensor is shown by the green dot. (Right) The camera is moved away from the box. At the same time, the focal length is increased so that the box appears to be the same size in the image – i.e., the green dot is the same point on the sensor in both the left and right scenarios. For this assignment, the foreground object will be the floating statue shown above.
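Under the pinhole model, the image size of an object at depth Z is proportional to f / Z, so keeping the foreground object a constant size means keeping f / Z constant: if the object’s depth changes from Z to Z', the focal length must change from f to f' = f · (Z' / Z). A minimal sketch of this relation (the function name is ours, not from the skeleton code):

```python
def dolly_zoom_focal_length(f_ref, z_ref, z_new):
    """Focal length that keeps an object the same image size after
    its depth changes from z_ref to z_new.

    Follows from the pinhole model: image size is proportional to
    f / Z, so we require f_new / z_new == f_ref / z_ref.
    """
    return f_ref * (z_new / z_ref)
```

For example, doubling the camera-to-object distance requires doubling the focal length.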
Assignment and Code
You have been provided with three Python files:
– main.py: Currently, this file only shows how to load the data from “data.mat” and perform the rendering. You should modify this file to generate images re-creating the dolly zoom effect.
– camera.py: This file defines a camera class, which for this assignment will contain information about both the camera’s intrinsics and extrinsics. Assume a pinhole camera model with separate x/y focal lengths – i.e., no distortion. You will need to fill in the Camera class and implement the following camera function:
o points3D_to_pixel_coordinates: Given a set of 3D points in world space, project these points into the 2D image plane and return the (x, y) pixel coordinates.
– util.py: This file defines two functions, load_data and render. You do not need to make changes to this file, but you should use these functions to load the scene and perform the rendering.
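The projection step in points3D_to_pixel_coordinates amounts to transforming world points into the camera frame, applying the intrinsic matrix, and dividing by depth. The standalone function below is a sketch of that math only; the skeleton’s Camera class stores K, R, and t itself, and we assume a world-to-camera convention x_cam = R · x_world + t:

```python
import numpy as np

def points3D_to_pixel_coordinates(K, R, t, points3D):
    """Project N x 3 world points into 2D pixel coordinates.

    K: 3x3 intrinsic calibration matrix (separate fx, fy; no distortion).
    R: 3x3 world-to-camera rotation; t: length-3 translation.
    Returns an N x 2 array of (x, y) pixel coordinates.
    """
    # Transform world points into the camera coordinate frame.
    points_cam = points3D @ R.T + t        # N x 3
    # Apply the intrinsics, then perform the perspective divide by depth.
    proj = points_cam @ K.T                # N x 3, homogeneous
    return proj[:, :2] / proj[:, 2:3]
```

Points behind the camera (non-positive depth) would need to be filtered out before rendering; the sketch omits that check.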
Of course, you are free to make changes to the code as you see fit, as long as you achieve the dolly zoom effect. Choose the intrinsic calibration matrix K and the initial camera position such that the statue is between 400 and 600 pixels wide in the rendered image. So that you can verify your code is correct, we suggest rendering at least 10-15 images of the dolly zoom sequence, with the camera moving a substantial amount (e.g., several meters).
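One possible structure for the sequence, sketched under stated assumptions: the camera pulls back 6 m along the negative z-axis over 12 frames, and since the statue sits ~4.1 m down the positive z-axis, its depth grows from 4.1 m to 10.1 m while the focal length is scaled in proportion. The starting focal length and the commented Camera/render calls are hypothetical, not the skeleton’s actual API:

```python
import numpy as np

z_statue = 4.1            # statue depth in world coordinates (meters)
f0 = 1000.0               # starting focal length in pixels (assumed)
focal_lengths = []
for cam_z in np.linspace(0.0, -6.0, 12):
    depth = z_statue - cam_z        # current camera-to-statue distance
    f = f0 * depth / z_statue       # keep f / depth constant
    focal_lengths.append(f)
    # camera.set_focal_length(f)              # hypothetical Camera API
    # camera.set_position([0.0, 0.0, cam_z])
    # image = render(camera, points3D, colors)  # util.render
```

The focal length grows monotonically as the camera retreats, which is exactly the behavior visible in the rendered sequence: the background appears to expand while the statue stays fixed in size.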
Submission
– You should submit a single PDF with at least 5 images in the dolly zoom sequence, including the first and last images.
– To quickly verify that the dolly zoom sequence is correct, please also include a second copy of the first image, but with the red channel replaced by the red channel of the last image. If your code is correct, the statue should be approximately the same color, but the difference in the background will be apparent. The code for this visualization is already provided in main.py.
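The provided visualization in main.py amounts to a simple channel swap; a NumPy sketch of the idea, assuming H x W x 3 image arrays with the red channel at index 0 (the function name is ours):

```python
import numpy as np

def red_channel_check(first_img, last_img):
    """Copy of first_img with its red channel replaced by last_img's.

    Where the two renderings agree (the constant-size statue), colors
    look roughly normal; where they differ (the background), red/cyan
    ghosting appears, making errors in the dolly zoom easy to spot.
    """
    out = first_img.copy()
    out[..., 0] = last_img[..., 0]
    return out
```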
– Put your code and report (do not include data.mat) into a folder called “A2_onyen”, then zip it with the same name and submit the zip file to Sakai.