CS 180 Project 1 - Colorizing the Prokudin-Gorskii Photo Collection by Eshani Jha

Overview

The objective of this project was to bring history to life by transforming the digitized Prokudin-Gorskii glass plate images into vivid color photographs, using advanced image processing techniques. These glass plate images, captured over a century ago, each contain three separate exposures representing the red, green, and blue channels of a scene. Our challenge was to seamlessly align and merge these channels to reconstruct a full-color image, minimizing any visible artifacts. Through careful extraction, alignment, and processing of the individual color layers, we aimed to recreate these historical moments with modern clarity, while preserving their original essence.

Using NCC for Single-Scale Color Images

In this project, I tackled the challenge of aligning the red and green channels to the blue channel using a simple x, y translation model. By treating the blue channel as the anchor, I implemented a brute force search over a displacement range of [-15, 15] x [-15, 15], aiming to find the best alignment using Normalized Cross-Correlation (NCC).

The alignment process involved normalizing each image channel and then calculating the dot product between the shifted and reference channels. This allowed me to determine how well they aligned based on their pixel similarities. For every possible shift, I computed the NCC score and tracked the displacement with the highest value, ensuring the most accurate alignment.

I cropped 10% off the borders of each image before running the alignment to limit the distraction from the noisy edges. Additionally, I also used np.roll for circular translations, which kept the image intact by wrapping pixels around instead of introducing zeros during the shift.

Using Coarse-to-Fine Pyramid Speedup

Aligning high-resolution images using brute force methods is inefficient, especially when dealing with larger .tif images where the displacements extend far beyond the small window of [-15, 15]. Searching blindly across a larger space becomes extremely slow. To overcome this, I employed a pyramid search algorithm that drastically speeds up the process by scaling the images down and progressively refining the alignment.

The algorithm works by first downscaling the image by a factor of 2 until it reaches a manageable size, typically around 100-200 pixels. At this reduced resolution, I perform the alignment using a broad search window of [-20, 20]. Once the alignment is calculated at this coarse scale, I upscale both the image and the calculated offset by a factor of 2 and then search within a smaller range of [-2, 2]. This process repeats at increasingly higher resolutions, refining the offset each time until we return to the original image size.

This coarse-to-fine approach ensures the alignment process is both efficient and accurate, even for high-resolution images, without sacrificing precision.

Bells and Whistles

For further refinement of the color channel alignment, I applied a Sobel filter, which is commonly used for edge detection in images. This technique helps to align the sharp transitions in brightness between different parts of the image, making the alignment process more robust, especially in cases where strong edges are present. By first applying the Sobel filter to each color channel (Blue, Green, and Red), I focused on aligning the edges between the channels, ensuring that key image details like borders and contours were matched up more precisely.

After filtering, I used the Pyramid search algorithm to align the Sobel-processed channels, which improved the accuracy of the offsets compared to the non-filtered version. Once I had the Sobel-based alignment, I used those calculated offsets to align the original, unfiltered channels. This approach proved particularly effective for certain images, where subtle improvements in alignment were critical for visual quality.

A significant improvement can be seen in the alignment of "emir.tif." Without Sobel filtering, the offsets for the Green and Red channels were (43, 78) and (-334, 126), respectively. However, after applying the Sobel filter, these offsets were refined to (16, 78) for the Green channel and (33, 142) for the Red channel, leading to a much cleaner final image. This is because the images being matched do not have identical brightness values. This disparity between the channels can complicate the alignment process, since each channel may capture different details and brightness contrasts. The Sobel filter mitigates this issue by focusing on edge information rather than raw pixel intensity, allowing the algorithm to align the images more effectively.

Results

Emir without Sobel

G channel offset: (43, 78)
R channel offset: (-334, 126)

Emir with Sobel

G channel offset with Sobel: (16, 78)
R channel offset with Sobel: (33, 142)