Harris Detector: Some Properties

Invariant to rotation?
Invariant to image intensity?
Invariant to image scale?

The ellipse rotates, but its shape (i.e., the eigenvalues) remains the same
Corner response R is invariant to image rotation

Mostly invariant to additive and multiplicative intensity changes:
– Additive (I → I + b): invariant, since only derivatives are used
– Multiplicative (I → a·I): the response is scaled by the intensity scale, so a fixed threshold on R may change which corners are detected

Invariant to image scale?
Not invariant to image scale,
but can we do something about this?

Consider regions (e.g., circles) of different sizes around a point
A region which is “scale invariant”: not affected by the size, and the same for “corresponding regions”
At a point, compute the scale-invariant function over neighborhoods of different sizes (sketched below)
Choose the scale for each image at which the function is a maximum
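A minimal sketch of this scale selection, assuming a grayscale float image and using the scale-normalized Laplacian of Gaussian as the scale-invariant function; the helper name and scale range are illustrative assumptions, not part of the original notes:

import numpy as np
from scipy import ndimage

def characteristic_scale(img, y, x, sigmas=np.geomspace(1.0, 16.0, 12)):
	"""Return the scale at which the scale-normalized Laplacian of
	Gaussian response at (y, x) is largest in absolute value."""
	responses = []
	for s in sigmas:
		# s^2 normalizes the LoG so responses are comparable across scales
		log = s ** 2 * ndimage.gaussian_laplace(img, s)
		responses.append(abs(log[y, x]))
	return sigmas[int(np.argmax(responses))]

Corresponding points in two images that differ by a scale factor s should then select scales whose ratio is approximately s, which is what makes the regions “corresponding”.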

SIFT: Scale Invariant Feature Transform
Specific suggestion: use a pyramid to find maximum values (remember edge detection), then eliminate “edges” and pick only corners

Harris-Laplacian
Find the local maximum of:
– the Harris corner detector in space (image coordinates)
– the Laplacian in scale
(a sketch of this combination follows)
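A rough sketch of that combination, assuming OpenCV and SciPy; the scale set and thresholds are illustrative, not the exact parameters of the published detector:

import numpy as np
import cv2
from scipy import ndimage

def harris_laplace(img_gray, sigmas=(1.2, 1.7, 2.4, 3.4, 4.8)):
	"""Return (x, y, sigma) points: Harris maxima in space whose
	scale-normalized Laplacian also peaks at that scale."""
	img = np.float32(img_gray)
	logs = [s ** 2 * ndimage.gaussian_laplace(img, s) for s in sigmas]
	points = []
	for i, s in enumerate(sigmas):
		# Harris response at this scale; window grows with sigma
		harris = cv2.cornerHarris(img, max(2, int(3 * s)), 3, 0.04)
		local_max = (harris == ndimage.maximum_filter(harris, size=3))
		for y, x in zip(*np.nonzero(local_max & (harris > 0.01 * harris.max()))):
			# Keep the point only if the Laplacian peaks at this scale
			below = abs(logs[max(i - 1, 0)][y, x])
			above = abs(logs[min(i + 1, len(sigmas) - 1)][y, x])
			if abs(logs[i][y, x]) >= max(below, above):
				points.append((x, y, s))
	return points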

Feature Detection

1. Harris Corner Detector Algorithm
2. SIFT

Harris Corner Response Function
R = det(M) − k(trace M)² = λ1λ2 − k(λ1 + λ2)²
R > 0 when λ1 and λ2 are both large (a corner)

1. Compute horizontal and vertical derivatives of the image (convolve with derivatives of Gaussians)
2. Compute the outer products of the gradients to form M
3. Convolve with a larger Gaussian
4. Compute the scalar interest measure R

Harris Detector Algorithm (Preview)
– Compute Gaussian derivatives at each pixel
– Compute the second moment matrix M in a Gaussian window around each pixel
– Compute the corner response function R
– Threshold R
– Find local maxima of the response function (non-maximum suppression); steps 1-4 are sketched below
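A minimal sketch of steps 1-4 with OpenCV primitives, assuming a grayscale image and the common choice k = 0.04; cv2.cornerHarris, used in the code further down, wraps the same computation:

import numpy as np
import cv2

def harris_response(img_gray, k=0.04, window_sigma=2.0):
	"""Compute the Harris corner response R following steps 1-4."""
	img = np.float32(img_gray)
	# 1. Horizontal and vertical derivatives (Sobel as derivative of Gaussian)
	ix = cv2.Sobel(img, cv2.CV_32F, 1, 0, ksize=3)
	iy = cv2.Sobel(img, cv2.CV_32F, 0, 1, ksize=3)
	# 2. Outer products of the gradients (the entries of M at each pixel)
	ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
	# 3. Convolve each entry with a larger Gaussian (the window w)
	sxx = cv2.GaussianBlur(ixx, (0, 0), window_sigma)
	syy = cv2.GaussianBlur(iyy, (0, 0), window_sigma)
	sxy = cv2.GaussianBlur(ixy, (0, 0), window_sigma)
	# 4. Scalar interest measure: R = det(M) - k * trace(M)^2
	return (sxx * syy - sxy * sxy) - k * (sxx + syy) ** 2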

"""Haris Cornaer Detection"""

import numpy as np
import cv2

def find_corners(img):
	"""Find corners in an image using Harris corner detection method."""

	img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if len(img.shape) == 3 else img
	
	h_response = cv2.cornerHarris(img_gray, 2, 3, 0.04)

	h_min, h_max, _, _ = cv2.minMaxLoc(h_response)
	cv2.imshow("Harris response", np.uint8((h_response - h_mimn) * (255.0 / (h_max - h_min))))

	h_thresh = 0.01 * h_max
	_, h_selected = cv2.threshold(h_response, h_thresh, 0, cv2.THRESH_TOZERO)

	nhood_size = 5
	nhood_r = int(nhood_size / 2)
	corners = []
	# Non-maximum suppression: keep a response only if it is the maximum
	# in its local neighborhood, then zero out the rest of that neighborhood.
	for y in range(h_selected.shape[0]):
		for x in range(h_selected.shape[1]):
			h_value = h_selected[y, x]
			if h_value <= 0:
				continue  # below threshold, cannot be a corner
			nhood = h_selected[(y - nhood_r):(y + nhood_r + 1), (x - nhood_r):(x + nhood_r + 1)]
			if not nhood.size:
				continue  # empty slice near the image border
			if h_value == np.amax(nhood):
				corners.append((x, y, h_value))
				h_selected[(y - nhood_r):(y + nhood_r + 1), (x - nhood_r):(x + nhood_r + 1)] = 0
				h_selected[y, x] = h_value

	h_suppressed = np.uint8(np.clip(h_selected - h_thresh, 0, None) * (255.0 / (h_max - h_thresh)))
	cv2.imshow("Suppressed Harris response", h_suppressed)
	return corners

def test():
	"""Test find_corners() with sample imput. """

	# Read image
	img = cv2.imread("octagon.png")
	cv2.imshow("image", img)

	corners = find_corners(img)
	print "\n".join("{}{}".format(corner[0], corner[1]) for corner in corners)

	img_out = img.copy()
	for (x, y, resp) in corners:
		cv2.circle(img_out, (x, y), 1, (0, 0, 255), -1)
		cv2.circle(img_out, (x, y), 5, (0, 255, 0), 1)
	cv2.imshow("Output", img_out)

Features

1. Benefits of Feature Detection and matching in images
2. Characteristics of a Good Feature
3. Corners are Good Features
4. Harris Corner Detector Algorithm
5. Stages of a SIFT detector

Image matching
translation
rotation
affine
perspective
scale

x,y,Θ

Finding Features
Goal: find points in an image that can be:
– found in other images
– found precisely (well located)
– found reliably

Repeatability/precision
Saliency/matchability
compactness and efficiency
locality

Corner Detection: Mathematics
E(u,v) = Σ_{x,y} w(x,y) [I(x+u, y+v) − I(x,y)]²
w(x,y): the window function, either a box function or a Gaussian

Using the first-order Taylor expansion I(x+u, y+v) ≈ I(x,y) + Ix·u + Iy·v, the quadratic approximation simplifies to
E(u,v) ≈ [u, v] M [u, v]^T
where M = Σ_{x,y} w(x,y) [Ix², IxIy; IxIy, Iy²] is the second moment matrix.

Scale Invariant Detectors
– SIFT (Lowe, 2004)
– Find local maxima of the
difference of Gaussians in space and scale
DoG is simply a pyramid of the differences of Gaussians within each octave (sketched below)
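A minimal DoG pyramid sketch, assuming a grayscale float image; sigma0 = 1.6 follows Lowe's commonly cited value, the other parameters are illustrative:

import numpy as np
import cv2

def dog_pyramid(img_gray, n_octaves=4, n_scales=4, sigma0=1.6):
	"""Build a DoG pyramid: per octave, a list of differences of
	successively blurred images; each octave is half the size."""
	img = np.float32(img_gray)
	k = 2 ** (1.0 / n_scales)  # scale step within an octave
	pyramid = []
	for _ in range(n_octaves):
		blurred = []
		for i in range(n_scales + 1):
			blurred.append(cv2.GaussianBlur(img, (0, 0), sigma0 * k ** i))
		pyramid.append([b1 - b0 for b0, b1 in zip(blurred, blurred[1:])])
		img = cv2.pyrDown(img)  # next octave at half resolution
	return pyramid

SIFT keypoints are local extrema of these DoG images in both space and scale, filtered to discard edge responses.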

Cuts

1. An additional method for merging images besides blending
2. Finding seams in images
3. Benefits of cutting images over blending images

Moving objects cause “ghosting”
Find an optimal seam, as opposed to blending between the images
The final image has exact pixels from one image

Overlapping blocks -> vertical boundary
Take the squared difference over the overlap; find the minimum error boundary

The minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms); for a vertical seam, simple dynamic programming suffices, as sketched below.
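A sketch of the minimum error boundary cut for the vertical-boundary case, assuming two equal-size grayscale overlap regions; the function name is illustrative:

import numpy as np

def min_error_boundary(overlap_a, overlap_b):
	"""Column index, per row, of the minimum-cost vertical seam through
	the squared difference of two overlapping grayscale blocks."""
	err = (np.float64(overlap_a) - np.float64(overlap_b)) ** 2
	# Dynamic programming: cheapest seam cost ending at each pixel
	cost = err.copy()
	for y in range(1, err.shape[0]):
		left = np.roll(cost[y - 1], 1)
		left[0] = np.inf
		right = np.roll(cost[y - 1], -1)
		right[-1] = np.inf
		cost[y] += np.minimum(np.minimum(left, cost[y - 1]), right)
	# Backtrack from the cheapest pixel in the bottom row
	seam = [int(np.argmin(cost[-1]))]
	for y in range(err.shape[0] - 2, -1, -1):
		x = seam[-1]
		lo = max(0, x - 1)
		seam.append(lo + int(np.argmin(cost[y, lo:x + 2])))
	return seam[::-1]

Pixels to the left of the seam come from one image and pixels to the right from the other, so the result contains exact pixels from an image.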

Pyramids

1. Gaussian and the Laplacian pyramids
2. Use of Pyramids to encode the Frequency domain
3. Compute a Laplacian pyramid from a Gaussian pyramid
4. Blend two images using pyramids

To avoid seams: window = size of largest prominent “feature”
To avoid ghosting: window <= 2 × size of smallest prominent “feature”
Use the Fourier domain:
largest frequency <= 2 × size of smallest frequency
the image frequency content should occupy one octave (power of two)
The frequency spread needs to be modeled:
compute FFT(I_l) => F_l, FFT(I_r) => F_r
Decompose the Fourier image into octaves (bands)

“Feather” the corresponding octaves of F_l and F_r
Compute the inverse FFT and feather in the spatial domain
Sum the feathered octave images in the frequency domain

Pyramid Representation: A Gaussian Pyramid
a = 0.3 to 0.6 (typically 0.38)
h = w_h * w_v (a separable 2D kernel)
g_k = h * g_{k-1}
g_k = REDUCE(g_{k-1})

L_k = g_k - EXPAND(g_{k+1})
A series of “error” images
The difference between two levels of a Gaussian pyramid (see the sketch below)
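A minimal sketch of both pyramids and pyramid blending, assuming equal-size grayscale images and a float mask in [0, 1]; cv2.pyrDown plays the role of REDUCE and cv2.pyrUp of EXPAND:

import numpy as np
import cv2

def gaussian_pyramid(img, levels):
	gp = [np.float32(img)]
	for _ in range(levels - 1):
		gp.append(cv2.pyrDown(gp[-1]))  # REDUCE
	return gp

def laplacian_pyramid(gp):
	# L_k = g_k - EXPAND(g_{k+1}); the coarsest level is kept as-is
	lp = []
	for g, g_next in zip(gp, gp[1:]):
		lp.append(g - cv2.pyrUp(g_next, dstsize=g.shape[1::-1]))
	lp.append(gp[-1])
	return lp

def pyramid_blend(img_a, img_b, mask, levels=5):
	"""Blend two images by combining their Laplacian pyramids,
	weighted by a Gaussian pyramid of the mask, then collapsing."""
	la = laplacian_pyramid(gaussian_pyramid(img_a, levels))
	lb = laplacian_pyramid(gaussian_pyramid(img_b, levels))
	gm = gaussian_pyramid(mask, levels)
	blended = [m * a + (1 - m) * b for a, b, m in zip(la, lb, gm)]
	out = blended[-1]
	for lap in reversed(blended[:-1]):
		out = cv2.pyrUp(out, dstsize=lap.shape[1::-1]) + lap  # EXPAND and add
	return np.uint8(np.clip(out, 0, 255))

Blending per pyramid level is what lets a single blend use wide windows for low frequencies and narrow windows for high frequencies.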

Blending images

merging two images
window size used for merging images

Combine, Merge, Blend images
Cross-Fading Two Images

Factors for Optimal Window Size
To avoid seams: window = size of the largest prominent “feature”

Reconstructing a Signal

Target Signal f^T(t)
f^T(t) = Σ_{n=1}^{N} A cos(nωt)
f1+f2+f3+f4

A periodic function => a weighted sum of sines and cosines of different frequencies
Transform f(t) into F(ω)
F(ω) is the frequency spectrum of the function f
A reversible operation
For every ω from −∞ to ∞, F(ω) holds the amplitude A and the phase of a sine function

Frequency Domain of a signal
g(t) = sin(2πωt) + (1/3) sin(2π(3ω)t)

Convolution Theorem and the Fourier Transform
The Fourier transform of a convolution of two functions is the product of their Fourier transforms:
F[g * h] = F[g] F[h]
The inverse Fourier transform of a product is the convolution of the inverse transforms (verified numerically below):
F^{-1}[gh] = F^{-1}[g] * F^{-1}[h]
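The discrete analogue of the theorem is easy to check with NumPy's FFT; zero-padding makes the circular convolution computed via the FFT match the linear one:

import numpy as np

rng = np.random.default_rng(0)
g = rng.standard_normal(64)
h = rng.standard_normal(64)

# Zero-pad both signals to the full output length so the circular
# convolution computed via the FFT matches the linear convolution.
n = len(g) + len(h) - 1
conv_direct = np.convolve(g, h)
conv_fft = np.fft.ifft(np.fft.fft(g, n) * np.fft.fft(h, n)).real

print(np.allclose(conv_direct, conv_fft))  # True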

Using the Frequency spectra
low-pass, high-pass, band-pass filtering

Fourier Transform

Using sines and cosines to reconstruct a signal
the Fourier transform
Frequency domain for a signal
Three properties of convolution relating to Fourier Transform

Reconstructing a signal
Repeating impulse function
Target Signal f^T(t)

Basic building block
f(t) = A cos(nwt)
A: amplitude; ω: frequency

% Cosines of different frequencies
n = 4; % no. of periods (say, seconds)
t = linspace(0, n, n * 90); % time samples
A = 2; % amplitude

f1 = A * cos(1 * (2 * pi) * t);
plot(t, f1);

f2 = A * cos(2 * (2 * pi) * t);
plot(t, f2);

f3 = A * cos(3 * (2 * pi) * t);
plot(t, f3);

f4 = A * cos(4 * (2 * pi) * t);
plot(t, f4);

Sensor

1. photographic processes for digital and film capture
2. Eight layers of color film
3. Five layers of a CCD
4. Differences between a CCD and CMOS Sensor
5. Two benefits of using the Camera Raw Format

Film vs. Digital
There have been significant improvements in actuators and lenses
Difference is how light is trapped and preserved

Film: a reaction between light and chemicals
protective layer
UV filter
blue light layer
yellow light layer
green light layer
red light layer
adhesive layer
film base

a sheet of plastic
(polyester, PET, cellulose acetate)

Light -> protective coating, emulsion, adhesive, base, adhesive

Digital: Converting light to data
CCD: charge-coupled device, a device for converting electrical charge into a digital value
Pixels are represented by capacitors, which convert and store incoming photons as electron charges
Invented by Willard Boyle and George E. Smith, 1969

Digital: converting light to data
Micro lens: captures the light and directs it toward the light-sensitive areas
Hot mirror: lets visible light pass, but reflects light in the invisible part of the spectrum
Color filter: photodiodes are color-blind; a color filter matrix separates the light into red, green, and blue. Usually referred to as a Bayer array
Photodiodes: where light energy is converted to electrons, creating a negative charge
Well (depletion layer): where the electrons are collected

Bayer filter on a sensor
one blue, two green, one red
incoming light -> filter layer -> sensor array
RGB color plane

A 2×2 subset
R G0 G1 B

Actual sensor Information with Bayer Filter

CCD vs. CMOS Sensors
CMOS: complementary metal oxide semiconductor
A CCD shifts each pixel's charge across the chip to a single output node, while a CMOS sensor amplifies and reads out each pixel individually

Camera Raw File Format
– Contains minimally processed data from the sensor
– Image encoded in a device-dependent color space
– Captures the radiometric characteristics of the scene

Exposure

Exposure Triangle
-aperture
-shutter speed
-iso

Focal Length vs. Viewpoint
f = 18 mm (35 mm sensor) vs. f = 180 mm (35 mm sensor)
First subject 0.5 m away vs. 30 m away

Changing focal length allows us to move back and still capture the scene
Changing viewpoint causes perspective changes

Exposure = irradiance × time
H = E × T

Irradiance (E)
– the amount of light falling on a unit area of the sensor per second
– controlled by the lens aperture
Exposure Time (T)
– how long the shutter is kept open

Shutter speed: the amount of time the sensor is exposed to light
usually denoted in fractions of a second (1/2000, 1/1000, 1/250, 1/60, 1/15) or whole seconds (15, 30, Bulb)

Irradiance on the sensor -> the amount of light captured is proportional to the area of the aperture (opening)
Area = π(f/2N)²

f is the focal length and N is the aperture (f-)number; what is the diameter of the aperture? d = f/N
The aperture number gives irradiance irrespective of the lens in use
f/2.0 on a 50 mm lens -> aperture = 25 mm
f/2.0 on a 200 mm lens -> aperture = 100 mm

f/2.8, f/4, f/5.6, f/8, f/11, f/16
More light <-> less light
Area = π(f/2N)²
Doubling N reduces A by 4×, and therefore reduces the light by 4×
From f/2.8 to f/5.6 cuts the light by 4× (checked numerically below)
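These relationships are easy to check numerically; a small sketch assuming an illustrative 50 mm lens:

import math

f = 50.0  # focal length in mm (an illustrative lens)
for n in (2.8, 4.0, 5.6, 8.0, 11.0, 16.0):
	d = f / n  # aperture diameter in mm
	area = math.pi * (f / (2 * n)) ** 2  # A = pi * (f / 2N)^2
	print("f/%.1f: diameter = %.1f mm, area = %.0f mm^2" % (n, d, area))
# Each full stop multiplies N by sqrt(2) and halves the area;
# doubling N (f/2.8 -> f/5.6) cuts the area, and the light, by 4x.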

ISO (sensitivity)
ISO 100 vs. ISO 1600
The third variable in getting the right exposure
Film sensitivity vs. grain (of the film)

f156
shutter speed: 1/10