Generate a Panorama

1. Generate a panorama
2. Image Re-projection
3. Homography from a pair of images
4. Computing inliers and outliers
5. Details of constructing panoramas

5 steps to make a panorama
– capture images
– detection and matching
– warping -> aligning images
– blending, fading, cutting
– cropping (optional)

Align images: Translate
A bundle of rays contains all views: View 1, View 2 -> synthetic view.
It is possible to generate any synthetic camera view as long as it has the same center of projection.

Image Re-Projection
To relate two images from the same camera center, map a pixel from projection plane PP1 to PP2:
– cast a ray through each pixel in PP1
– draw the pixel where that ray intersects PP2

Recall: Image warping
translation, scale, rotation, affine, perspective

Computing Homography
A point (x, y) maps to (x', y') = (wx'/w, wy'/w), where in homogeneous coordinates (wx', wy', w)^T = H (x, y, 1)^T.
To compute the homography H, given pairs of corresponding points in the two images, we need to set up an equation.

Set up a system of linear equations
Ah = b
where the vector of unknowns is
h = [a, b, c, d, e, f, g, h]^T (the eight entries of H, with the ninth fixed to 1)

Each correspondence gives two equations, so we need at least 8 equations (4 point pairs), but the more the better.
Solve for h; if over-constrained, solve using least squares.

Warp into a shared coordinate space
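A minimal sketch of that least-squares solve (the function name, sample usage, and NumPy >= 1.14's rcond=None are my assumptions, not from the notes): each point pair contributes two rows of A, and the recovered h is reshaped into the 3x3 H used to warp into the shared space.

import numpy as np
import cv2

def solve_homography(pts_src, pts_dst):
	"""Stack two equations per correspondence into Ah = b; solve by least squares."""
	A, b = [], []
	for (x, y), (xp, yp) in zip(pts_src, pts_dst):
		A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
		A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
	h, _, _, _ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
	return np.append(h, 1.0).reshape(3, 3)  # ninth entry of H fixed to 1

# H = solve_homography(pts1, pts2)
# warped = cv2.warpPerspective(img2, H, (width, height))  # shared coordinate space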

Random sample consensus (RANSAC)
For translation: select one match, count INLIERS; take the "average" translation vector over those inliers.

For a homography, repeat (see the sketch below):
1. select four feature pairs (at random)
2. compute the homography H (exact fit to those four pairs)
3. compute inliers where SSD(Pin', H Pin) < ε
4. keep the largest set of inliers; finally, re-compute a least-squares estimate of H using all of them
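A minimal sketch of that loop (the function name and parameter values are mine; cv2.getPerspectiveTransform provides the exact four-point fit):

import numpy as np
import cv2

def ransac_homography(pts1, pts2, n_iters=1000, eps=3.0):
	"""pts1, pts2: np.float32 arrays of shape (N, 2).
	Repeat: sample 4 pairs, fit H exactly, count inliers; keep the best H."""
	best_H, best_inliers = None, np.array([], dtype=int)
	for _ in range(n_iters):
		idx = np.random.choice(len(pts1), 4, replace=False)
		H = cv2.getPerspectiveTransform(pts1[idx], pts2[idx])  # exact fit
		proj = cv2.perspectiveTransform(pts1.reshape(-1, 1, 2), H).reshape(-1, 2)
		ssd = np.sum((proj - pts2) ** 2, axis=1)  # SSD(Pin', H Pin)
		inliers = np.nonzero(ssd < eps ** 2)[0]
		if len(inliers) > len(best_inliers):
			best_H, best_inliers = H, inliers
	return best_H, best_inliers  # then re-fit H to the inliers by least squares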

"""Building a (crude) panorama from two images."""

import numpy as np
import cv2

# Read images
img1 = cv2.imread("einstein.png")  # left image
img2 = cv2.imread("davinci.png")  # right image
print "Image 1 size: {}x{}".format(img1.shape[1], img1.shape[0])
print "Image 2 size: {}x{}".format(img2.shape[1], img2.shape[0])
cv2.imshow("Image 1", img1)
cv2.imshow("Image 2", img2)

# Convert to grayscale
img1_gray = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
img2_gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# Initialize ORB detector object
orb = cv2.ORB() # or cv2.SIFT() in OpenCV 2.4.9+
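The notes stop at the detector. One possible continuation (my sketch, not from the notes; it assumes OpenCV 3+, where the constructor is cv2.ORB_create rather than cv2.ORB):

orb = cv2.ORB_create()  # assumption: OpenCV 3+ API
kp1, des1 = orb.detectAndCompute(img1_gray, None)
kp2, des2 = orb.detectAndCompute(img2_gray, None)

# Brute-force matching; Hamming distance suits ORB's binary descriptors
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)[:50]

src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)  # RANSAC discards outliers

# Warp the right image into the left image's frame; paste the left image in
pano = cv2.warpPerspective(img2, H, (img1.shape[1] + img2.shape[1], img1.shape[0]))
pano[:img1.shape[0], :img1.shape[1]] = img1
cv2.imshow("Panorama (crude)", pano)
cv2.waitKey(0)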

Image Morphing

1. Image Warping
2. Forward and inverse warping
3. Warping using a mesh
4. Image morphing
5. Feature-based image morphing

Transformation: lines remain lines.
Warping: points are mapped to points.
A warp is a mathematical function from the plane to the plane.

– distorted through simulation of optical aberrations
– projected onto a curved or mirrored surface

Consider a source image S and a target image T:
S has pixel coordinates (u, v); T has pixel coordinates (x, y).
Forward mapping: (x, y) = [X(u, v), Y(u, v)]
Inverse mapping: (u, v) = [U(x, y), V(x, y)]

Use a sparse set of corresponding points and interpolate with a displacement field (see the sketch after this list):
– triangulate the set of points on the source
– use the affine model for each triangle
– triangulate the target with the displaced points
– use inverse mapping
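A minimal inverse-mapping sketch using cv2.remap (the sine-ripple displacement field is a toy stand-in for one interpolated from the triangulated points; the filename is a placeholder):

import numpy as np
import cv2

img = cv2.imread("tech.png")  # placeholder source image
h, w = img.shape[:2]
x, y = np.meshgrid(np.arange(w, dtype=np.float32), np.arange(h, dtype=np.float32))

# Inverse mapping: for each target pixel (x, y), sample the source at (u, v).
# A real morph would interpolate the displacement field from the
# triangulated correspondence points instead of this toy ripple.
u = x + 8.0 * np.sin(y / 24.0)
v = y
warped = cv2.remap(img, u, v, interpolation=cv2.INTER_LINEAR)
cv2.imshow("Inverse-mapped", warped)
cv2.waitKey(0)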

Morphing: an animation that changes (or morphs) one image or shape into another through a seamless transition; widely used in movies.
Image morphing approaches:
– quadrilateral mesh displaced with variational interpolation
– corresponding features/points
– corresponding oriented line segments (specify translation, rotation, scaling)

Feature-based morphing

Homogeneous Coordinates

Represent coordinates in 2 dimensions with a 3-vector: add a 3rd coordinate to every 2D point.
(x, y, w) -> (x/w, y/w)
(x, y, 0) => point at infinity

p’ = Mp
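A tiny worked example (the numbers are mine): in homogeneous coordinates, translation becomes a single matrix multiply.

import numpy as np

# Translate by (100, 50): p' = Mp in homogeneous coordinates
M = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0, 50.0],
              [0.0, 0.0, 1.0]])
p = np.array([20.0, 30.0, 1.0])   # (x, y, w) with w = 1
p_prime = np.dot(M, p)            # [120, 80, 1]
x, y = p_prime[0] / p_prime[2], p_prime[1] / p_prime[2]  # divide by w -> (120, 80)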

Projective Transformations
– combination of affine transformations and projective warps

Properties
– origin does not necessarily map to origin
– lines map to lines
– parallel lines do not necessarily remain parallel

import cv2
import numpy as np

img = cv2.imread('tech.png')
height, width = img.shape[:2]
cv2.imshow("Original", img)

# 2x3 affine matrix: translate by (100, 50)
M_trans = np.float32(
	[[1, 0, 100],
	 [0, 1, 50]])
print "Translation matrix:"
print M_trans
img_trans = cv2.warpAffine(img, M_trans, (width, height))
cv2.imshow("translation", img_trans)
cv2.waitKey(0)

Image Transformation

1. Transform the image
2. Rigid transformations: translation, rotation
3. Affine/projective transformations
4. Degrees of freedom for different transformations

Image filtering: change the range of an image
g(x) = T(f(x))

Image warping: change the domain of an image
g(x) = f(T(x))

Parametric Global Warping
Examples: translate the origin (0, 0); rotate by Θ.

Transformation T: p' = T(p)

T is global and parametric:
– the same for any point p
– described by a few numbers (parameters)

As a matrix transform
p’ = Mp

Image Scaling (2D)
– multiply each component by a scalar
– uniform scaling: the same scalar for x and y
– non-uniform scaling: different scalars

2D Rotation
(x, y) -> (x', y'):
x' = x cos(Θ) - y sin(Θ)
y' = x sin(Θ) + y cos(Θ)
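A quick sketch of this rotation in OpenCV (reusing tech.png from the snippet above; cv2.getRotationMatrix2D builds the 2x3 matrix implementing these equations about a chosen pivot):

import numpy as np
import cv2

img = cv2.imread("tech.png")
height, width = img.shape[:2]

# Rotate 30 degrees counter-clockwise about the image center, scale 1.0
M_rot = cv2.getRotationMatrix2D((width / 2, height / 2), 30, 1.0)
img_rot = cv2.warpAffine(img, M_rot, (width, height))
cv2.imshow("rotation", img_rot)
cv2.waitKey(0)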

Harris Detector: Some Properties

Invariant to rotation?
Invariant to image intensity changes?
Invariant to image scale?

Rotation: the ellipse rotates but its shape (i.e. its eigenvalues) remains the same, so the corner response R is invariant to image rotation.

Intensity: mostly invariant to additive and multiplicative intensity changes, since only derivatives are used (additive shifts drop out); thresholding the response does depend on the intensity scale, though.

Scale: not invariant to image scale; but can we do something about this?

Consider regions (e.g. circles) of different sizes around a point.
A region that is "scale invariant" is not affected by the image size: it will be the same for "corresponding regions" across scales.
At each point, compute the scale-invariant function over neighborhoods of different sizes, and choose the scale for each image at which the function is a maximum.

SIFT: Scale-Invariant Feature Transform
Specific suggestion: use a pyramid to find maximum values (remember edge detection), then eliminate "edges" and pick only corners.
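A quick usage sketch (assumes OpenCV 4.4+, where SIFT is exposed as cv2.SIFT_create; older builds had it under cv2.xfeatures2d):

import cv2

img = cv2.imread("octagon.png", cv2.IMREAD_GRAYSCALE)  # filename from the Harris demo below
sift = cv2.SIFT_create()  # DoG pyramid, edge elimination and orientation happen inside
keypoints, descriptors = sift.detectAndCompute(img, None)
print("{} keypoints, each with a scale (size) and orientation".format(len(keypoints)))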

Harris-Laplacian
Find the local maximum of:
– the Harris corner detector in space (image coordinates)
– the Laplacian in scale

Feature Detection

1. Harris Corner Detector Algorithm
2. SIFT

Harris Corner Response Function
R = det(M) - α trace(M)^2 = λ1 λ2 - α (λ1 + λ2)^2
A corner has R > 0 with λ1 and λ2 both large.

1. Compute horizontal and vertical derivatives of the image (convolve with derivatives of Gaussians)
2. Compute the outer products of the gradients to form M
3. Convolve with a larger Gaussian
4. Compute the scalar interest measure R

Harris Detector Algorithm (Preview)
– compute Gaussian derivatives at each pixel
– compute the second moment matrix M in a Gaussian window around each pixel
– compute the corner response function R
– threshold R
– find local maxima of the response function (non-maximum suppression)

"""Haris Cornaer Detection"""

import numpy as np
import cv2

def find_corners(img):
	"""Find corners in an image using Harris corner detection method."""

	img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if len(img.shape) == 3 else img
	
	h_response = cv2.cornerHarris(img_gray, 2, 3, 0.04)

	h_min, h_max, _, _ = cv2.minMaxLoc(h_response)
	cv2.imshow("Harris response", np.uint8((h_response - h_mimn) * (255.0 / (h_max - h_min))))

	h_thresh = 0.01 * h_max
	_, h_selected = cv2.threshold(h_response, h_thresh, 0, cv2.THRESH_TOZERO)  # maxval ignored for THRESH_TOZERO

	nhood_size = 5
	nhood_r = int(nhood_size / 2)
	corners = []
	for y in xrange(h_selected.shape[0]):
		for x in xrange(h_selected.shape[1]):
			h_value = h_selected.item(y, x)
			nhood = h_selected[(y - nhood_r):(y + nhood_r + 1),(x - nhood_r):(x + nhood_r + 1)]
			if not nhood.size:
				continue
			local_max = np.amax(nhood)
			if h_value == local_max:
				corners.append((x, y, h_value))
				h_selected[(y - nhood_r):(y + nhood_r + 1), (x - nhood_r):(x + nhood_r + 1)] = 0  # suppress the neighborhood
				h_selected.itemset((y,x), h_value)

	# Display and return after scanning the whole image, not inside the loop
	h_suppressed = np.uint8(np.clip((h_selected - h_thresh) * (255.0 / (h_max - h_thresh)), 0, 255))
	cv2.imshow("Suppressed Harris response", h_suppressed)
	return corners

def test():
	"""Test find_corners() with sample imput. """

	# Read image
	img = cv2.imread("octagon.png")
	cv2.imshow("image", img)

	corners = find_corners(img)
	print "\n".join("{}{}".format(corner[0], corner[1]) for corner in corners)

	img_out = img.copy()
	for (x, y, resp) in corners:
		cv2.circle(img_out, (x, y), 1, (0, 0, 255), -1)
		cv2.circle(img_out, (x, y), 5, (0, 255, 0), 1)
	cv2.imshow("Output", img_out)

Features

1. Benefits of Feature Detection and matching in images
2. Characteristics of a Good Feature
3. Corners are Good Features
4. Harris Corner Detector Algorithm
5. Stages of a SIFT detector

Image matching
– translation
– rotation
– affine
– perspective
– scale

e.g. a rigid (translation + rotation) match is parameterized by (x, y, Θ)

Finding Features
Goal: find points in an image that can be
– found in other images
– found precisely (well located)
– found reliably

Repeatability/precision
Saliency/matchability
Compactness and efficiency
Locality

Corner Detection: Mathematics
E(u, v) = Σ_{x,y} w(x, y) [I(x + u, y + v) - I(x, y)]^2
where the window function w(x, y) is a box function or a Gaussian.

The quadratic approximation, following a Taylor expansion, simplifies to
E(u, v) ≈ [u v] M [u v]^T
(a sketch of building M from gradients follows).
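A from-scratch sketch of building M and the response R from image gradients (the function name and parameter defaults are mine), mirroring the four-step recipe in the Harris section above:

import numpy as np
import cv2

def harris_response(img_gray, sigma=1.5, alpha=0.04):
	"""Form M from outer products of gradients; R = det(M) - alpha * trace(M)^2."""
	I = np.float64(img_gray)
	Ix = cv2.Sobel(I, cv2.CV_64F, 1, 0, ksize=3)  # horizontal derivative
	Iy = cv2.Sobel(I, cv2.CV_64F, 0, 1, ksize=3)  # vertical derivative

	# Entries of M, each smoothed by the window w(x, y) (here a Gaussian)
	Sxx = cv2.GaussianBlur(Ix * Ix, (0, 0), sigma)
	Syy = cv2.GaussianBlur(Iy * Iy, (0, 0), sigma)
	Sxy = cv2.GaussianBlur(Ix * Iy, (0, 0), sigma)

	det_M = Sxx * Syy - Sxy * Sxy
	trace_M = Sxx + Syy
	return det_M - alpha * trace_M ** 2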

Scale Invariant Detectors
– SIFT (Lowe, 2004)
– find local maxima of the difference of Gaussians in space and scale
DoG is simply a pyramid of the differences of Gaussians within each octave.

Cuts

1. An additional method for merging images besides blending
2. Finding seams in images
3. Benefits of cutting images over blending images

Moving objects cause "ghosting"; find an optimal seam, as opposed to blending between the images, so that the final result takes exact pixels from one image.

Overlapping blocks -> vertical boundary:
take the squared difference between the blocks in the overlap region, then find the minimum-error boundary through it (see the sketch below).

The minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms).
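A dynamic-programming sketch of the minimum-error vertical boundary (the function name is mine; err is the squared difference over the overlap). The max-flow/min-cut formulation mentioned above generalizes this to arbitrary seam shapes.

import numpy as np

def min_error_boundary(err):
	"""err: 2D array of squared differences over the overlap region.
	Returns, per row, the column of the minimum-cost vertical seam."""
	h, w = err.shape
	cost = np.float64(err).copy()
	for yy in range(1, h):
		left = np.r_[np.inf, cost[yy - 1, :-1]]
		right = np.r_[cost[yy - 1, 1:], np.inf]
		cost[yy] += np.minimum(np.minimum(left, cost[yy - 1]), right)
	# Backtrack from the cheapest bottom-row pixel
	seam = [int(np.argmin(cost[-1]))]
	for yy in range(h - 2, -1, -1):
		xx = seam[-1]
		lo, hi = max(xx - 1, 0), min(xx + 2, w)
		seam.append(lo + int(np.argmin(cost[yy, lo:hi])))
	return seam[::-1]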

Pyramids

1. Gaussian and Laplacian pyramids
2. Use of pyramids to encode the frequency domain
3. Compute a Laplacian pyramid from a Gaussian pyramid
4. Blend two images using pyramids

To avoid seams: window = size of largest prominent "feature".
To avoid ghosting: window <= 2 * size of smallest prominent "feature".

Natural to cast this in the Fourier domain:
– largest frequency <= 2 * size of smallest frequency
– image frequency content should occupy one octave (power of two)

If the frequency spread needs to be modeled:
– compute FFT(Il) => Fl, FFT(Ir) => Fr
– decompose each Fourier image into octaves (bands)
– "feather" the corresponding octaves of Fl and Fr (equivalently, compute the inverse FFT and feather in the spatial domain)
– sum the feathered octave images in the frequency domain

Pyramid Representation: A Gaussian Pyramid
a = 0.3 – 0.6 (typically 0.38)
h = wh * wv (a separable kernel)
gk = h * g(k-1), i.e. gk = REDUCE(g(k-1))

Lk = gk – EXPAND(g(k+1))
The Laplacian pyramid is a series of "error" images, each the difference between two levels of a Gaussian pyramid.
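A minimal construction with OpenCV's built-in REDUCE/EXPAND (cv2.pyrDown / cv2.pyrUp; the filename reuses einstein.png from the panorama snippet). Pyramid blending would feather each Laplacian level separately and then collapse the result.

import numpy as np
import cv2

img = cv2.imread("einstein.png")

# Gaussian pyramid: g_k = REDUCE(g_{k-1})
gauss = [img]
for k in range(4):
	gauss.append(cv2.pyrDown(gauss[-1]))

# Laplacian pyramid: L_k = g_k - EXPAND(g_{k+1}), a series of "error" images
lap = []
for k in range(len(gauss) - 1):
	expanded = cv2.pyrUp(gauss[k + 1], dstsize=(gauss[k].shape[1], gauss[k].shape[0]))
	lap.append(np.int16(gauss[k]) - np.int16(expanded))  # keep signed values
lap.append(np.int16(gauss[-1]))  # the coarsest Gaussian level caps the pyramid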

Blending images

1. Merging two images
2. Window size used for merging images

Combine, merge, blend images
Cross-fading two images

Factors for the optimal window size:
– to avoid seams: window = size of largest prominent "feature"
– to avoid ghosting: window <= 2 * size of smallest prominent "feature"
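A minimal cross-fade sketch (assumes the two images are the same size; filenames reused from the panorama snippet): the blend weight ramps linearly from one image to the other across the full width.

import numpy as np
import cv2

left = np.float64(cv2.imread("einstein.png"))
right = np.float64(cv2.imread("davinci.png"))  # assumed same size as left

# Weight ramps 1 -> 0 across the columns; broadcast over rows and channels
alpha = np.linspace(1.0, 0.0, left.shape[1])[None, :, None]
blend = left * alpha + right * (1.0 - alpha)
cv2.imshow("Cross-fade", np.uint8(blend))
cv2.waitKey(0)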