Principal component analysis (PCA) is a widely used dimensionality-reduction technique that assumes the input data can be represented as a collection of fixed-length vectors. Many real-world datasets, such as those constructed from Internet photo collections, do not satisfy this assumption. A natural way to address this problem is to first coerce all input data to a fixed size and then apply standard PCA. This approach is problematic because it introduces artifacts when an image must be upsampled and loses information when an image must be downsampled. We propose two algorithms for this setting: one based on estimating the covariance matrix and the other based on expectation-maximization (EM).
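To make the resize-then-PCA baseline concrete, the following is a minimal sketch of that naive approach only, not of either proposed algorithm. It assumes grayscale images stored as 2-D NumPy arrays; the helper names `resize_nearest` and `baseline_pca`, the 8x8 target size, and the random stand-in data are all hypothetical choices made for illustration.

```python
import numpy as np

def resize_nearest(img, size):
    """Coerce a 2-D array to (size, size) by nearest-neighbour sampling.

    Upsampling repeats pixels (artifacts); downsampling skips pixels
    (information loss). Both effects are the drawbacks noted above.
    """
    h, w = img.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return img[np.ix_(rows, cols)]

def baseline_pca(images, size=8, k=2):
    """Resize every image to size x size, flatten, then run standard PCA via SVD."""
    X = np.stack([resize_nearest(im, size).ravel() for im in images]).astype(float)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data matrix.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k], s[:k]

rng = np.random.default_rng(0)
# Hypothetical stand-in for a photo collection: images of varying height and width.
images = [
    rng.random((int(rng.integers(6, 20)), int(rng.integers(6, 20))))
    for _ in range(10)
]
mean, components, singular_values = baseline_pca(images)
```

In this sketch every image smaller than 8x8 along a dimension has pixels duplicated, and every larger image has pixels discarded, before PCA ever sees the data.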