Vision is one of the most important sensing modalities in nature because of the rich, detailed information it provides about the environment. Vision sensing comes in different flavors, ranging from human vision, where images are perspective views that follow the pinhole model, to insect vision, where compound eyes built from ommatidia acquire multiview images of nearby objects, a design highly effective for living and navigating in fast-changing 3D environments. Recent technological advances make it possible to mimic this natural multiview vision using plenoptic cameras. This thesis addresses plenoptic vision for cameras that combine a single high-definition imaging sensor, a microlens array, and a main lens.

The plenoptic camera does not follow the pinhole model that is broadly used in computer vision to describe projection in conventional cameras, which mimic the human eye. Instead, it can be understood as a human eye whose retina has been replaced by a compound eye, so its geometry and depth perception deviate from what is taught in classical 3D computer vision. This thesis takes the constructive approach of leveraging classical projection models to represent plenoptic cameras as camera arrays, which are familiar and intuitive to the average practitioner. State-of-the-art calibration tools for plenoptic cameras are built on the proposed representation, and new functionalities are added, such as estimating disparities with differential operators.
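For reference, the pinhole model in question projects a homogeneous 3D point $\mathbf{X} = (X, Y, Z, 1)^{\top}$ onto pixel coordinates $\mathbf{x} = (u, v, 1)^{\top}$ through a single center of projection,

$$ \lambda\,\mathbf{x} = K\,[\,R \mid \mathbf{t}\,]\,\mathbf{X}, $$

where $K$ collects the intrinsic parameters, $(R, \mathbf{t})$ the camera pose, and $\lambda$ is a projective scale. The plenoptic camera breaks the single-center-of-projection assumption behind this equation, which is precisely what motivates representing it as an array of such pinhole cameras, one per viewpoint.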

The contributions of this work comprise (i) models that describe both the standard and multifocus designs of the plenoptic camera in a common framework, (ii) a seminal study that analyzes the depth reconstruction capabilities of the standard plenoptic camera, (iii) new calibration methods that build on the proposed representation of the plenoptic camera as a camera array to estimate the calibration parameters in a linear, intuitive manner, and (iv) improvements to existing single-image reconstruction methods based on intrinsic depth cues and on the concept of the affine Lightfield (LF).
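As a concrete illustration of the differential disparity estimation mentioned above, the following is a minimal sketch rather than the thesis implementation: a first-order estimate between two horizontally neighboring sub-aperture views, assuming brightness constancy and sub-pixel disparities. The function name, window size, and use of NumPy/SciPy are illustrative choices.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def disparity_differential(view_left, view_right, win=7, eps=1e-6):
        # First-order disparity between two horizontally neighboring
        # sub-aperture views. Brightness constancy with a sub-pixel
        # shift d gives view_right(x) ~ view_left(x) + d * dI/dx,
        # so d follows from least squares over a local window:
        # d = sum(Ix * It) / sum(Ix^2).
        I1 = view_left.astype(float)
        I2 = view_right.astype(float)
        Ix = np.gradient(I1, axis=1)   # horizontal differential operator
        It = I2 - I1                   # difference between the two views
        num = uniform_filter(Ix * It, size=win)
        den = uniform_filter(Ix * Ix, size=win) + eps  # eps avoids division by zero
        return num / den               # disparity map in pixels

    # Toy check: a sinusoidal pattern shifted by 0.5 px should give ~0.5.
    x = np.linspace(0.0, 8.0 * np.pi, 256)
    left = np.tile(np.sin(x), (64, 1))
    right = np.tile(np.sin(x + 0.5 * (x[1] - x[0])), (64, 1))
    print(disparity_differential(left, right)[32, 64:192].mean())

A real pipeline would aggregate this cue over all views of the array and handle occlusions and larger baselines; the sketch only conveys the core idea of recovering disparity from image derivatives instead of explicit matching.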