21 Jul Camera Simulation for Computer Vision
As many of you who follow FiveFocal know, we recently attended CVPR, which some consider to be the top computer vision conference in the world. This article describes our observations on how camera simulation can help improve computer vision algorithm development, based on our interactions with people from all aspects of the computer vision field. As a little background, our work has historically focused on system-level design, meaning that we design full camera systems, including the source, lens, sensor, and algorithms, to achieve an often ambiguous imaging system goal.
Now onto our observations:
When algorithms are released into the world, they frequently fail in unanticipated ways
Different conditions cause failures, whether it is the light level or lighting non-uniformity, the scene content or scene motion, or that the lens distortion was different than anticipated.
We have found that camera simulation can help you understand the conditions under which your algorithm will fail. Whether the cause is the lighting spectrum, the frequency of the object’s texture, or the lens quality, you can parametrically run through the various conditions and identify failures earlier in the development process.
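As a rough illustration of such a parametric sweep, here is a minimal Python sketch. Everything in it is an assumption made for the example and not part of our simulation tools: the toy one-dimensional camera model (`simulate_edge`), the stand-in edge detector (`detect_edge`), and all parameter values are hypothetical. It only shows the pattern of stepping through conditions and recording where the algorithm fails.

```python
import itertools
import random

N_PIXELS = 64            # hypothetical 1-D sensor width
EDGE_INDEX = N_PIXELS // 2  # ground-truth edge location

def simulate_edge(light_level, blur_width, read_noise, rng):
    """Toy 1-D camera: step edge, box blur (lens), additive Gaussian noise."""
    ideal = [light_level * (1.0 if i >= EDGE_INDEX else 0.1)
             for i in range(N_PIXELS)]
    half = blur_width // 2
    blurred = []
    for i in range(N_PIXELS):
        window = ideal[max(0, i - half): min(N_PIXELS, i + half + 1)]
        blurred.append(sum(window) / len(window))
    return [v + rng.gauss(0.0, read_noise) for v in blurred]

def detect_edge(signal):
    """Stand-in algorithm under test: index of the largest positive gradient."""
    grads = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    return grads.index(max(grads)) + 1

def sweep(light_levels, blur_widths, read_noise=2.0, trials=50, tol=2, seed=0):
    """Return {(light_level, blur_width): failure_rate} over the parameter grid."""
    rng = random.Random(seed)  # fixed seed so the sweep is repeatable
    results = {}
    for light, blur in itertools.product(light_levels, blur_widths):
        failures = 0
        for _ in range(trials):
            signal = simulate_edge(light, blur, read_noise, rng)
            # A trial fails when the detected edge is too far from ground truth.
            if abs(detect_edge(signal) - EDGE_INDEX) > tol:
                failures += 1
        results[(light, blur)] = failures / trials
    return results

if __name__ == "__main__":
    table = sweep(light_levels=[1.0, 10.0, 100.0, 1000.0],
                  blur_widths=[1, 5, 9])
    for (light, blur), rate in sorted(table.items()):
        print(f"light={light:7.1f}  blur={blur}  failure_rate={rate:.2f}")
```

Replacing the toy model with a real camera simulation and the detector with your own algorithm keeps the same loop structure: a grid of conditions in, a table of failure rates out.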
Datasets are invaluable for algorithm training and testing, but they rarely capture the important differences between types of cameras, differences that algorithms must handle to work well across different imaging hardware
We observed that algorithm developers frequently use fairly pristine datasets for their training and analysis, with simple transforms that expand the variety of the datasets (rotation, mirroring, Gaussian noise), but that performance limitations remain when the algorithms are used on lower-quality camera hardware, under variable lighting conditions, and on platforms that involve motion, as in handheld and robotic applications.
Camera simulation can be used to add realistic noise; accurate spatially, spectrally, and depth-varying lens blur; realistic motion blur based on the platform dynamics; and camera differences based on the manufacturer, manufacturing tolerances, sensor type, and the components’ fundamental specifications. Capturing all of these real effects helps improve a computer vision algorithm’s performance when it is used with real cameras.
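To make the contrast with plain Gaussian-noise augmentation concrete, the sketch below degrades a clean image with signal-dependent noise and motion blur. It is a heavily simplified model under stated assumptions, not our simulation pipeline: shot noise is approximated as Gaussian with variance equal to the mean electron count, motion blur is a horizontal box filter, and the names and defaults (`simulate_sensor`, `photons_at_white`, `read_noise_e`, `motion_px`) are hypothetical.

```python
import math
import random

def motion_blur_row(row, length):
    """Horizontal box blur approximating linear motion of `length` pixels."""
    if length <= 1:
        return [float(v) for v in row]
    out = []
    for i in range(len(row)):
        window = row[max(0, i - length + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def simulate_sensor(image, photons_at_white=1000.0, read_noise_e=3.0,
                    bits=8, motion_px=1, seed=0):
    """Degrade a clean image (rows of floats in [0, 1]) into integer codes.

    Assumed model: motion blur, then shot noise, then read noise,
    then quantization and clipping to the sensor bit depth.
    """
    rng = random.Random(seed)
    max_code = 2 ** bits - 1
    degraded = []
    for row in image:
        out_row = []
        for value in motion_blur_row(row, motion_px):
            electrons = value * photons_at_white
            # Shot noise: Gaussian approximation to Poisson (std = sqrt(mean)).
            electrons += rng.gauss(0.0, math.sqrt(max(electrons, 0.0)))
            # Read noise: signal-independent Gaussian, in electrons.
            electrons += rng.gauss(0.0, read_noise_e)
            code = round(electrons / photons_at_white * max_code)
            out_row.append(min(max(code, 0), max_code))
        degraded.append(out_row)
    return degraded

if __name__ == "__main__":
    # Clean horizontal ramp, 4 rows by 16 columns.
    clean = [[x / 15 for x in range(16)] for _ in range(4)]
    dim_and_shaky = simulate_sensor(clean, photons_at_white=50.0, motion_px=4)
    for row in dim_and_shaky:
        print(row)
```

Because the noise scales with the signal, lowering `photons_at_white` degrades dark and bright regions differently, which a fixed additive Gaussian term cannot reproduce.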
Ground truth is often not available or easy to determine for use cases like depth estimation, navigation, or autofocus
Knowing exactly how accurate an algorithm is based on the number of iterations, steps, illumination level and scene content is very helpful as it allows the developer to make the right tradeoffs for the best overall performance.
Using camera simulation with ground truth means you know exactly how far away your object is, or how fast it is moving, so you can quantify your system’s and algorithm’s accuracy.
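Once simulated ground truth is available, quantifying accuracy is mostly bookkeeping. The sketch below assumes depth estimates and ground-truth depths in meters, paired per sample and tagged with the condition they were simulated under. The function names (`depth_errors`, `errors_by_condition`) and the choice of metrics are ours for illustration only.

```python
import math

def depth_errors(estimated_m, truth_m):
    """Return MAE, RMSE, and mean absolute relative error for paired depths."""
    if len(estimated_m) != len(truth_m) or not truth_m:
        raise ValueError("need two non-empty sequences of equal length")
    if any(t <= 0 for t in truth_m):
        raise ValueError("ground-truth depths must be positive")
    diffs = [e - t for e, t in zip(estimated_m, truth_m)]
    n = len(diffs)
    return {
        "mae": sum(abs(d) for d in diffs) / n,
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
        "abs_rel": sum(abs(d) / t for d, t in zip(diffs, truth_m)) / n,
    }

def errors_by_condition(records):
    """Group (condition, estimated_m, truth_m) records and score each group."""
    grouped = {}
    for condition, estimate, truth in records:
        estimates, truths = grouped.setdefault(condition, ([], []))
        estimates.append(estimate)
        truths.append(truth)
    return {cond: depth_errors(est, tru) for cond, (est, tru) in grouped.items()}
```

Feeding it records grouped by illumination level or iteration count gives an error table per condition, which is the kind of information needed to weigh accuracy against processing cost.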
The video at the top of this post walks through some examples that use camera simulation to achieve these objectives:
- Understand where your algorithm will fail and use this information to improve your algorithm's robustness
- Generate better datasets based on camera-to-camera variability and augment existing datasets by changing their lighting conditions, lens quality, or sensor resolution
- Generate images and videos with ground truth for quantitatively assessing algorithm accuracy
If you have other observations about how camera simulation can improve computer vision algorithm development, post them below or email us at firstname.lastname@example.org. For more information about our camera simulation approach, visit us at www.fivefocal.com.