Monday, 25 August 2014

Human Bias - Visual Acuity versus Digital Image Resolution

When it comes to bird identification from digital images I believe there are five key quality parameters to consider, namely:-
RESOLUTION
FOCUS
EXPOSURE
COLOUR
DIGITAL ARTEFACTS

These properties are all intertwined in many different ways.  Here I approach the subject from the point of view of fine image detail.

Humans have a very sophisticated visual system.  Vision, it could be said, is our most prominent and acute sensory ability.  Firstly, we have reasonably acute eyesight, concentrated mainly in a very small part of the retina called the fovea centralis (or fovea).  The colour receptors of the eye (cones) are most densely packed in this small space.  Much like a digital camera, the visual acuity of the fovea is mainly a product of its large number and density of photoreceptors.  Birds of prey, which have a much greater visual acuity than us, have many times more photoreceptors making up their visual system, somewhat akin to having a camera with more megapixels.

Unlike most mammals, humans observe the world in full colour, thanks to the fact that most of us have three types of colour cone in our eyes.  Most mammals possess only two types of cone but, thanks to a genetic mutation, the ancestors of humans and related primates developed the ability to tell red from green in addition to blue.  The main evolutionary benefit, it seems, has been our ability to distinguish ripened fruits from unripened green fruit and foliage, giving our ancestors a competitive advantage over other fruit-foraging species.

The human eye is most sensitive to green light.  The digital image sensor, and colour film before it, have both attempted to mimic the human visual system.  The result from a digital imaging perspective is the Bayer Filter, which has twice as many green elements as red or blue ones.


The image above depicts the workings of a typical digital camera.  The Bayer Filter sits on top of the digital image receptors (photosites).  It works in much the same way as the cone cells of the human eye.  Just as a red cone cell in the eye responds mainly to red light, a red Bayer filter element will only allow red light through to the digital receptor.  Each photosite therefore equates to a single pixel of the equivalent Bayer filter colour, with a record of the light intensity hitting it.
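As a rough illustration of the idea, here is a minimal Python sketch of how a full-colour scene is reduced to one intensity value per photosite.  The RGGB layout and the names `bayer_colour` and `to_bayer_raw` are my own assumptions for the example; real sensors differ in layout and detail.

```python
# Illustrative sketch only: an assumed RGGB Bayer layout, not any
# particular camera's sensor.

def bayer_colour(row, col):
    """Return which filter colour ('R', 'G' or 'B') covers a photosite."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def to_bayer_raw(rgb_image):
    """Keep only the filtered channel at each photosite.

    rgb_image is a list of rows, each a list of (r, g, b) tuples.
    The result is a list of rows holding one intensity per photosite.
    """
    channel = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[channel[bayer_colour(r, c)]] for c, pixel in enumerate(row)]
        for r, row in enumerate(rgb_image)
    ]

# A 2x2 patch of uniform orange (r=200, g=120, b=40).
patch = [[(200, 120, 40)] * 2 for _ in range(2)]
raw = to_bayer_raw(patch)
print(raw)  # [[200, 120], [120, 40]]
```

Two thirds of the colour information in the patch is simply never recorded: each photosite keeps one number and discards the other two.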


Demosaicing



As the illustration above depicts, colour digital image formation using a Bayer Filter comes at a cost.  Because the initial "Bayer Raw" image consists of a mosaic of green, blue and red coloured pixels, the image must be processed to form a correctly-coloured digital image.  Called demosaicing, this process consists of an algorithm which interpolates the data from adjacent photosites (two green, a red and a blue) to create the full colour picture.

Interpolation involves averaging values, so two of the three colour values at every pixel are estimates rather than measurements, and this brings a significant amount of uncertainty.  Some camera manufacturers and raw image editing packages use more complicated algorithms to produce better results.
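To make the uncertainty concrete, here is a minimal sketch of simple neighbour-averaging (bilinear-style) demosaicing in plain Python.  It assumes an RGGB layout and an image of at least 2x2 photosites, and the function names are hypothetical; it is not the algorithm of any particular camera or raw converter.

```python
# Illustrative sketch only: neighbour-averaging demosaic on an assumed
# RGGB layout. Requires an image of at least 2x2 photosites.

def bayer_colour(row, col):
    """Return which filter colour ('R', 'G' or 'B') covers a photosite."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def demosaic_bilinear(raw):
    """Estimate (r, g, b) at every photosite of a Bayer raw image."""
    height, width = len(raw), len(raw[0])
    out = []
    for r in range(height):
        out_row = []
        for c in range(width):
            pixel = []
            for channel in "RGB":
                if bayer_colour(r, c) == channel:
                    # This channel was actually measured here.
                    pixel.append(float(raw[r][c]))
                    continue
                # Otherwise average the neighbours (3x3 window, clipped
                # at the image border) that did measure this channel.
                samples = [
                    raw[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, height))
                    for cc in range(max(c - 1, 0), min(c + 2, width))
                    if bayer_colour(rr, cc) == channel
                ]
                pixel.append(sum(samples) / len(samples))
            out_row.append(tuple(pixel))
        out.append(out_row)
    return out

# A sharp vertical edge in a grey scene: two black columns, two white.
raw = [[0, 0, 255, 255] for _ in range(4)]
result = demosaic_bilinear(raw)
print(result[0][1])  # (127.5, 0.0, 0.0)
```

The pixel just left of the edge should be pure black, but the estimate comes out dark red, because its red value was borrowed from a neighbour on the white side.  This is the kind of false colour that blind averaging can introduce along fine edges.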

HERE is a nice blog posting by Adam Hooper, explaining and illustrating the difference between two common demosaicing interpolation methods, Bilinear and Adaptive Homogeneity-Directed (AHD).  Basically, the bilinear method takes no account of the actual image content and simply, blindly averages neighbouring pixels.  A more sophisticated algorithm like AHD, on the other hand, follows lines and edges between patches of colour and tries to create better definition and less blurring of colour across patches.  Consequently AHD involves more processing, and is therefore slower in creating an image from RAW.

In the ideal world each photosite would work like a mini-spectrophotometer, capable of recording a complete spectral analysis of the light hitting it.  Imagine how big that image file would be, not to mind the sophisticated photosite technology required!


Human Raw Vision


When we start to look at the fine workings of a digital camera and processor it is all too easy to become critical about the loss of data and seemingly heavy processing that is going on.  But, before we get too carried away, let's compare what we have just seen with the workings of the human eye and brain.  If we could somehow zoom into the image that our eyes capture we would probably be no less critical.


The fovea is a tiny spot at the back of the retina, directly opposite the pupil of the eye.  It is packed with cone cells for acute colour vision but contains no rod cells (used for low light or night vision).  If you have ever gazed at a galaxy or comet in the night sky you will have noted that it is easier to observe if you focus on a spot slightly to the side of it.  This is because the cone cells in the fovea have relatively poor low light sensitivity.  By shifting the focus to the side of an object of interest, the image of the object is projected onto the periphery, which is rich in low light sensitive rods.  Suddenly, the object materialises, albeit frustratingly blurry and poorly defined.  When we try to centre our vision on the object, it again appears to vanish as the cones cannot register its low light.  As kids we all learnt how to find the blind spots in our eyes, where the optic nerve enters the eye.

In an earlier posting HERE I came up with a way to check one's foveal field of view using a neat scintillating pattern I had found online.  It is really amazing just how narrow and tunneled our focus actually is, and it is not too surprising that we often miss something that is literally right under our nose.  

If we think that the heavy processing going on in the camera is unpalatable, consider what the brain has to do to construct a full colour image from the light hitting such a complex arrangement of structures.  Almost every detail we consciously register comes from the cone cells in the fovea.  Our peripheral cones and rods are active by day as part of our peripheral vision.  Peripheral vision serves to widen our field of view, alerting us to movement and aiding our spatial awareness, but has little or no active or conscious role until after dark, when the rods come into their own as our sole method of vision.


Above I have compared what an image of a small, distant triangle might look like if captured exactly as it appears in life (left) with what a normal modern digital camera records (centre) and what an equivalent human retina might see (right).  The digital camera sensor consists of a regular grid of green, blue and red colour photosites.  The ratio of green to blue to red is 2:1:1, which reflects the eye's greater sensitivity to green light.  Unlike the photosites of the digital sensor, the cone cells in the retina are arranged at random and vary both in size and shape (surface area exposed to the light).  Even so, the digital image starts out not that dissimilar from a "raw" human visual image.

To the brain the triangle edge must have a very odd and ever-changing shape, as the image projected on to the back of the eyeball does not remain perfectly stationary (like a photograph) but instead moves about constantly in real time (like a video recording) as our head moves relative to the subject.  The brain must process this real time image and somehow make sense of it.

How much of what we see is real and how much is a construct of the human brain as it tries to fill in gaps?  Using human vision and struggling to make sense of a distant object is not much different from someone trying to make sense of a tiny fuzzy object in a digital image.  Both involve a high degree of uncertainty and there is probably a strong urge to let the brain fill in the missing bits!  On a visit to an optician the Snellen chart quickly reminds us of the limitations of our visual acuity.  What we need, I think, is an equivalent cue for digital image acuity.  With the Image Quality Tool I am advocating Pixel Resolution as one such cue, coupled with Image Focus or Sharpness and an awareness of Image Artefacts.  Together, hopefully, these parameters encourage the observer to stop before rushing towards a rash identification.


Acutance 


Acutance is an intriguing concept which again draws parallels between digital imaging and human vision.  If an image appears sharp our brain will happily accept it as being sharp.  Due to demosaicing, digital images start out slightly soft in appearance.  Unsharp Masking is very effective at increasing the acutance or apparent sharpness of photographs but, as these links highlight, the net effect is actually a loss of image data at the pixel level.  When attempting to make sense of small details in images it is best to start with the original raw image if available, not the final, possibly heavily sharpened image.
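For anyone curious about what sharpening does to pixel values, here is a minimal one-dimensional sketch in plain Python.  It follows the usual unsharp mask recipe (original plus a multiple of the difference between the original and a blurred copy), but I have swapped in a simple box blur where photo editors typically use a Gaussian blur, and the function names and parameters are my own.

```python
# Illustrative sketch only: 1-D unsharp mask using a box blur
# (an assumption; real tools normally use a Gaussian blur).

def box_blur(signal, radius=1):
    """Average each value with its neighbours within `radius`."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(i - radius, 0), min(i + radius + 1, len(signal))
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def unsharp_mask(signal, amount=1.0, radius=1):
    """Add back `amount` times the detail removed by the blur."""
    blurred = box_blur(signal, radius)
    return [
        min(255.0, max(0.0, s + amount * (s - b)))  # clip to 0..255
        for s, b in zip(signal, blurred)
    ]

# A soft edge from dark grey (50) to mid grey (150).
edge = [50, 50, 50, 100, 150, 150, 150]
sharpened = [round(v, 1) for v in unsharp_mask(edge)]
print(sharpened)  # [50.0, 50.0, 33.3, 100.0, 166.7, 150.0, 150.0]
```

The edge now looks crisper because the dark side has been pushed darker (33.3) and the bright side brighter (166.7), but neither value was recorded by the sensor.  Where the clipping to 0 or 255 kicks in, the original value cannot be recovered at all.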

It is the combination of image resolution and acutance that gives us image sharpness as neatly explained HERE.

The actual mechanism by which acutance works in photo-finishing appears very similar to the natural visual phenomena of Mach bands and the Cornsweet Illusion.



Moiré 

Moiré is an artefact associated with image resolution.  It can be produced wherever two regularly occurring patterns overlap.  One of these patterns may include the regular distribution of photosites making up the image sensor.  Another may be the repeating pattern of lines making up the computer screen image.  Another may be any regular pattern occurring in the digital image itself.  Lastly, moiré may be produced due to the repeating pattern within an image processing algorithm.  In bird images it occurs most commonly in the repeating pattern of flight feather fringes in the closed wing.  For more see HERE.


The top left image is of high resolution.  The images to the right of it are reduced in resolution to 25% and 12.5% of the original image size respectively.  At full crop there is no obvious difference between these three images on screen.  However, when zoomed up at roughly 20% crop the differences are obvious.  I have sharpened the images to enhance the moiré pattern.  The patterns in the 100% and 25% resolution images are much the same, consisting of a slight parallel moiré pattern in the primary and secondary fringes.  However, the added pixelation of the 12.5% resolution image adds an additional regular pattern and therefore an extra moiré pattern emerges.  The overall effect is a cross-hatch.
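The way a reduction in resolution can conjure up a new, coarser pattern can be sketched in one dimension.  The example below is a simplification under my own assumptions: a row of evenly spaced stripes stands in for the feather fringes, and the reduction simply keeps every fifth pixel with no smoothing, which is cruder than what image editors normally do.

```python
# Illustrative sketch only: a regular stripe pattern resampled on a
# coarser regular grid produces a false, wider-spaced pattern.

def stripes(length, period, width):
    """1 for bright pixels, 0 for dark, repeating every `period` pixels."""
    return [1 if i % period < width else 0 for i in range(length)]

def downsample(signal, step):
    """Keep every `step`-th pixel, with no smoothing."""
    return signal[::step]

def spacing(signal):
    """Gaps (in samples) between successive dark-to-bright transitions."""
    starts = [i for i in range(1, len(signal)) if signal[i] and not signal[i - 1]]
    return [b - a for a, b in zip(starts, starts[1:])]

step = 5
original = stripes(120, period=6, width=3)
reduced = downsample(original, step)

original_spacing = spacing(original)[0]
# Convert back to original pixel units so the two are comparable.
apparent_spacing = spacing(reduced)[0] * step

print(original_spacing)  # 6
print(apparent_spacing)  # 30
```

Stripes that really repeat every 6 pixels appear, after reduction, to repeat every 30.  The wider bands are a product of the two regular grids interacting, not a feature of the subject.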
