Researchers claim that the novel system, which is based on the mechanics of the human brain and its ability to distinguish between different parts of an image, represents human vision more accurately than anything before.
Potential applications range from robotics, multimedia communication and video surveillance to automated image editing and finding tumours in medical images. The system has been presented in the journal Neurocomputing.
The Multimedia Computing Research Group at Cardiff University is now planning to test the system by using it to help radiologists find lesions within medical images, with the overall goal of improving the speed, accuracy and sensitivity of medical diagnostics.
Being able to focus our attention is an important part of the human visual system, which allows humans to select and interpret the most relevant information in a particular scene.
Scientists all over the world have been using computer software to try to recreate this ability to pick out the most salient parts of an image, but until now with mixed success.
In the study, the team used a deep learning computer algorithm known as a convolutional neural network, which is designed to mimic the interconnected web of neurons in the human brain and is modelled specifically on the visual cortex.
This type of algorithm is well suited to taking images as input and assigning importance to various objects or aspects within them.
The team said they used a huge database of images, each of which had already been viewed by humans and assigned so-called ‘areas of interest’ using eye-tracking software.
These images were then fed into the algorithm and, through deep learning, the system gradually learned to predict accurately which parts of an image were most salient.
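For readers curious how eye-tracking data can become a training target, the sketch below shows one common approach: each recorded fixation point is spread into a smooth blob, and the blobs are summed into a map of where people looked. This is an illustration only, not the Cardiff team's code; the function name `fixation_map`, the `sigma` value and the sample fixations are all assumptions made for the example.

```python
import math

def fixation_map(fixations, width, height, sigma=1.5):
    """Turn (x, y) fixation points into a smooth map scaled to [0, 1].

    Hypothetical helper: sigma controls how far each fixation spreads.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                # Add a Gaussian bump centred on the fixation point.
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak == 0:
        return grid  # no fixations recorded: the map stays empty
    return [[v / peak for v in row] for row in grid]

# Made-up data: three fixations clustered near the top-left of an 8x6 image.
target = fixation_map([(1, 1), (2, 1), (1, 2)], width=8, height=6)
```

A network trained on maps like `target` is rewarded for assigning high values to the regions where viewers actually looked and low values elsewhere.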
Researchers said their system was tested against seven advanced visual saliency systems already in use, and was shown to be ‘superior on all metrics’.
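The article does not say which metrics were used. As an illustration, two measures often seen in saliency benchmarks are the correlation coefficient (CC) between a predicted map and a human-derived map, and the normalised scanpath saliency (NSS), which scores the prediction at the pixels people actually fixated. The sketch below assumes maps stored as lists of rows; the function names and sample values are made up for the example.

```python
import math

def _flatten(m):
    return [v for row in m for v in row]

def pearson_cc(pred, truth):
    """Pearson correlation between two maps of equal size."""
    p, t = _flatten(pred), _flatten(truth)
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    cov = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    sp = math.sqrt(sum((a - mp) ** 2 for a in p))
    st = math.sqrt(sum((b - mt) ** 2 for b in t))
    if sp == 0 or st == 0:
        return 0.0  # a flat map carries no information to correlate
    return cov / (sp * st)

def nss(pred, fixations):
    """Mean z-scored prediction value at the fixated (x, y) pixels."""
    if not fixations:
        raise ValueError("at least one fixation is required")
    p = _flatten(pred)
    mean = sum(p) / len(p)
    std = math.sqrt(sum((v - mean) ** 2 for v in p) / len(p))
    if std == 0:
        return 0.0
    return sum((pred[y][x] - mean) / std for x, y in fixations) / len(fixations)

# Made-up 2x2 example: the prediction peaks where the viewer looked.
pred = [[0.9, 0.1], [0.2, 0.0]]
truth = [[1.0, 0.0], [0.0, 0.0]]
cc_score = pearson_cc(pred, truth)
nss_score = nss(pred, [(0, 0)])
```

Higher values on both measures mean the predicted map agrees more closely with where people looked.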
“Being able to successfully predict where people look in natural images could unlock a wide range of applications from automatic target detection to robotics, image processing and medical diagnostics,” said Dr Hantao Liu, co-author of the study, from Cardiff University’s School of Computer Science and Informatics.
“Our code has been made freely available so that everyone can benefit from the research and find new ways of applying this technology to real world problems and applications.”