By Camille Rondeau Saint-Jean

The foes and floes of labelling aerial pictures manually

Updated: Mar 6

Labelling is at the core of the biologists' work at Whale Seeker, but at first glance it seems like a childishly simple task: does looking at a photo and pointing out the whales in it really require university degrees and years of experience? You might as well farm the task out to a CAPTCHA system, or use it to keep bored kids busy during their vacations!


Indeed, humans have a highly developed visual cortex, and from a very young age we can easily distinguish subtle variations in tone and texture, pick out shapes in a complex scene, and correctly interpret images. Thanks to the human brain's excellent ability to generalize, our eyes are rarely confused by factors such as a change in background or an object presented in a different position. For example, if you first see a picture of a Dalmatian lying in the grass, you will easily understand that it is the same dog when you then see it standing on a skateboard, but this scenario would probably mislead a computer vision algorithm unless it had been specifically trained for it. Furthermore, a person can focus on the information-rich parts of an image, learn which important details they can rely on, and then explain and discuss their choices.


So, are humans ideal image-labelling machines? No, precisely because they are not machines. Identifying a whale in a high-resolution close-up is easy, but extend the task to high-altitude photos of varying quality, cluttered with objects of similar shapes and colors, and human capacity reaches its limits. An aerial photo flattens several hundred meters into two dimensions, and an observer who is inattentive or insufficiently experienced can lose their sense of scale and even confuse whales with birds in flight!


Multiply these challenges by hundreds of thousands of images, and it will take days or weeks to get through them; fatigue and boredom set in, and emotional factors come into play. If the dataset contains few whales, observers will tend to mark uncertain sightings to avoid the impression of working for nothing. Conversely, if the images contain many groups of marine mammals, an observer can unconsciously leave out less obvious targets. As a result, it is very difficult to maintain consistency when labelling large datasets, especially if several observers are working simultaneously.


Möbius was developed by the Whale Seeker team to overcome these shortcomings of manual labelling. An algorithm alone will probably never reach the accuracy and reliability of a human observer, for reasons explained in a recent blog post. That is why a human-in-the-loop AI solution is the best way to combine the complementary strengths of humans and computers. The algorithm takes on the repetitive, time-consuming task of analyzing thousands of images, leveraging various detection and image segmentation techniques. It operates at superhuman speed, without ever getting tired, yields consistent and standardized results, and then passes the uncertain and more difficult cases to a flesh-and-blood expert. The human can then put all their concentration and energy into the cases that will benefit most from their expertise and judgment. Overall, analyzing images with Möbius is about 80% faster than manual annotation alone!
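To give a feel for the triage idea described above, here is a minimal, purely illustrative sketch in Python. The names, thresholds, and data structure are assumptions for the example, not Möbius internals: a detector assigns each candidate a confidence score, high-confidence detections are accepted automatically, clear negatives are discarded, and everything in between is queued for a human expert.

```python
# Illustrative human-in-the-loop triage: route each model detection to
# auto-accept, auto-reject, or human review based on its confidence score.
# All names and threshold values here are hypothetical.
from dataclasses import dataclass


@dataclass
class Detection:
    image_id: str
    label: str         # e.g. "whale"
    confidence: float  # model score in [0, 1]


def triage(detections, accept_above=0.9, reject_below=0.2):
    """Split detections into auto-accepted, auto-rejected, and review bins."""
    accepted, rejected, review = [], [], []
    for d in detections:
        if d.confidence >= accept_above:
            accepted.append(d)        # confident hit: no human needed
        elif d.confidence < reject_below:
            rejected.append(d)        # confident miss: discard
        else:
            review.append(d)          # uncertain: forward to an expert
    return accepted, rejected, review


batch = [
    Detection("img_001", "whale", 0.97),
    Detection("img_002", "whale", 0.55),
    Detection("img_003", "whale", 0.05),
]
auto_ok, auto_no, needs_human = triage(batch)
```

In this sketch only the middle detection reaches a human, which is the point of the approach: the expert's time is spent where their judgment adds the most value, while the bulk of clear cases is handled automatically.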


Following B Corp values, Whale Seeker offers good working conditions to its labellers, who are paid for their time rather than by the job, at a salary that takes the cost of living into account. This makes it possible to provide accurate and reliable data in record time while adhering to sound ethical principles. Using cutting-edge technology to preserve species and ecosystems, while providing interesting, gratifying work and a good quality of life for humans: that is our vision of good science for now and for the future!

