In the society we live in, it is striking how decisively technology is shaping the community, crossing very different fields in search of new solutions. In particular, recent technological developments are radically changing our “point of view,” shifting it toward the first-person perspective.
Think, for example, of the revolutionary Google Glass, still at the prototype stage, which opened the door to the best technological inventions of our time. Small cameras now make it possible to identify with the act of seeing itself. The purpose? To film whatever you are looking at and “enter” the object.
Or the cameras mounted on athletes’ helmets, which capture compelling footage and every small emotion of the rider.
We live in an age that wants to defy the odds and go beyond the limits of man.
The chance to experience these kinds of experiments firsthand came in Rome a few weeks ago at Maker Faire Rome 2013, one of the world’s most important fairs dedicated to the innovation and creativity of digital makers: robots, interactive bikes, 3D printers and living circuits, hundreds of creations from all over the world, there for us curious visitors.
Of all the exhibits, what struck us were the many presentations of machines able to “record” one’s gaze or movement on a screen.
One example: STIGglasses, “digital glasses that can record mental design activities.” STIGglasses are used to document mental design processes and to record on video everything the wearer is observing. Their procedure is intuitive, and although they look a bit retro, they really work.
Gaze Machine: beyond the human gaze
What captured our utmost attention was the Gaze Machine, a device designed to highlight the focal points of our eyes; not always a good thing, considering the exceptional cases when “it is better not to show where you are looking if there are pretty girls nearby,” as Makis Douskos, one of the designers, puts it.
Intrigued by the machine, we approached it and tried to understand how it really works, putting some questions to Zoe Fragoulopoulou, head of the project at ALCOR (Vision, Perception and Cognitive Robotics Lab), promoted by the University of Rome La Sapienza.
What was your motivation for starting the Gaze Machine?
The idea arose from the observation that humans, like many animals, do not observe scenes comprehensively but instead focus their attention on particular regions of their visual field.
This is an important aspect in the study of biological vision and a valuable source of information for the design of computer vision algorithms. The device allows us to collect attention data from different subjects in three-dimensional environments and in various scenarios, where they can move freely in space.
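To give a concrete flavor of how attention feeds into computer vision, here is a minimal sketch, entirely our own illustration and not ALCOR code, that extracts the most “attention-grabbing” regions of an image using a classic bottom-up saliency model shipped with OpenCV’s contrib package:

```python
# Minimal saliency sketch (illustrative only, not the ALCOR code).
# Requires opencv-contrib-python for the cv2.saliency module.
import cv2

def salient_regions(image_path, thresh=0.6):
    """Return a saliency map plus a binary mask of its hottest regions."""
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Spectral-residual saliency: a simple bottom-up attention model.
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, saliency_map = saliency.computeSaliency(image)
    if not ok:
        raise RuntimeError("saliency computation failed")

    # Keep only the regions a viewer's eye would likely fixate first.
    mask = (saliency_map >= thresh).astype("uint8") * 255
    return saliency_map, mask
```

A model like this predicts where attention might land; the Gaze Machine measures where it actually lands, which is exactly the kind of data needed to build better models.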
How did you go about putting it into practice?
The design of the Gaze Machine began more than six years ago, and several prototypes were built during this time. The project was carried out entirely in our laboratory, ALCOR. The main idea has always been the same: two color cameras covering the subject’s field of view, mounted on a structure worn on the head, plus an infrared camera pointing toward the eye to track the movements of the pupil.
What was your main goal?
The goal is to collect attention data not only on images, but in the three-dimensional environment. Our studies with the Gaze Machine allowed us to observe that there is a big difference in a person’s attentional behavior when working in two dimensions (images, screens, etc.) versus three. The structure of a three-dimensional scene, and the ability to move freely within it, also affect the behavior of the subject. The data we can acquire with the Gaze Machine allow us to understand these phenomena in greater depth and to design more accurate artificial attention models.
What exactly does the machine consist of?
In addition to the hardware described above, the Gaze Machine also includes several software components. First, there are the components that manage the cameras and synchronized image acquisition. Synchronization is a key element for obtaining consistent data.
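As an aside, here is roughly what synchronized acquisition can look like with off-the-shelf tools; a sketch of ours assuming three USB cameras and OpenCV, not the lab’s actual code:

```python
# Hypothetical multi-camera capture sketch (not ALCOR's implementation).
# grab() latches a frame on every device almost simultaneously;
# retrieve() then decodes each latched frame at leisure.
import time
import cv2

# Assumed device indices: 0 and 1 for the scene cameras, 2 for the eye camera.
cameras = [cv2.VideoCapture(i) for i in (0, 1, 2)]

def grab_synchronized_frames():
    """Latch one frame on every camera as close in time as possible."""
    timestamp = time.monotonic()
    for cam in cameras:
        cam.grab()          # fast: latch the current sensor frame
    frames = [cam.retrieve()[1] for cam in cameras]  # slower: decode
    return timestamp, frames
```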
Next, a software component designed specifically for robust pupil detection and tracking is used to compute the gaze direction. Through an initial calibration step and a machine-learning algorithm, it also gives us the position of the gaze on the scene images.
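To make the idea tangible, here is a toy version of those two pieces; our guess at a typical approach, not ALCOR’s algorithm: locate the pupil as the darkest blob in the infrared image, then map pupil coordinates to scene-image coordinates with a regression fitted on calibration points where the subject fixated known targets:

```python
# Toy pupil tracking and gaze calibration (illustrative, not ALCOR's code).
import cv2
import numpy as np

def find_pupil(ir_frame):
    """Return the (x, y) centre of the pupil: the largest dark blob."""
    gray = cv2.cvtColor(ir_frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)
    _, dark = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(dark, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)
    m = cv2.moments(pupil)
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]

def fit_calibration(pupil_pts, scene_pts):
    """Least-squares quadratic map from pupil (x, y) to scene (u, v)."""
    x, y = np.asarray(pupil_pts, dtype=float).T
    A = np.column_stack([x*x, y*y, x*y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(scene_pts, dtype=float),
                                 rcond=None)
    return coeffs                      # shape (6, 2)

def gaze_on_scene(pupil_xy, coeffs):
    """Project a tracked pupil position onto the scene image."""
    x, y = pupil_xy
    return np.array([x*x, y*y, x*y, x, y, 1.0]) @ coeffs
```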
Finally, another software component uses the scene images to build a three-dimensional map of the environment and, at the same time, localizes the Gaze Machine within the scene. This allows us to compute the gaze in the three-dimensional map of the environment. To make this overview of the GM as complete as possible, we recommend taking a look at the official video.
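That last step, projecting the gaze into the 3D map, can be pictured like this; a back-of-the-envelope sketch under our own simplifying assumptions (the real system runs full simultaneous mapping and localization): cast the gaze point as a ray from the estimated camera pose and pick the map point closest to that ray:

```python
# Sketch of 3-D gaze projection (our simplification, not the GM pipeline).
import numpy as np

def gaze_ray_world(u, v, K, R, t):
    """Origin and direction of the gaze ray in world coordinates.
    K: 3x3 camera intrinsics; R, t: world-to-camera rotation/translation."""
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray in camera frame
    d_world = R.T @ d_cam                             # rotate into world frame
    origin = -R.T @ t                                 # camera centre in world
    return origin, d_world / np.linalg.norm(d_world)

def gaze_target(origin, direction, map_points):
    """Map point nearest to the gaze ray = estimated 3-D fixation."""
    rel = map_points - origin                         # (N, 3)
    along = rel @ direction                           # projection on the ray
    perp = rel - np.outer(along, direction)           # perpendicular offset
    dist = np.linalg.norm(perp, axis=1)
    dist[along < 0] = np.inf                          # ignore points behind
    return map_points[np.argmin(dist)]
```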
This project is certainly innovative, but will it also be useful?
At this point, beyond studies of attention patterns, we also want to explore other application areas for the Gaze Machine. Some ideas, for example:
- neuropsychological rehabilitation of brain-damaged patients;
- use of the GM by people with disabilities, to help them with daily activities but also with activities related to creativity;
- use of the GM in marketing projects, to collect data on people’s behavior in various scenarios;
- use of the GM in augmented reality applications and video games.
To explore these and other areas where the GM can be very useful, we are currently organizing a redesign so that it can be disseminated, marketed, and made available to all potentially interested users.
It sounds like a very good prospect to me, especially the connection with the medical field and the marketing world. Do you have other plans beyond the GM?
Yes, of course. This is an experimental project, so the results are not certain. Our lab is only involved in the first phase of the project, which is “the exploration of tracking saccadic eye-movements whilst looking at media images of violence and trauma from her country,” in collaboration with Sonya Rademeyer, a South African artist. Specifically:
ALCOR hopes – by way of this inter-disciplinary exploration between perceptual science and visual art – to create new knowledge in this field. What is particularly promising for such projects is that we are in an era when science needs, more than ever, to communicate its findings in ways that reach across traditional disciplinary boundaries and artists are particularly receptive to the challenges of understanding and interpreting its insights. The result of the project will hopefully initiate many public conversations, which will bring diverse audiences together to discuss the interplay between computer science and art.
Below is our translation of Zoe’s explanation about the new project.
ALCOR and South African artist Sonya Rademeyer will collaborate on a new project that aims to explore how “empathic impacts” are embodied in perception. Using GM technology, the visual artist, who lives and works in South Africa, will explore and monitor her eye movements as she looks at media images of violence and trauma from her own country. The biological pattern of viewing images of violence will then be captured, documented, and translated into a new algorithmic model, and this new procedural code will in turn be used to create a new musical score for cello. From October 14 to 18, the artist was a guest of the ALCOR laboratory in order to begin work on the first stage of the project.
Marketing and the Gaze Machine?
At this point, two eyes are no longer enough. Man, accustomed to having everything, is no longer satisfied and needs something to enhance his capabilities. This is where special glasses come into play: an intelligent tool capable of presenting an “augmented reality” more complete and detailed than what the human eye can perceive.
As Zoe mentioned, analyzing this new approach in depth lets us grasp its real strength: harnessing cutting-edge technology for advertising and commercial purposes. Is this possible? Sure: through a “gaze tracking technique” implemented “in a device that is worn on the head, which can detect the gaze of the wearer, and which communicates with a server.” In the Gaze Machine one can see a useful foothold for a hypothetical marketing strategy. It might be interesting to distribute the machine to large brands, to figure out which advertising works and to “capture” where the human pupil looks.
In this way, one could predict the likely reaction to one’s advertising messages and design a marketing campaign capable of selling one’s products.
Let’s assume we own a large brand and want to launch a new product. How do we choose the right message or the most attractive image? Put several variants on the market and have a select group of people wear the glasses. We could then analyze where they turn their eyes most often… the image recorded the most times is the right one!
This process is not entirely new: though not yet widespread, it is called “pay per gaze,” a system that measures the success of an advertisement by verifying how long the user’s eyes stayed focused on it, supported by “wearable” technology. Selling a brand through traditional advertising channels is no longer so simple these days; what is needed is input that can attract as many users as possible, so good use of “visual sponsorships” can provide an extra gear and a shortcut.
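In concrete terms, the accounting behind such a system is simple. Here is a toy sketch of ours, purely illustrative since the real pay-per-gaze systems are not public: given a stream of gaze points and the screen regions occupied by each advertisement, tally how long each one was actually looked at:

```python
# Toy "pay per gaze" accounting (illustrative only; names are hypothetical).
from collections import defaultdict

# Hypothetical ad regions as (left, top, right, bottom) in pixels.
ADS = {
    "ad_sneakers": (0, 0, 400, 300),
    "ad_perfume": (400, 0, 800, 300),
}

def dwell_times(gaze_samples, sample_period=1 / 30):
    """gaze_samples: iterable of (x, y) gaze points, one per video frame.
    Returns the seconds of gaze spent inside each ad region."""
    totals = defaultdict(float)
    for x, y in gaze_samples:
        for ad, (left, top, right, bottom) in ADS.items():
            if left <= x < right and top <= y < bottom:
                totals[ad] += sample_period
    return dict(totals)
```

The brand would then choose its creative, or pay, based on seconds of real attention rather than mere impressions.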
If technology gives you the ability to facilitate this mechanism…why not use it?