What is Predict attention AI?

This article explains how our attention AI behind Predict generates eye-tracking results with over 95% accuracy.

Predict is an AI tool built on our world-leading consumer neuroscience database. With data from real consumer responses, we produce highly accurate predictive models of attention and cognition.

To put it simply, Predict AI has seen enough examples of where consumers are looking in enough scenarios that it has learned to predict the distribution of attention on a given image or video frame.

This means that you can now upload images and videos to Predict and receive results nearly identical to those you would get from an eye-tracking study with roughly 100-150 participants using robust equipment. Predict generates these results in a tiny fraction of the time a real study would take.

Accuracy of Predict AI

The machine learning models were built on one of the world’s largest single databases of high-quality eye-tracking data. The winning model predicts eye-tracking results with over 95% accuracy.

Three main factors have enabled us to build an AI that achieves this:

  1. Quality eye-tracking
  2. Extensive database
  3. Machine learning


Using eye-tracking technology such as the Tobii Pro Glasses 2 and Pro Nano, we have been able to accurately track where a person's eyes are looking, sampling gaze position up to 60 times per second. Each data point (called a gaze point) contains eye-movement information such as X/Y coordinates, movement speed, and direction. These gaze points can be grouped into “fixations,” which indicate when a person has actually processed what their eyes are pointed at.
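To illustrate how gaze points can be grouped into fixations, here is a minimal sketch of a dispersion-threshold (I-DT) approach, a common fixation-detection algorithm. The article does not specify the algorithm or thresholds Neurons actually uses; the function name and all values below are purely illustrative.

```python
def detect_fixations(gaze, max_dispersion=0.05, min_samples=6):
    """Group (x, y) gaze points sampled at ~60 Hz into fixations.

    Illustrative I-DT sketch, not Neurons' actual method.
    min_samples=6 corresponds to ~100 ms at 60 Hz, a common minimum
    fixation duration. Returns a list of (centroid_x, centroid_y, n_points).
    """
    fixations = []
    i = 0
    while i + min_samples <= len(gaze):
        window = gaze[i:i + min_samples]
        xs = [p[0] for p in window]
        ys = [p[1] for p in window]
        # Dispersion = (max_x - min_x) + (max_y - min_y)
        if (max(xs) - min(xs)) + (max(ys) - min(ys)) <= max_dispersion:
            # Grow the window while dispersion stays under the threshold
            j = i + min_samples
            while j < len(gaze):
                xs.append(gaze[j][0]); ys.append(gaze[j][1])
                if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                    xs.pop(); ys.pop()
                    break
                j += 1
            fixations.append((sum(xs) / len(xs), sum(ys) / len(ys), len(xs)))
            i = j
        else:
            i += 1  # saccade sample: slide the window forward
    return fixations
```

For example, a stream of gaze points clustered at one location, a brief saccade, and a second cluster would yield two fixations, one centered on each cluster.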

These fixations can be visualized as heatmaps, familiar from eye-tracking research - and these heatmaps are what the Predict attention AI has been trained to infer from creative assets.
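To make the idea concrete, here is a minimal sketch of how fixations might be rendered into an attention heatmap by placing a duration-weighted 2-D Gaussian at each fixation point. The grid size and sigma are illustrative assumptions, not Neurons' actual rendering parameters.

```python
import math

def fixation_heatmap(fixations, width=32, height=32, sigma=2.0):
    """Render fixations as a normalized heatmap grid (illustrative sketch).

    fixations: list of (x, y, duration) with x, y in [0, 1].
    Returns a height x width grid of floats scaled so the peak is 1.0.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy, dur in fixations:
        cx, cy = fx * (width - 1), fy * (height - 1)
        for row in range(height):
            for col in range(width):
                # Gaussian falloff around the fixation, weighted by dwell time
                d2 = (col - cx) ** 2 + (row - cy) ** 2
                grid[row][col] += dur * math.exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid) or 1.0
    return [[v / peak for v in row] for row in grid]
```

A single central fixation produces a bright spot in the middle of the grid that fades toward the edges, which is exactly the "hot" region a heatmap overlay would show.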

Our in-house eye-tracking lab studies employ 30-35 participants, the gold standard for lab research. Furthermore, we carried out a global eye-tracking study across multiple countries in which 180 participants were tested. Statistical analyses show that attention heatmaps based on eye-tracking data from 30-35 participants match, with high fidelity, the attention heatmaps obtained from 100-150 participants, so we are confident in the representativeness of our typical sample size.

Neurons database

The AI is built on consumer eye-tracking data from Neurons’ research studies. The database is one of the world’s largest single databases of high-quality eye-tracking data, with well over 20,000 participants from around the globe exposed to consumer-related stimuli, such as watching ads on social media or TV, reading physical newspapers, or shopping in a store.

When designing and implementing the AI framework, we wanted to ensure that a wide range of stimuli was properly represented, so as not to introduce biases into the AI’s capabilities. We identified a wide range of industries (e.g., food & beverage, financial services, construction, telecom) and formats (e.g., print ads, commercials, web pages, e-commerce, products, packaging, retail, and apps) that consumers can be exposed to. We then initiated further research studies to collect data for any industries or formats that were underrepresented in our database.

The result is an extensive, high-quality, properly labeled eye-tracking database.

Machine learning

Using our database, we generated attention heatmaps from the eye-tracking data, then trained and compared almost 200 distinct machine learning models to produce the best possible prediction. For each model training run, one portion of the data was randomly selected for training and a second portion was held out for validation.
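The random train/validation split described above can be sketched as follows. The split ratio, seed, and function name are illustrative assumptions; the article does not state the proportions Neurons used.

```python
import random

def train_val_split(samples, val_fraction=0.2, seed=42):
    """Randomly partition samples into disjoint training and validation sets.

    Illustrative sketch: shuffle once with a fixed seed for
    reproducibility, then carve off the validation portion.
    """
    rng = random.Random(seed)
    shuffled = samples[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

Holding out a validation portion that the model never sees during training is what allows the roughly 200 candidate models to be compared fairly on unseen data.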


With enough examples of where consumers look, we have been able to create an AI that has learned to predict visual attention with high fidelity.

Through extensive testing and validation, we have shown that uploading an image to Predict gives you a heatmap that is statistically equivalent, with over 95% accuracy, to a heatmap built on actual eye-tracking data from showing 100-150 people the image for 5 seconds - and it takes only a few seconds.