Why parent object handling matters: The relations among social references, infant attention, and visual saliency






The ability to actively select and attend to target items in a visually cluttered environment is essential for effective communication and learning. Attentional selection involves two interrelated mechanisms: top-down and bottom-up processing. These have been described, respectively, in terms of event-related interests (i.e., social references) and the saliency features of stimuli, both of which account for the direction of eye gaze in various screen-based experimental paradigms. However, little is known about how parental referential cues and visual saliency shape an infant’s visual attention in the real world.

The present study focuses primarily on the impact of parent object handling on infant attention and aims to examine the mechanism underlying an infant’s visual selection by observing parent-infant object play. Its specific objectives are to (1) evaluate the relevance of parent object handling to visual saliency in the infant’s visual field; (2) examine developmental change in parent object handling across visits; and (3) investigate how well visual saliency predicts infant object looking in various referential contexts.

To achieve these goals, the present study recruited 15 parent-infant dyads with infants as young as six months old and conducted a follow-up visit five months later. Using head-mounted cameras with eye trackers, the study captured the infant’s visual exploration from an egocentric viewing perspective. Momentary attention allocation, the corresponding saliency estimate, and the co-occurring parental referential input were each processed and annotated frame by frame in the video data. The findings depict the dynamics of visual experience from the infant’s point of view and reinforce the supportive role of parent object handling in optimizing infant visual selection by providing dynamic multimodal input during interactions.



Infants, Visual attention, Eye tracking, Visual saliency, Multimodal input, Object play