As social robots move from research laboratories into everyday settings, they increasingly encounter users whose sensory expectations are shaped by different cultural worlds. This concept paper proposes a Five-Senses Framework for assessing nonverbal communication in multicultural human–robot interaction (HRI). Drawing on sensory anthropology and embodied cognition, we treat sensory perception as culturally mediated: what people see, hear, feel, smell, and taste in robot encounters is shaped by social learning. We combine Urakami & Seaborn’s five-senses taxonomy [1] with this anthropological perspective to produce an assessment framework: for each sensory channel, we identify what the robot expresses, what users are likely to perceive, and how cultural norms shape the interpretation of that perception. We discuss the framework in the context of multilingual South Africa, where everyday interactions routinely cross linguistic and cultural boundaries and where expectations about gaze, personal space, timing, and touch often differ in subtle but meaningful ways, making it a well-suited environment for assessing nonverbal behaviour in multicultural HRI. We argue that the framework offers a practical way to detect culturally mediated sensory misalignment, in which a robot’s nonverbal cues are perceived or interpreted differently from what was intended, and to address it early, before it undermines trust and engagement and impairs interaction.