Whether human perception of speech and song recruits integrated or dissociated neural systems is contentious. This issue is difficult to address directly because the two stimulus classes differ in their physical attributes. We therefore used a compelling illusion (Deutsch et al. 2011) in which acoustically identical auditory stimuli are perceived as either speech or song. The illusion was used in a functional MRI experiment to provide a direct, within-subject investigation of the brain regions involved in the perceptual transformation from speech into song, independent of the physical characteristics of the presented stimuli. An overall differential effect of perceiving song compared with perceiving speech was revealed in the right midposterior superior temporal sulcus / right middle temporal gyrus. Activity in a left frontotemporal network, previously implicated in higher-level cognitive analyses of music and speech, was found to co-vary with a behavioural measure of the subjective vividness of the illusion, and this effect was driven by the illusory transformation. These findings provide evidence that illusory song perception is instantiated by a network of brain regions that is predominantly shared with the speech perception network.