Abstract
This paper presents a novel method for reconstructing phase from a given amplitude spectrogram alone, combining a signal-processing-based approach with a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One popular phase reconstruction method is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often requires many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. To address these issues, we propose an architecture that stacks sub-blocks, each consisting of two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, so performance can be traded off against computational load according to the requirements of the application. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speech.
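For readers unfamiliar with GLA, the following is a minimal NumPy sketch of the classical algorithm only, not of the proposed DNN architecture. It alternates two fixed operations: projecting onto the set of consistent spectrograms (inverse STFT followed by STFT) and restoring the given amplitude while keeping the updated phase. The function names (`stft`, `istft`, `griffin_lim`), the Hann window, the random initial phase, and the iteration count are all illustrative assumptions rather than details taken from this paper.

```python
import numpy as np


def stft(x, win, hop):
    """Frame the signal, apply the window, and take a real FFT per frame."""
    n_fft = len(win)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, n_fft // 2 + 1)


def istft(X, win, hop):
    """Least-squares inverse STFT via windowed overlap-add."""
    n_fft = len(win)
    n_frames = X.shape[0]
    frames = np.fft.irfft(X, n=n_fft, axis=1) * win
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    for i in range(n_frames):
        x[i * hop:i * hop + n_fft] += frames[i]
        norm[i * hop:i * hop + n_fft] += win ** 2
    # Guard against division by zero where the window sum vanishes.
    return x / np.maximum(norm, 1e-8)


def griffin_lim(A, win, hop, n_iter=50, seed=0):
    """Estimate a signal whose STFT amplitude approximates A (assumed setup)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from uniformly random phase.
    X = A * np.exp(2j * np.pi * rng.random(A.shape))
    for _ in range(n_iter):
        # Projection onto consistent spectrograms (STFT of a real signal).
        Y = stft(istft(X, win, hop), win, hop)
        # Projection onto spectrograms with the given amplitude.
        X = A * np.exp(1j * np.angle(Y))
    return istft(X, win, hop)
```

A call such as `griffin_lim(np.abs(stft(signal, np.hanning(256), 64)), np.hanning(256), 64)` returns a time-domain estimate. Because the loop uses no prior knowledge of the target signal, it typically needs many iterations, which is the limitation the abstract describes.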