Source Separation by Skip-Connection Based on Transformer Architecture in the Time-Frequency Domain

Authors: Chih Hsien Lu; Jian-Jiun Ding
Date issued: 2023-01-01
Record date: 2024-02-29
Type: conference paper
ISBN: 9798350314694
DOI: 10.1109/ECICE59523.2023.10383059
Scopus ID: 2-s2.0-85184087421
Scopus URL: https://api.elsevier.com/content/abstract/scopus_id/85184087421
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/640009
Keywords: dual-path network | skip connection | source separation | time-frequency analysis | transformer
SDGs: SDG7

Abstract: Many end-to-end speech source separation models rely on the information of the input audio signal in the time domain. However, information in the frequency domain also plays an important role in audio processing. In this study, we modified the dual-path transformer network (DPT-Net) to use additional information from the time-frequency distribution. To the U-Net, we added skip connections between the encoder and the decoder acting on the time-frequency distribution. In the experiment, the modification produced a better result than other methods of similar size.