Learning to encode text as human-readable summaries using generative adversarial networks
Journal
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Pages
4187-4195
Date Issued
2018
Author(s)
Wang, Y.-S.
Abstract
Auto-encoders compress input data into a latent-space representation and reconstruct the original data from the representation. This latent representation is not easily interpreted by humans. In this paper, we propose training an auto-encoder that encodes input text into human-readable sentences, and unpaired abstractive summarization is thereby achieved. The auto-encoder is composed of a generator and a reconstructor. The generator encodes the input text into a shorter word sequence, and the reconstructor recovers the generator input from the generator output. To make the generator output human-readable, a discriminator restricts the output of the generator to resemble human-written sentences. By taking the generator output as the summary of the input text, abstractive summarization is achieved without document-summary pairs as training data. Promising results are shown on both English and Chinese corpora. © 2018 Association for Computational Linguistics
Other Subjects
Encoding (symbols); Learning systems; Network coding; Adversarial networks; Auto encoders; Chinese corpus; Human-readable; Input datas; Training data; Natural language processing systems
Type
conference paper