LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity

Yang C.-F; Fan W.-C; Yang F.-E; YU-CHIANG WANG

DC Field	Value	Language
dc.contributor.author	Yang C.-F	en_US
dc.contributor.author	Fan W.-C	en_US
dc.contributor.author	Yang F.-E	en_US
dc.contributor.author	YU-CHIANG WANG	en_US
dc.date.accessioned	2022-04-25T06:43:31Z	-
dc.date.available	2022-04-25T06:43:31Z	-
dc.date.issued	2021	-
dc.identifier.issn	10636919	-
dc.identifier.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85123204731&doi=10.1109%2fCVPR46437.2021.00373&partnerID=40&md5=3a79edced3a256def9f8d66bb169be6e	-
dc.identifier.uri	https://scholars.lib.ntu.edu.tw/handle/123456789/607368	-
dc.description.abstract	When translating text inputs into layouts or images, existing works typically require explicit descriptions of each object in a scene, including their spatial information or the associated relationships. To better exploit the text input, so that implicit objects or relationships can be properly inferred during layout generation, we propose a LayoutTransformer Network (LT-Net) in this paper. Given a scene-graph input, our LT-Net uniquely encodes the semantic features for exploiting their co-occurrences and implicit relationships. This allows one to manipulate conceptually diverse yet plausible layout outputs. Moreover, the decoder of our LT-Net translates the encoded contextual features into bounding boxes with self-supervised relation consistency preserved. By fitting their distributions to Gaussian mixture models, spatially-diverse layouts can be additionally produced by LT-Net. We conduct extensive experiments on the datasets of MS-COCO and Visual Genome, and confirm the effectiveness and plausibility of our LT-Net over recent layout generation models. Codes will be released at LayoutTransformer. © 2021 IEEE	-
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.subject	Computer vision	-
dc.subject	Bounding-box	-
dc.subject	Co-occurrence relationships	-
dc.subject	Contextual feature	-
dc.subject	Implicit relationships	-
dc.subject	Layout generations	-
dc.subject	Scene-graphs	-
dc.subject	Semantic features	-
dc.subject	Spatial diversity	-
dc.subject	Spatial informations	-
dc.subject	Text input	-
dc.subject	Semantics	-
dc.title	LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity	en_US
dc.type	conference paper	en
dc.identifier.doi	10.1109/CVPR46437.2021.00373	-
dc.identifier.scopus	2-s2.0-85123204731	-
dc.relation.pages	3731-3740	-
item.grantfulltext	none	-
item.openairecristype	http://purl.org/coar/resource_type/c_5794	-
item.openairetype	conference paper	-
item.fulltext	no fulltext	-
item.cerifentitytype	Publications	-
crisitem.author.dept	Electrical Engineering	-
crisitem.author.dept	Communication Engineering	-
crisitem.author.dept	FinTech Center	-
crisitem.author.dept	Center for Artificial Intelligence and Advanced Robotics	-
crisitem.author.orcid	0000-0002-2333-157X	-
crisitem.author.parentorg	College of Electrical Engineering and Computer Science	-
crisitem.author.parentorg	College of Electrical Engineering and Computer Science	-
crisitem.author.parentorg	Others: University-Level Research Centers	-
crisitem.author.parentorg	Others: University-Level Research Centers	-
Appears in Collections:	Electrical Engineering
