https://scholars.lib.ntu.edu.tw/handle/123456789/607368
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang C.-F | en_US |
dc.contributor.author | Fan W.-C | en_US |
dc.contributor.author | Yang F.-E | en_US |
dc.contributor.author | YU-CHIANG WANG | en_US |
dc.date.accessioned | 2022-04-25T06:43:31Z | - |
dc.date.available | 2022-04-25T06:43:31Z | - |
dc.date.issued | 2021 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85123204731&doi=10.1109%2fCVPR46437.2021.00373&partnerID=40&md5=3a79edced3a256def9f8d66bb169be6e | - |
dc.identifier.uri | https://scholars.lib.ntu.edu.tw/handle/123456789/607368 | - |
dc.description.abstract | When translating text inputs into layouts or images, existing works typically require explicit descriptions of each object in a scene, including their spatial information or the associated relationships. To better exploit the text input, so that implicit objects or relationships can be properly inferred during layout generation, we propose a LayoutTransformer Network (LT-Net) in this paper. Given a scene-graph input, our LT-Net uniquely encodes the semantic features for exploiting their co-occurrences and implicit relationships. This allows one to manipulate conceptually diverse yet plausible layout outputs. Moreover, the decoder of our LT-Net translates the encoded contextual features into bounding boxes with self-supervised relation consistency preserved. By fitting their distributions to Gaussian mixture models, spatially-diverse layouts can be additionally produced by LT-Net. We conduct extensive experiments on the datasets of MS-COCO and Visual Genome, and confirm the effectiveness and plausibility of our LT-Net over recent layout generation models. Codes will be released at LayoutTransformer. © 2021 IEEE | - |
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | - |
dc.subject | Computer vision | - |
dc.subject | Bounding-box | - |
dc.subject | Co-occurrence relationships | - |
dc.subject | Contextual feature | - |
dc.subject | Implicit relationships | - |
dc.subject | Layout generations | - |
dc.subject | Scene-graphs | - |
dc.subject | Semantic features | - |
dc.subject | Spatial diversity | - |
dc.subject | Spatial informations | - |
dc.subject | Text input | - |
dc.subject | Semantics | - |
dc.title | LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity | en_US |
dc.type | conference paper | en |
dc.identifier.doi | 10.1109/CVPR46437.2021.00373 | - |
dc.identifier.scopus | 2-s2.0-85123204731 | - |
dc.relation.pages | 3731-3740 | - |
item.grantfulltext | none | - |
item.openairecristype | http://purl.org/coar/resource_type/c_5794 | - |
item.openairetype | conference paper | - |
item.fulltext | no fulltext | - |
item.cerifentitytype | Publications | - |
crisitem.author.dept | Electrical Engineering | - |
crisitem.author.dept | Communication Engineering | - |
crisitem.author.dept | FinTech Center | - |
crisitem.author.dept | Center for Artificial Intelligence and Advanced Robotics | - |
crisitem.author.orcid | 0000-0002-2333-157X | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
Appears in Collections: | Department of Electrical Engineering (電機工程學系) |