Rapid urbanization poses social and environmental challenges. Addressing these issues effectively requires accurate, up-to-date 3D building models that can be obtained promptly and cost-effectively. Urban modeling is an interdisciplinary topic spanning computer vision, graphics, and photogrammetry. The demand for automated interpretation of scene geometry and semantics has surged due to applications such as autonomous navigation, augmented reality, smart cities, and digital twins. As a result, substantial research effort has been dedicated to urban scene modeling within the computer vision and graphics communities, as well as in photogrammetry, which has tackled urban modeling challenges for decades. This workshop is intended to bring researchers from these communities together. Through invited talks, spotlight presentations, a workshop challenge, and a poster session, it will increase interdisciplinary interaction and collaboration among photogrammetry, computer vision, and graphics. We also solicit original contributions in areas related to urban scene modeling.
May 31, 2024: The Building3D Challenge has concluded! 🎉
March 15, 2024: S23DR Challenge Competition Begins! 🎉
Feb 15, 2024: Building3D Challenge Competition Begins! 🎉
Feb 8, 2024: USM3D 2024 Call for Papers released! 🎉
Feb 7, 2024: CVPR Workshop list released! 🎉
Feb 3, 2024: We are live! 🎉
As part of this workshop, we are hosting the S23DR challenge. What's next after Structure from Motion? The objective of this competition is to facilitate the development of methods for transforming posed images (sometimes also called "oriented images") / SfM outputs into a structured geometric representation (a wireframe) from which semantically meaningful measurements can be extracted. In short: More Structured Structure from Motion.
In conjunction with the challenge, we release a new dataset: the HoHo Dataset. These data were gathered over the course of several years throughout the United States from a variety of smartphone and camera platforms. Each training sample/scene consists of a set of posed image features (segmentation, depth, etc.) and a sparse point cloud as input, and a sparse wireframe (a 3D embedded graph) with semantically tagged edges as the target. Additionally, a mesh with semantically tagged faces is provided for each scene during training. In order to preserve privacy, original images are not provided.
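To make the input/target structure concrete, below is a minimal sketch of how such a scene could be represented in code. The field names and shapes are illustrative assumptions, not the official HoHo Dataset schema.

```python
# A minimal sketch (field names and shapes are hypothetical, not the official
# HoHo schema) of the structures the task maps between: posed per-image
# features plus a sparse point cloud as input, and a semantically tagged
# wireframe as the target.
from dataclasses import dataclass
import numpy as np

@dataclass
class PosedImageFeatures:
    K: np.ndarray             # (3, 3) camera intrinsics
    R: np.ndarray             # (3, 3) rotation, world -> camera
    t: np.ndarray             # (3,)   translation, world -> camera
    segmentation: np.ndarray  # (H, W) per-pixel class ids
    depth: np.ndarray         # (H, W) depth estimate

@dataclass
class WireframeTarget:
    vertices: np.ndarray      # (V, 3) 3D vertex positions
    edges: np.ndarray         # (E, 2) vertex indices per edge
    edge_labels: list[str]    # length E, semantic tag per edge (e.g. "ridge", "eave")

@dataclass
class Scene:
    views: list[PosedImageFeatures]
    points: np.ndarray        # (N, 3) sparse SfM point cloud
    target: WireframeTarget   # only available for training scenes
```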
The winning submission will receive a cash prize provided by the workshop sponsor, and the selected finalists will be invited to present their research at the workshop. In order to be eligible for the prizes, teams must follow all the rules, provide a write-up detailing their solution in a submission to the workshop, in the form of an extended abstract (4 pages) or a full paper (8 pages), and release all artifacts (code, weights, etc.) required to generate the winning submission under a CC BY 4.0 license.
There is a $25,000 prize pool for this challenge.
Please see the Competition Rules for additional information.
We thank Hover Inc. for their generous sponsorship of this competition.
Date | Milestone |
---|---|
March 14, 2024 | Competition starts |
June 4, 2024 | Entry Deadline |
June 4, 2024 | Team Merging |
June 10, 2024 | Competition ends |
June 13, 2024 | Writeup Deadline |
As part of this workshop, we are hosting the Building3D challenge. Building3D is an urban-scale, publicly available dataset consisting of more than 160,000 buildings with corresponding point clouds, meshes, and wireframe models covering 16 cities in Estonia. For this challenge, approximately 36,000 buildings from the city of Tallinn are used as the training and testing dataset. Among them, we selected 6,000 relatively dense and structurally simple buildings as the Entry-level dataset. The wireframe model is composed of points and edges representing the shape and outline of the object. We require algorithms to take the original point cloud as input and regress the wireframe model. For the evaluation, mean precision and recall are used to measure the accuracy of both points and edges, and the overall offset of the model is calculated. In addition, the wireframe edit distance (WED) is used to evaluate the accuracy of the generated wireframe models.
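As a rough illustration of the point-accuracy part of the evaluation, the sketch below computes threshold-based vertex precision and recall between a predicted and a ground-truth wireframe. The matching rule and threshold are assumptions for illustration; the official evaluation code defines the exact protocol (including the edge metrics, offset, and WED).

```python
# A minimal sketch of threshold-based vertex precision/recall, one plausible
# reading of the point-accuracy metric described above; the official Building3D
# evaluation code and matching thresholds may differ.
import numpy as np

def vertex_precision_recall(pred: np.ndarray, gt: np.ndarray, thresh: float = 0.5):
    """pred: (P, 3) predicted vertices; gt: (G, 3) ground-truth vertices;
    thresh: matching distance in the same unit as the coordinates."""
    if len(pred) == 0 or len(gt) == 0:
        return 0.0, 0.0
    # Pairwise Euclidean distances between predicted and ground-truth vertices.
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)  # (P, G)
    precision = float(np.mean(d.min(axis=1) <= thresh))  # predictions near some GT vertex
    recall = float(np.mean(d.min(axis=0) <= thresh))      # GT vertices covered by a prediction
    return precision, recall

# Example usage with random points:
# p, r = vertex_precision_recall(np.random.rand(10, 3), np.random.rand(12, 3), thresh=0.1)
```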
The winning submission will receive a cash prize provided by the workshop sponsor, and the selected finalists will be invited to present their research at the workshop. To be eligible for a cash prize, teams must provide a write-up detailing their solution in a submission to the workshop, in the form of an extended abstract (4 pages) or a full paper (8 pages), as well as the code required to generate the winning submission under a CC BY 4.0 license.
We thank PopSmart Inc. for their generous sponsorship of this competition.
Date | Milestone |
---|---|
February 15, 2024 (Thu) | Competition starts |
May 31, 2024 (Fri) | Competition ends |
June 7, 2024 (Fri) | Notification to Participants |
June 11, 2024 (Tue) | Writeup Deadline |
Iro Armeni is an assistant professor at the Department of Civil and Environmental Engineering, Stanford University, leading the Gradient Spaces group. She is interested in interdisciplinary research spanning Architecture, Civil Engineering, and Machine Perception. Her focus is on developing quantitative and data-driven methods that learn from real-world visual data to generate, predict, and simulate new or renewed built environments that place the human in the center. Her goal is to create sustainable, inclusive, and adaptive built environments that can support our current and future physical and digital needs.
Konrad Schindler received a Ph.D. degree from Graz University of Technology (Austria) in 2003. He was a Photogrammetric Engineer in the private industry and held research positions at Graz University of Technology; Monash University (Melbourne, Australia) and ETH Zürich (Switzerland). He was appointed Assistant Professor of Image Understanding at TU Darmstadt (Germany) in 2009. Since 2010, he has been a tenured Professor of Photogrammetry and Remote Sensing at ETH Zürich. His research interests include remote sensing, photogrammetry, computer vision, and machine learning.
Thomas Funkhouser is the David M. Siegel Professor of Computer Science, Emeritus, at Princeton University. He received his B.S. from Stanford University, M.S. from UCLA, and Ph.D. from UC Berkeley. Prior to joining Princeton, he was a member of the technical staff at AT&T Bell Laboratories. Funkhouser's research focuses on computer graphics and computer vision, with interests in interactive graphics, acoustic modeling, multi-user systems, global illumination, and algorithms for managing large-scale 3D data. He was a principal developer of the UC Berkeley Architectural Walkthrough System, which achieved real-time visualization of building models with millions of polygons. Funkhouser has contributed to multiple SIGGRAPH conferences and received honors including the ACM SIGGRAPH Computer Graphics Achievement Award (2014), a Sloan Foundation Fellowship (1999), and a National Science Foundation CAREER Award (2000).
Noah Snavely is a Professor of Computer Science at Cornell Tech and a member of the Cornell Graphics and Vision Group. He also works at Google Research in NYC. His research interests are in computer vision and graphics, in particular 3D understanding and depiction of scenes from images. He is an ACM Fellow and the recipient of a PECASE, a Microsoft New Faculty Fellowship, an Alfred P. Sloan Fellowship, and the SIGGRAPH Significant New Researcher Award.
Hao (Richard) Zhang is a Distinguished Professor at Simon Fraser University and an Amazon Scholar. He earned his Ph.D. from the University of Toronto, and MMath and BMath degrees from the University of Waterloo. His research is in visual computing, with special interests in geometric modeling, shape analysis, 3D vision, and geometric deep learning. Awards won by Richard include a Canadian Human-Computer Communications Society Achievement Award in Graphics, a Google Faculty Award, an NSERC Discovery Accelerator Supplement Award, and an IEEE Fellowship. He and his students have won the CVPR 2020 Best Student Paper Award and Best Paper Awards at the Symposium on Geometry Processing 2008 and CAD/Graphics 2017. He has served as an editor-in-chief of Computer Graphics Forum and will be the Technical Papers Chair for SIGGRAPH 2025.
Yasutaka Furukawa, a principal scientist at Wayve and an associate professor of Computing Science at Simon Fraser University, has held positions as an assistant professor at Washington University in St. Louis, a software engineer at Google, and a post-doctoral research associate at UW, collaborating with Profs. Seitz, Curless, and Rick Szeliski. He earned his Ph.D. under Prof. Ponce at the University of Illinois at Urbana-Champaign. He has received the PAMI Longuet-Higgins prize (2020), CS-CAN: Outstanding Young CS Researcher Award (2018), NSF CAREER Award (2015), Best Student Paper Award at ECCV 2012, and multiple Google Faculty Research Awards (2018, 2017, 2016).
Name | Affiliation |
---|---|
Daniel G. Aliaga | Purdue University |
Daniela Cabiddu | CNR - Istituto di Matematica Applicata e Tecnologie Informatiche (IMATI) |
Yiping Chen | Sun Yat-Sen University |
Ian Endres | Hover Inc. |
Hongchao Fan | Norwegian University of Science and Technology |
Antoine Guédon | École des Ponts ParisTech |
Liu He | Purdue University |
Qingyong Hu | National University of Defense Technology |
Yuzhong Huang | Hover Inc. / USC Information Sciences Institute |
Loic Landrieu | École des Ponts ParisTech |
David Marx | CoreWeave, EleutherAI |
Philippos Mordohai | Stevens Institute of Technology |
Bisheng Yang | Wuhan University |
Liangliang Nan | Delft University of Technology |
Rongjun Qin | The Ohio State University |
Ziming Qui | NYU / Lowe's |
Gunho Sohn | York University |
Ioannis Stamos | City University of New York |
Gábor Sörös | Bell Labs |
George Vosselman | University of Twente |
Jun Wang | Nanjing University of Aeronautics and Astronautics |
Chenglu Wen | Xiamen University |
Michael Ying Yang | University of Bath |
Bo Yang | The Hong Kong Polytechnic University |
Wei Yao | The Hong Kong Polytechnic University |
Saurabh Prasad | University of Houston |
Jiju Poovvancheri | Saint Mary’s University |