Phoon K.-K.; Ching J.; JIAN-YE CHING
2022-03-22
2021
19908326
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85108567084&doi=10.6310%2fjog.202106_16%282%29.2&partnerID=40&md5=3d5cbcc8d9fb5cd5d936a45fe6da48b7
https://scholars.lib.ntu.edu.tw/handle/123456789/598502

Data-driven site characterization (DDSC) is defined as any site characterization methodology that relies solely on measured data, both site-specific data collected for the current project and existing data of any type collected from past stages of the same project or past projects at the same site, neighboring sites, or beyond. One key complication is that real data is “ugly”. A useful mnemonic is MUSIC-3X (Multivariate, Uncertain and Unique, Sparse, Incomplete, and potentially Corrupted, with “3X” denoting three-dimensional spatial variations). It is an open question whether DDSC can solve real-world subsurface mapping problems based on real-world MUSIC-3X data from routine projects with minimal ad hoc assumptions. The computational challenges are very significant, but some reasonable partial solutions have been obtained recently. One promising solution is Sparse Bayesian Learning (SBL). It is nearly data-driven and can handle a large-scale 3D problem without incurring excessive cost. It can handle only one type of field test data, but it is already useful for practice. A 3D SBL version will be made available in Rocscience’s Settle3 (three-dimensional soil settlement analysis) in the near future to generate subsurface maps based on cone penetration test data. The second solution is based on a variant of Gaussian Process Regression (GPR-MUSIC-3X). It can handle multiple types of field test data by learning the cross-correlation behavior among different soil parameters at a single site of interest.
GPR-MUSIC-3X can be enhanced to learn cross-correlation behaviors at multiple sites and thus bring information from “similar” sites in a larger generic database to bear on improving predictions at a single site. Both 3D SBL and GPR-MUSIC-3X are cross-validated using a 2D virtual ground and an actual 3D site in Texas. The hunt is on for a “holy grail” mapping approach that is fully data-driven, MUSIC-3X compliant, and able to exploit all available data, including data from similar sites. This is Project DeepGeo (inspired by DeepMind, which produced AlphaGo), which constitutes one major research effort in the emerging field of data-centric geotechnics. © 2021. All rights reserved.

data-centric geotechnics; Data-driven site characterization (DDSC); Gaussian process regression; MUSIC-3X; Sparse Bayesian Learning (SBL); Geophysical prospecting; Ground penetrating radar systems; Computational challenges; Cone penetration tests; Cross correlations; Site characterization; Sparse Bayesian learning (SBL); Spatial variations; Sub-surface mapping; Mapping; cone penetration test; correlation; database; Gaussian method; mapping; model validation; parameterization; prediction; regression analysis; three-dimensional modeling; Texas; United States

Project DeepGeo - Data-driven 3D Subsurface Mapping
journal article
10.6310/jog.202106_16(2).2
2-s2.0-85108567084
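
The abstract refers to Gaussian process regression as a basis for mapping from sparse measurements. As a rough illustration only, the sketch below applies plain single-output Gaussian process regression to a handful of depth readings using NumPy. It is not the GPR-MUSIC-3X method of the paper: the squared-exponential kernel, the hyperparameters (`sigma`, `scale`, `noise`), the function names, and the data values are all assumptions chosen for the example, and no cross-correlation between soil parameters is modeled.

```python
import numpy as np

def sq_exp_kernel(xa, xb, sigma=1.0, scale=2.0):
    """Squared-exponential covariance between two sets of 1D locations."""
    d = xa[:, None] - xb[None, :]
    return sigma**2 * np.exp(-0.5 * (d / scale) ** 2)

def gpr_predict(x_obs, y_obs, x_new, sigma=1.0, scale=2.0, noise=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    # Covariance of the noisy observations
    K = sq_exp_kernel(x_obs, x_obs, sigma, scale) + noise**2 * np.eye(len(x_obs))
    # Cross-covariance between prediction points and observations
    Ks = sq_exp_kernel(x_new, x_obs, sigma, scale)
    # Prior covariance at the prediction points
    Kss = sq_exp_kernel(x_new, x_new, sigma, scale)
    # Cholesky factorization for a stable solve of K^{-1} y
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Hypothetical sparse soundings: depth (m) and a detrended, normalized reading
depth_obs = np.array([1.0, 2.0, 4.0, 7.0, 8.0])
value_obs = np.array([0.2, 0.5, -0.1, -0.6, -0.4])
depth_new = np.linspace(0.0, 10.0, 21)

mean, std = gpr_predict(depth_obs, value_obs, depth_new)
```

The predictive standard deviation is small near the measured depths and grows toward the prior value away from them, which is the kind of uncertainty statement that makes sparse data usable for mapping. A real application would estimate the hyperparameters from data rather than fix them by hand.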