Application of UQ Principles to Calibration, Sensitivity, and Experimental Design

Application of UQ Principles to Calibration, Sensitivity, and Experimental Design
Omar Knio
Center for Material Genomics, Mechanical Engineering and Materials Science, Duke University
SRI Center for Uncertainty Quantification in Computational Science and Engineering, Applied Mathematics and Computational Science, King Abdullah University of Science and Technology

Acknowledgment
H.N. Najm, B.J. Debusschere, R.D. Berry, K. Sargsyan, C. Safta, K. Chowdhary, F. Rizzi, M. Khalil (Sandia National Laboratories, CA)
R.G. Ghanem (U. South. California, Los Angeles, CA)
O.P. Le Maître (CNRS, Paris, France)
Y.M. Marzouk (Mass. Inst. of Tech., Cambridge, MA)
Work supported by:
- DOE Office of Advanced Scientific Computing Research (ASCR), Scientific Discovery through Advanced Computing (SciDAC)
- Office of Naval Research
- Defense Threat Reduction Agency
- King Abdullah University of Science and Technology

Outline
- Introduction
- UQ Challenges in Materials Modeling
- Forward UQ / Surrogates
- Sensitivity Analysis
- Calibration?
- Examples?
- Optimal Experimental Design

Forward Problem
x → y = f(x) → y

Inverse and Forward Problems
[Diagram labels: Inputs, Parameters x; Computational Model y = f(x); Output, Predictions y; Measurement Model z = g(x); Data]

Inverse and Forward Problems
[Diagram labels: Inputs, Parameters x; Computational Model y = f(x); Output, Predictions y; Measurement Model z = g(x); Data z_d]
Data uncertainties lead to prediction uncertainties

Inverse and Forward Problems
[Diagram labels: Inputs, Parameters x; Computational Model y = f(x), with alternative models y = {f_1(x), f_2(x), ..., f_M(x)}; Output, Predictions y, compared with y_d; Measurement Model z = g(x); Data z_d]
Data and model uncertainties
Inverse & Forward UQ
Model validation & comparison, hypothesis testing

Uncertainty Quantification
UQ is the end-to-end estimation and analysis of uncertainty in
- models and their parameters
  - assimilation of experimental/observational data
  - model fitting and parameter estimation
- model predictions
  - forward propagation of parametric uncertainty to model outputs
Analysis, comparison and selection among alternate plausible models

Case for UQ
- Assessment of confidence in computational predictions
- Validation and comparison of scientific/engineering models
- Design optimization, decision support
- Use of computational predictions for decision support
- Assimilation of observational data and model construction
- Multiscale and multiphysics model coupling

Validation Challenge
Validation of a computational model
- Establish agreement between prediction of quantities of interest under given operating conditions and empirical observations
- Establishing model validity requires error bars on computational predictions
  - Disagreement without error bars cannot be used to conclude that a particular model is not valid
  - Disagreement within the range of uncertainty of the results can be due to parametric uncertainty

Sources of Uncertainty in Computational Models
- model structure
  - participating physical processes
  - governing equations
  - constitutive relations
- model parameters
  - transport and thermodynamic properties
  - constitutive relations, equations of state
  - source term rate parameters
- initial and boundary conditions
- geometry
- numerical errors
- bugs
- faults, data loss, silent errors

Forward propagation of parametric uncertainty
- Forward model: y = f(x)
- Local sensitivity analysis and error propagation is ok for:
  - small uncertainty
  - low degree of non-linearity in f(x)
- Non-probabilistic methods
  - Fuzzy logic
  - Evidence theory, Dempster-Shafer theory
  - Interval math
- Probabilistic methods (this is our focus)
  - Global sensitivity analysis
  - Probabilistic UQ methods
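The restriction of local error propagation to small uncertainty and weak non-linearity can be illustrated with a minimal sketch. The model `f`, the nominal point, and the input standard deviations below are hypothetical choices made only for illustration; the sketch compares the first-order estimate std(y) ≈ |f'(x0)| std(x) with a sampled reference.

```python
import math
import random

def f(x):
    # Hypothetical forward model, chosen only for illustration.
    return math.exp(0.5 * x)

def local_std(f, x0, sigma_x, h=1e-5):
    """First-order estimate: std(y) ~ |df/dx at x0| * std(x)."""
    dfdx = (f(x0 + h) - f(x0 - h)) / (2.0 * h)  # central finite difference
    return abs(dfdx) * sigma_x

def sampled_std(f, x0, sigma_x, n=20000, seed=0):
    """Reference estimate of std(y) from samples of x ~ N(x0, sigma_x^2)."""
    rng = random.Random(seed)
    ys = [f(rng.gauss(x0, sigma_x)) for _ in range(n)]
    mean = sum(ys) / n
    return math.sqrt(sum((y - mean) ** 2 for y in ys) / (n - 1))

# Small input uncertainty: the two estimates should nearly agree.
# Large input uncertainty: the linearization underestimates the spread.
for sigma_x in (0.05, 1.0):
    print(sigma_x, local_std(f, 1.0, sigma_x), sampled_std(f, 1.0, sigma_x))
```

For this particular model the gap between the two estimates grows with the input standard deviation, which is the regime where probabilistic methods become necessary.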

Probabilistic forward UQ
Represent uncertain quantities using probability theory
- Random sampling: Monte Carlo (MC), QMC, etc.
  - Generate random samples $\{x_i\}_{i=1}^{N}$ from the PDF of x, p(x)
  - Bin the corresponding $\{y_i\}$ to construct p(y)
  - Not feasible for computationally expensive f(x)
    - slow convergence of MC/QMC methods
    - very large N required for reliable estimates
- Build a cheap surrogate for f(x), then use MC
  - Collocation: interpolants
  - Regression: fitting
- Galerkin methods
  - Polynomial Chaos (PC)
  - Intrusive and non-intrusive PC methods
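The sampling recipe on this slide (draw samples of x, evaluate the model, bin the outputs) can be sketched as follows. The model `forward_model`, the standard normal input, and the bin count are assumptions made for illustration; they are not taken from the slides.

```python
import random

def forward_model(x):
    # Hypothetical stand-in for an expensive model f(x).
    return x * x

def propagate(f, draw_x, n, n_bins=20, seed=0):
    """Sample x, evaluate y = f(x), and bin the y values into a density estimate."""
    rng = random.Random(seed)
    ys = [f(draw_x(rng)) for _ in range(n)]
    lo, hi = min(ys), max(ys)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for y in ys:
        k = min(int((y - lo) / width), n_bins - 1)  # y == hi goes in the last bin
        counts[k] += 1
    density = [c / (n * width) for c in counts]  # normalized to integrate to 1
    mean_y = sum(ys) / n
    return density, width, mean_y

def draw_standard_normal(rng):
    # Assumed input distribution: x ~ N(0, 1).
    return rng.gauss(0.0, 1.0)

# The exact mean of y = x^2 for x ~ N(0, 1) is 1; the sampling error
# shrinks slowly as the number of samples N grows.
for n in (100, 10000, 100000):
    density, width, mean_y = propagate(forward_model, draw_standard_normal, n)
    print(n, abs(mean_y - 1.0))
```

Every sample costs one model evaluation, so when f(x) is expensive the large N needed here is what motivates replacing f with a cheap surrogate.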

Inverse UQ
Estimation of model/parametric uncertainty
- Expert opinion, data collection
- Regression analysis, fitting, parameter estimation
- Bayesian inference of uncertain models/parameters
Statistical inverse problem
- Bayesian framework for probability theory
- Bayes rule

Types of Uncertainty
- Reducible uncertainty
  - Variable has one particular value, but it is not known
  - Reducible: by taking more measurements, we can get to know the value of the variable better
  - Examples: the mass of the planet Neptune; wind speed at a particular location and time
- Irreducible uncertainty
  - Aleatory uncertainty
  - Intrinsic or inherent uncertainty: variable is random; different value each time it is observed
  - Irreducible: taking more measurements will not reduce uncertainty in the value of the variable
  - Examples: variability in manufactured part dimensions; wind speed at a particular location

Bayesian viewpoint
- Conception of probability as a degree of belief or certainty
  - Uncertain quantity → random variable/process
  - Encompasses both reducible (epistemic) and irreducible (aleatory) uncertainty
- Distinct from the frequentist viewpoint
  - Only aleatoric quantities represented with probability theory
  - Epistemic variables handled using non-probabilistic methods
- We will follow the Bayesian view:
  - Probability represents degree of belief
  - Any uncertain quantity can be represented using probability

Statistical inverse problem
- UQ in predictions requires knowledge of uncertainty in:
  - the model
  - model parameters, inputs
  These are available from prior knowledge and/or data
- Inverse problem: $g(\lambda) = y$
  - g(·): model
  - y: prediction observable, data
  - λ: model parameters

Challenges with inverse problem
- Inverse problem solution is difficult
  - $g^{-1}$ often non-local, non-causal
- Inverse problems are typically ill-posed:
  - No solution may match the data (existence)
  - Many solutions may match the data (uniqueness)
  - Ill-conditioning or lack of stability
    - Small changes in y can lead to large changes in λ
    - Sensitivity to noise
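The lack of stability noted on this slide can be seen in a very small example. The 2x2 linear model, the parameter values, and the size of the data perturbation below are hypothetical; the sketch only shows how a nearly singular forward map amplifies a small change in the data.

```python
def solve_2x2(a, y):
    """Solve the 2x2 linear system a * lam = y by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    lam0 = (y[0] * a[1][1] - a[0][1] * y[1]) / det
    lam1 = (a[0][0] * y[1] - a[1][0] * y[0]) / det
    return [lam0, lam1]

# Hypothetical, nearly singular linear forward model y = A * lam.
A = [[1.0, 1.0], [1.0, 1.0001]]
lam_true = [1.0, 1.0]

# Clean data, and data with one entry perturbed by 1e-3.
y_clean = [sum(a * l for a, l in zip(row, lam_true)) for row in A]
y_noisy = [y_clean[0], y_clean[1] + 1e-3]

lam_clean = solve_2x2(A, y_clean)
lam_noisy = solve_2x2(A, y_noisy)

print("recovered from clean data:", lam_clean)
print("recovered from noisy data:", lam_noisy)
```

The perturbation in the data is about four orders of magnitude smaller than the resulting change in the recovered parameters, which is why direct inversion of noisy data is unreliable here.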

Noise and ill conditioning
[Figure (Aster 2004): three panels titled "True Input", "Forward Model + 5% noise", and "Inverse Problem Solution"; horizontal axes: Time (s), 0 to 80; vertical axes include Acceleration.]

Deterministic methods
- Regularization and optimization (e.g. least-squares)
  - impose smoothness and positivity
- Issues:
  - Choice of regularization parameters
  - Regularization may lead to bias, inconsistency
  - Choice of optimal number of fit parameters
  - No general means for handling nuisance parameters
  - Estimation of uncertainty in inferred parameter values relies on assumed linearity of the model in the parameters

Bayes formula for parameter inference
Data model (fit model + noise): $y = f(\lambda) + \epsilon$
Bayes formula: $p(\lambda \mid y) = \dfrac{p(y \mid \lambda)\, p(\lambda)}{p(y)}$, that is, posterior = likelihood × prior / evidence
- Prior: knowledge of λ prior to data
- Likelihood: forward model and measurement noise
- Posterior: combines information from prior and data
- Evidence: normalizing constant for present context

Advantages of Bayesian methods
- Formal means of logical inference and machine learning
- Means of incorporation of prior knowledge/measurements and heterogeneous data
- Full probabilistic description of parameters
- General means of handling nuisance parameters through marginalization
- Means of identification of optimal model complexity
  - Only as much complexity as is required by the physics, and no more
  - Avoid fitting to noise

Prior
- Prior p(λ) comes from
  - Physical constraints
  - Prior data
  - Prior knowledge
- The prior can be uninformative
- It can be chosen to impose regularization
- Unknown aspects of the prior can be added to the rest of the parameters as hyperparameters

Prior modeling
- Informative prior
- (Mostly) uninformative prior
  - Improper prior
  - Objective prior
  - Maxent prior
  - Reference prior
  - Jeffreys prior
- The choice of prior can be crucial when there is little information in the data relative to the number of degrees of freedom in the inference problem
- When there is sufficient information in the data, the data can overrule the prior

Likelihood modeling I
- Where does probability enter the mapping λ → y in $p(y \mid \lambda)$?
- Through a presumed error model
- Example:
  - Model: $y_m = g(\lambda)$
  - Data: y
  - Error between data and model prediction: $y = g(\lambda) + \epsilon$
- Model this error as a random variable
- Example:
  - Error is due to instrument measurement noise
  - Instrument has Gaussian errors, with no bias: $\epsilon \sim N(0, \sigma^2)$

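Likelihood modeling II
- For any given λ, this implies $y \mid \lambda, \sigma \sim N(g(\lambda), \sigma^2)$, or
  $p(y \mid \lambda, \sigma) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\dfrac{(y - g(\lambda))^2}{2\sigma^2} \right)$
- Given N measurements $(y_1, y_2, \ldots, y_N)$, and presuming independent identically distributed (iid) noise,
  $y_i = g(\lambda) + \epsilon_i$, $\epsilon_i \sim N(0, \sigma^2)$,
  $L(\lambda) = p(y_1, y_2, \ldots, y_N \mid \lambda, \sigma) = \prod_{i=1}^{N} p(y_i \mid \lambda, \sigma)$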

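Likelihood modeling III
- It is useful to use the log-likelihood
  $\ln L(\lambda) = -\dfrac{1}{2} N \ln 2\pi - \dfrac{N}{2} \ln \sigma^2 - \dfrac{1}{2} \sum_{i=1}^{N} \left[ \dfrac{y_i - g(\lambda)}{\sigma} \right]^2$
- Frequently, signal noise amplitude is not constant, e.g. it varies with signal amplitude; then
  $\ln L(\lambda) = -\dfrac{1}{2} \sum_{i=1}^{N} \ln \sigma_i^2 - \dfrac{N}{2} \ln 2\pi - \dfrac{1}{2} \sum_{i=1}^{N} \left[ \dfrac{y_i - g(\lambda)}{\sigma_i} \right]^2$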
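A minimal sketch of the Gaussian log-likelihood from this slide is given below. The function name `log_likelihood`, the convention that `sigma` may be either one number or a list of per-point values, and the sample data are assumptions made for illustration. The model prediction is passed as a list so that the same function covers a prediction that differs from point to point.

```python
import math

def log_likelihood(y, g_pred, sigma):
    """Gaussian log-likelihood for y_i = g_i + eps_i, eps_i ~ N(0, sigma_i^2).

    y      : list of measurements
    g_pred : list of model predictions, one per measurement
    sigma  : one noise standard deviation, or a list of per-point values
    """
    n = len(y)
    sigmas = [sigma] * n if isinstance(sigma, (int, float)) else list(sigma)
    total = -0.5 * n * math.log(2.0 * math.pi)
    for yi, gi, si in zip(y, g_pred, sigmas):
        total -= 0.5 * math.log(si ** 2)          # noise amplitude term
        total -= 0.5 * ((yi - gi) / si) ** 2      # weighted misfit term
    return total

# Hypothetical data and two candidate predictions.
data = [1.1, 0.9, 1.05]
good_fit = log_likelihood(data, [1.0, 1.0, 1.0], 0.1)
poor_fit = log_likelihood(data, [2.0, 2.0, 2.0], 0.1)
print(good_fit, poor_fit)
```

Working with the logarithm avoids the underflow that the product of many small densities would cause, and it turns the product over measurements into a sum.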

Likelihood modeling IV
- This is frequently the core modeling challenge
  - Error model: a statistical model for the discrepancy between the forward model and the data
  - Composition of the error model with the forward model
- Error model composed of discrepancy between
  - data and the truth (data error)
  - model prediction and the truth (model error)
- Mean bias and correlated/uncorrelated noise structure
- Hierarchical Bayes modeling, and dependence trees: $p(\phi, \theta \mid D) = p(\phi \mid \theta, D)\, p(\theta \mid D)$
- Choice of observable: constraint on Quantity of Interest?

Experimental data
- Empirical data error model structure can be informed based on knowledge of the experimental apparatus
- Both bias and noise models are typically available from instrument calibration
- Noise PDF structure
  - A counting instrument would exhibit Poisson noise
  - A measurement combining many noise sources would exhibit Gaussian noise
- Noise correlation structure
  - Point measurement
  - Field measurement

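Posterior I
$p(\lambda \mid y) \propto p(y \mid \lambda)\, p(\lambda)$
Continuing the above iid Gaussian likelihood example, consider also an iid Gaussian prior on λ with $N(m, s^2)$:
$p(\lambda) = \dfrac{1}{\sqrt{2\pi}\, s} \exp\left( -\dfrac{(\lambda - m)^2}{2 s^2} \right)$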

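Posterior II
Then the posterior is
$p(\lambda \mid y) \propto \exp\left( -\dfrac{\| y - g(\lambda) \|^2}{2\sigma^2} \right) \exp\left( -\dfrac{\| \lambda - m \|^2}{2 s^2} \right)$
and the log posterior is
$\ln p(\lambda \mid y) = -\dfrac{\| y - g(\lambda) \|^2}{2\sigma^2} - \dfrac{\| \lambda - m \|^2}{2 s^2} + C$
Thus, the maximum a-posteriori (MAP) estimate of λ is equivalent to the solution of the regularized least-squares problem
$\arg\min_{\lambda} \left( \dfrac{\| y - g(\lambda) \|^2}{2\sigma^2} + \dfrac{\| \lambda - m \|^2}{2 s^2} \right)$
The prior plays the role of a regularizer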
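The equivalence between the MAP estimate and a regularized least-squares problem can be checked numerically in the simplest case. The sketch below assumes a scalar parameter observed directly, g(λ) = λ, with Gaussian noise of standard deviation `sigma` and a Gaussian prior N(m, s^2); the data values and the golden-section minimizer are illustrative choices, not part of the slides.

```python
import math

def neg_log_posterior(lam, y, sigma, m, s):
    """Regularized least-squares cost: data misfit plus prior penalty."""
    misfit = sum((yi - lam) ** 2 for yi in y) / (2.0 * sigma ** 2)
    penalty = (lam - m) ** 2 / (2.0 * s ** 2)
    return misfit + penalty

def map_closed_form(y, sigma, m, s):
    """Posterior mode for g(lam) = lam with Gaussian noise and Gaussian prior."""
    precision = len(y) / sigma ** 2 + 1.0 / s ** 2
    return (sum(y) / sigma ** 2 + m / s ** 2) / precision

def golden_section(cost, lo, hi, tol=1e-8):
    """Minimize a unimodal scalar function on the interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Hypothetical data, noise level, and prior.
y = [1.2, 0.8, 1.1]
sigma, m, s = 0.5, 0.0, 1.0

lam_numeric = golden_section(lambda lam: neg_log_posterior(lam, y, sigma, m, s), -10.0, 10.0)
lam_exact = map_closed_form(y, sigma, m, s)
print(lam_numeric, lam_exact, sum(y) / len(y))
```

The estimate falls between the prior mean and the sample mean of the data, which is the shrinkage a regularizer is expected to produce.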

Exploring the posterior
- Direct calculation generally not feasible, especially in a high number of dimensions
- Rely instead on a statistical approach, based on generating a large number of samples:
  - Efficiency is a problem, especially when model evaluations are expensive
  - Address later through use of cheap surrogates
- Overview of Markov Chain Monte Carlo
  - Illustration based on simple line fitting example
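A line-fitting illustration of Markov Chain Monte Carlo might look like the sketch below. It uses a plain random-walk Metropolis sampler, which is one common MCMC variant and not necessarily the one used in the talk. The synthetic data, the known noise level, the flat prior, the step size, and the burn-in length are all assumptions made for illustration.

```python
import math
import random

def log_posterior(theta, xs, ys, sigma):
    """Log posterior for y = a*x + b with Gaussian noise and a flat prior."""
    a, b = theta
    return -0.5 * sum(((y - (a * x + b)) / sigma) ** 2 for x, y in zip(xs, ys))

def metropolis(log_post, theta0, step, n_steps, rng):
    """Random-walk Metropolis: propose a Gaussian step, accept or reject."""
    theta = list(theta0)
    current = log_post(theta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_steps

# Synthetic data from a hypothetical line y = 2x + 1 with noise sigma = 0.1.
rng = random.Random(1)
sigma = 0.1
xs = [i / 19.0 for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, sigma) for x in xs]

chain, acceptance = metropolis(lambda th: log_posterior(th, xs, ys, sigma),
                               [0.0, 0.0], 0.05, 20000, rng)
samples = chain[5000:]  # discard burn-in
mean_a = sum(th[0] for th in samples) / len(samples)
mean_b = sum(th[1] for th in samples) / len(samples)
print(mean_a, mean_b, acceptance)
```

Each step costs one evaluation of the forward model, so for an expensive model the chain length needed here is what makes a cheap surrogate attractive.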

Remarks
- Always analyze:
  - behavior of chains
  - decay of autocorrelation
  - nuisance parameters
- If you happen to have a (good) surrogate:
  - can accelerate MCMC
  - can use alternative adjoint-like formalism (to estimate MLE, spread, hyperparameters)

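Adjoint based formalism
- Going back to Bayes rule:
  $p(\theta, \sigma^2 \mid T) \propto \left[ \prod_{i=1}^{N} \dfrac{1}{\sqrt{2\pi v_i^2}} \exp\left( -\dfrac{(M_i - T_i)^2}{2 v_i^2} \right) \right] p(\theta, \sigma^2)$
- Taking the negative logarithm:
  $L(\theta, \sigma^2) = \sum_{i=1}^{N} \left[ \dfrac{(M_i - T_i)^2}{2 v_i^2} + \dfrac{1}{2} \ln(2\pi v_i^2) \right] - \sum_{d=1}^{D} \ln p(\sigma_d^2) - \ln p(\alpha) - \ln p(v_{\max}) - \ln p(m)$
- Both adjoint and Hessian can be readily evaluated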

Adjoint based formalism II
- Parameters and hyperparameters can be found by minimizing the cost functional:
  $J(\theta, \sigma^2) = \dfrac{1}{2} (M - T)^T R^{-1} (M - T) + \dfrac{1}{2} \ln |R| + \ln |S|$
  where R is a diagonal observation error covariance matrix and S is a diagonal matrix with entries given by the variances

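Adjoint based formalism III
- Derivatives of the cost function take the form:
  - Gradient with respect to the parameters (adjoint): $\partial_{\theta} J(\theta, \sigma^2) = A^T R^{-1} (M - T)$
  - Gradient with respect to the variances, for d = 1, ..., D: $\partial_{\sigma_d^2} J(\theta, \sigma^2) = \dfrac{1}{2} (M - T)^T \dfrac{\partial R^{-1}}{\partial \sigma_d^2} (M - T) + \dfrac{1}{2} \mathrm{Tr}\left( R^{-1} \dfrac{\partial R}{\partial \sigma_d^2} \right) + \dfrac{\partial \ln |S|}{\partial \sigma_d^2}$
  - Hessian: $H(\theta, \sigma^2) = \begin{bmatrix} \partial^2_{\theta,\theta} J & \partial^2_{\theta,\sigma^2} J \\ \partial^2_{\sigma^2,\theta} J & \partial^2_{\sigma^2,\sigma^2} J \end{bmatrix}$
- Solution can be readily found using a line search algorithm
- Assuming a locally symmetric (Gaussian-like) distribution, the Hessian at the minimum provides an estimate of the spread of the optimal parameters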

UQ Challenges in Materials Modeling
- Multiscale
- Model error, mesh error
- Simulation cost
- Predictions often subject to noise
- Experimental data limitations, information loss, multiple data sources
- Coupled models