Optimal learning

Powell, Warren B.
Ryzhov, Ilya O.

95,99 € (VAT incl.)

Learn the science of collecting information to make effective decisions.

Everyday decisions are made without the benefit of accurate information. Optimal Learning develops the principles needed for gathering information to make decisions, especially when collecting information is time-consuming and expensive. Designed for readers with an elementary background in probability and statistics, the book presents effective and practical policies illustrated in a wide range of applications, from energy, homeland security, and transportation to engineering, health, and business.

This book covers the fundamental dimensions of a learning problem and presents a simple method for testing and comparing policies for learning. Special attention is given to the knowledge gradient policy and its use with a wide range of belief models, including lookup table and parametric models, for both online and offline problems. Three sections develop ideas with increasing levels of sophistication:

  • Fundamentals explores fundamental topics, including adaptive learning, ranking and selection, the knowledge gradient, and bandit problems
  • Extensions and Applications features coverage of linear belief models, subset selection models, scalar function optimization, optimal bidding, and stopping problems
  • Advanced Topics explores complex methods including simulation optimization, active learning in mathematical programming, and optimal continuous measurements

Each chapter identifies a specific learning problem, presents the related, practical algorithms for implementation, and concludes with numerous exercises. A related website features additional applications and downloadable software, including MATLAB® and the Optimal Learning Calculator, a spreadsheet-based package that provides an introduction to learning and a variety of policies for learning.
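The description highlights the knowledge gradient policy for deciding which alternative to measure next. As a rough illustration of the idea (not code from the book or its Optimal Learning Calculator), the sketch below computes the knowledge gradient for ranking and selection with independent normal beliefs and known measurement noise; the function name and example numbers are assumptions for illustration.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def knowledge_gradient(means, variances, noise_var):
    """Knowledge gradient score for each alternative under
    independent normal beliefs (posterior means/variances)
    and a common measurement-noise variance."""
    kg = []
    for x, (mu, var) in enumerate(zip(means, variances)):
        # Std. dev. of the change in the posterior mean from one measurement
        sigma_tilde = var / math.sqrt(var + noise_var)
        # Gap to the best competing alternative, scaled by sigma_tilde
        best_other = max(m for i, m in enumerate(means) if i != x)
        zeta = -abs(mu - best_other) / sigma_tilde
        # Expected one-step improvement in the best posterior mean
        kg.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return kg
```

A knowledge gradient policy then measures the alternative with the largest score, which tends to favor alternatives that are either close to the current best or still highly uncertain.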
CONTENTS:
Preface xv
Acknowledgments xix

1 The challenges of learning 1
  1.1 Learning the best path 2
  1.2 Areas of application 4
  1.3 Major problem classes 12
  1.4 The different types of learning 13
  1.5 Learning from different communities 16
  1.6 Information collection using decision trees 18
    1.6.1 A basic decision tree 18
    1.6.2 Decision tree for offline learning 20
    1.6.3 Decision tree for online learning 21
    1.6.4 Discussion 25
  1.7 Website and downloadable software 26
  1.8 Goals of this book 26
  Problems 28

2 Adaptive learning 31
  2.1 The frequentist view 32
  2.2 The Bayesian view 33
    2.2.1 The updating equations for independent beliefs 34
    2.2.2 The expected value of information 36
    2.2.3 Updating for correlated normal priors 38
    2.2.4 Bayesian updating with an uninformative prior 41
  2.3 Updating for non-Gaussian priors 42
    2.3.1 The gamma-exponential model 43
    2.3.2 The gamma-Poisson model 44
    2.3.3 The Pareto-uniform model 45
    2.3.4 Models for learning probabilities∗ 46
    2.3.5 Learning an unknown variance∗ 49
  2.4 Monte Carlo simulation 51
  2.5 Why does it work?∗ 54
    2.5.1 Derivation of σ̃ 54
    2.5.2 Derivation of Bayesian updating equations for independent beliefs 55
  2.6 Bibliographic notes 57
  Problems 57

3 The economics of information 61
  3.1 An elementary information problem 61
  3.2 The marginal value of information 65
  3.3 An information acquisition problem 68
  3.4 Bibliographic notes 70
  Problems 70

4 Ranking and selection 71
  4.1 The model 72
  4.2 Measurement policies 75
    4.2.1 Deterministic vs. sequential policies 75
    4.2.2 Optimal sequential policies 76
    4.2.3 Heuristic policies 77
  4.3 Evaluating policies 81
  4.4 More advanced topics∗ 83
    4.4.1 An alternative representation of the probability space 83
    4.4.2 Equivalence of using true means and sample estimates 84
  4.5 Bibliographic notes 85
  Problems 85

5 The knowledge gradient 89
  5.1 The knowledge gradient for independent beliefs 90
    5.1.1 Computation 91
    5.1.2 Some properties of the knowledge gradient 93
    5.1.3 The four distributions of learning 94
  5.2 The value of information and the S-curve effect 95
  5.3 Knowledge gradient for correlated beliefs 98
  5.4 The knowledge gradient for some non-Gaussian distributions 103
    5.4.1 The gamma-exponential model 104
    5.4.2 The gamma-Poisson model 107
    5.4.3 The Pareto-uniform model 108
    5.4.4 The beta-Bernoulli model 109
    5.4.5 Discussion 111
  5.5 Relatives of the knowledge gradient 112
    5.5.1 Expected improvement 113
    5.5.2 Linear loss∗ 114
  5.6 Other issues 116
    5.6.1 Anticipatory vs. experiential learning 117
    5.6.2 The problem of priors 118
    5.6.3 Discussion 120
  5.7 Why does it work?∗ 121
    5.7.1 Derivation of the knowledge gradient formula 121
  5.8 Bibliographic notes 125
  Problems 126

6 Bandit problems 139
  6.1 The theory and practice of Gittins indices 141
    6.1.1 Gittins indices in the beta-Bernoulli model 142
    6.1.2 Gittins indices in the normal-normal model 145
    6.1.3 Approximating Gittins indices 147
  6.2 Variations of bandit problems 148
  6.3 Upper confidence bounding 149
  6.4 The knowledge gradient for bandit problems 151
    6.4.1 The basic idea 151
    6.4.2 Some experimental comparisons 153
    6.4.3 Non-normal models 156
  6.5 Bibliographic notes 157
  Problems 157

7 Elements of a learning problem 163
  7.1 The states of our system 164
  7.2 Types of decisions 166
  7.3 Exogenous information 167
  7.4 Transition functions 168
  7.5 Objective functions 168
    7.5.1 Designing versus controlling 168
    7.5.2 Measurement costs 170
    7.5.3 Objectives 170
  7.6 Evaluating policies 175
  7.7 Discussion 177
  7.8 Bibliographic notes 178
  Problems 178

8 Linear belief models 181
  8.1 Applications 182
    8.1.1 Maximizing ad clicks 182
    8.1.2 Dynamic pricing 184
    8.1.3 Housing loans 184
    8.1.4 Optimizing dose response 185
  8.2 A brief review of linear regression 186
    8.2.1 The normal equations 186
    8.2.2 Recursive least squares 187
    8.2.3 A Bayesian interpretation 188
    8.2.4 Generating a prior 189
  8.3 The knowledge gradient for a linear model 191
  8.4 Application to drug discovery 192
  8.5 Application to dynamic pricing 196
  8.6 Bibliographic notes 200
  Problems 200

9 Subset selection problems 203
  9.1 Applications 205
  9.2 Choosing a subset using ranking and selection 206
    9.2.1 Setting prior means and variances 207
    9.2.2 Two strategies for setting prior covariances 208
  9.3 Larger sets 209
    9.3.1 Using simulation to reduce the problem size 210
    9.3.2 Computational issues 212
    9.3.3 Experiments 213
  9.4 Very large sets 214
  9.5 Bibliographic notes 216
  Problems 216

10 Optimizing a scalar function 219
  10.1 Deterministic measurements 219
  10.2 Stochastic measurements 223
    10.2.1 The model 223
    10.2.2 Finding the posterior distribution 224
    10.2.3 Choosing the measurement 226
    10.2.4 Discussion 229
  10.3 Bibliographic notes 229
  Problems 229

11 Optimal bidding 231
  11.1 Modeling customer demand 233
    11.1.1 Some valuation models 233
    11.1.2 The logit model 234
  11.2 Bayesian modeling for dynamic pricing 237
    11.2.1 A conjugate prior for choosing between two demand curves 237
    11.2.2 Moment matching for non-conjugate problems 239
    11.2.3 An approximation for the logit model 242
  11.3 Bidding strategies 244
    11.3.1 An idea from multi-armed bandits 245
    11.3.2 Bayes-greedy bidding 245
    11.3.3 Numerical illustrations 247
  11.4 Why does it work?∗ 251
    11.4.1 Moment matching for Pareto prior 251
    11.4.2 Approximating the logistic expectation 252
  11.5 Bibliographic notes 253
  Problems 254

12 Stopping problems 255
  12.1 Sequential probability ratio test 255
  12.2 The secretary problem 260
    12.2.1 Setup 261
    12.2.2 Solution 263
  12.3 Bibliographic notes 266
  Problems 266

13 Active learning in statistics 269
  13.1 Deterministic policies 270
  13.2 Sequential policies for classification 274
    13.2.1 Uncertainty sampling 274
    13.2.2 Query by committee 275
    13.2.3 Expected error reduction 276
  13.3 A variance minimizing policy 277
  13.4 Mixtures of Gaussians 279
    13.4.1 Estimating parameters 280
    13.4.2 Active learning 281
  13.5 Bibliographic notes 283

14 Simulation optimization 285
  14.1 Indifference zone selection 287
    14.1.1 Batch procedures 288
    14.1.2 Sequential procedures 290
    14.1.3 The 0-1 procedure: connection to linear loss 291
  14.2 Optimal computing budget allocation 292
    14.2.1 Indifference-zone version 293
    14.2.2 Linear loss version 294
    14.2.3 When does it work? 295
  14.3 Model-based simulated annealing 296
  14.4 Other areas of simulation optimization 298
  14.5 Bibliographic notes 299

15 Learning in mathematical programming 301
  15.1 Applications 303
    15.1.1 Piloting a hot air balloon 303
    15.1.2 Optimizing a portfolio 308
    15.1.3 Network problems 309
    15.1.4 Discussion 313
  15.2 Learning on graphs 313
  15.3 Alternative edge selection policies 316
  15.4 Learning costs for linear programs∗ 317
  15.5 Bibliographic notes 324

16 Optimizing over continuous measurements 325
  16.1 The belief model 327
    16.1.1 Updating equations 328
    16.1.2 Parameter estimation 330
  16.2 Sequential kriging optimization 332
  16.3 The knowledge gradient for continuous parameters∗ 334
    16.3.1 Maximizing the knowledge gradient 334
    16.3.2 Approximating the knowledge gradient 335
    16.3.3 The gradient of the knowledge gradient 336
    16.3.4 Maximizing the knowledge gradient 338
    16.3.5 The KGCP policy 339
  16.4 Efficient global optimization 340
  16.5 Experiments 341
  16.6 Extension to higher dimensional problems 342
  16.7 Bibliographic notes 343

17 Learning with a physical state 345
  17.1 Introduction to dynamic programming 347
    17.1.1 Approximate dynamic programming 348
    17.1.2 The exploration vs. exploitation problem 350
    17.1.3 Discussion 351
  17.2 Some heuristic learning policies 352
  17.3 The local bandit approximation 353
  17.4 The knowledge gradient in dynamic programming 355
    17.4.1 Generalized learning using basis functions 355
    17.4.2 The knowledge gradient 358
    17.4.3 Experiments 361
  17.5 An expected improvement policy 363
  17.6 Bibliographic notes 364

Index 379

  • ISBN: 978-0-470-59669-2
  • Publisher: John Wiley & Sons
  • Binding: Hardcover
  • Pages: 414
  • Publication date: 27/04/2012
  • Number of volumes: 1
  • Language: English