Showing 11 - 20 of 767
The ``evidence'' procedure for setting hyperparameters is essentially the same as the techniques of ML-II and generalized maximum likelihood. Unlike those older techniques, however, the evidence procedure has been justified (and used) as an approximation to the hierarchical Bayesian calculation....
Persistent link: https://www.econbiz.de/10005739998
Part I: Bayes Estimators and the Shannon Entropy. This paper is the first of two on the problem of estimating a function of a probability distribution from a finite set of samples of that distribution. In this paper a Bayesian analysis of this problem is presented, the optimal properties of the...
Persistent link: https://www.econbiz.de/10005623641
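A minimal sketch of the kind of Bayesian estimator the Part I abstract describes, under an assumed uniform Dirichlet prior over the unknown distribution (the prior choice and all numbers here are illustrative assumptions, not taken from the paper): sample distributions from the Dirichlet posterior given the observed counts and average the Shannon entropy over the draws, rather than plugging in the raw frequencies.

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability cells."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def bayes_entropy_estimate(counts, n_draws=20000):
    """Monte Carlo posterior-mean estimate of H under a Dirichlet(1,...,1) prior.

    The posterior over the unknown distribution is Dirichlet(counts + 1);
    we average the entropy over samples from that posterior.
    """
    posterior_draws = rng.dirichlet(np.asarray(counts) + 1.0, size=n_draws)
    return float(np.mean([entropy(p) for p in posterior_draws]))

counts = np.array([8, 1, 1])                # a small sample over 3 categories
plug_in = entropy(counts / counts.sum())    # naive frequency plug-in estimate
bayes = bayes_entropy_estimate(counts)
print(f"plug-in: {plug_in:.3f} nats, Bayes: {bayes:.3f} nats")
```

For small samples the plug-in estimate systematically underestimates the entropy, so the posterior-mean estimate typically comes out larger, which is one motivation for the Bayesian treatment.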
This paper presents two Bayesian alternatives to the chi-squared test for determining whether a pair of categorical data sets were generated from the same underlying distribution. It then discusses such alternatives for the Kolmogorov-Smirnov test, which is often used when the data sets consist...
Persistent link: https://www.econbiz.de/10005790642
This paper presents a Bayesian "correction" to the familiar quadratic loss bias-plus-variance formula. It then discusses some other loss-function-specific aspects of supervised learning. It ends by presenting a version of the bias-plus-variance formula appropriate for log loss.
Persistent link: https://www.econbiz.de/10005790667
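The familiar quadratic-loss bias-plus-variance decomposition that this abstract starts from can be checked numerically. In this sketch the estimator, the target, and the noise model are all illustrative assumptions: a shrunk sample mean predicts a fresh noisy observation, and the expected squared error splits into irreducible noise, squared bias, and variance over training sets.

```python
import numpy as np

rng = np.random.default_rng(1)

theta, sigma, n, c = 2.0, 1.0, 10, 0.8   # illustrative values (assumptions)
trials = 200000

# Each trial: draw a training set of size n, form the shrunk estimate
# c * sample mean, then predict a fresh noisy test observation.
train = rng.normal(theta, sigma, size=(trials, n))
estimates = c * train.mean(axis=1)
y_test = rng.normal(theta, sigma, size=trials)

mse = np.mean((y_test - estimates) ** 2)

bias_sq = (c * theta - theta) ** 2   # squared bias of the estimator
variance = np.var(estimates)         # variance over training sets
noise = sigma ** 2                   # irreducible test noise

print(mse, noise + bias_sq + variance)   # the two sums should agree closely
```

The two printed numbers match up to Monte Carlo error, which is the frequentist identity the paper's Bayesian ``correction'' is contrasted with.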
For any real-world generalization problem, there are always many generalizers which could be applied to the problem. This chapter discusses some algorithmic techniques for dealing with this multiplicity of possible generalizers. All of these techniques rely on partitioning the provided learning...
Persistent link: https://www.econbiz.de/10005790777
As defined in MacLennan (1987), a ``field computer'' is a (spatial) continuum-limit neural net. This paper investigates field computers whose dynamics is also continuum-limit, being governed by a purely linear integro-differential equation. Such systems are motivated both as a means of...
Persistent link: https://www.econbiz.de/10005790877
This paper proves that for no prior probability distribution does the bootstrap (BS) distribution equal the predictive distribution, for all Bernoulli trials of some fixed size. It then proves that for no prior will the BS give the same first two moments as the predictive distribution for all...
Persistent link: https://www.econbiz.de/10005790883
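The contrast this abstract draws can be seen already in the first moment for a single future Bernoulli draw. A minimal sketch, assuming a Beta(a, b) prior for the Bayesian side (the uniform Beta(1, 1) default here is an assumption, not the paper's choice): the bootstrap assigns the observed frequency, while the posterior predictive shifts it by the prior pseudo-counts.

```python
from fractions import Fraction

def bootstrap_prob_one(k, n):
    """Bootstrap: resample the observed data, so P(next draw = 1) = k/n."""
    return Fraction(k, n)

def predictive_prob_one(k, n, a=1, b=1):
    """Bayesian posterior predictive under a Beta(a, b) prior
    (Laplace's rule of succession when a = b = 1)."""
    return Fraction(k + a, n + a + b)

k, n = 3, 10   # 3 successes observed in 10 Bernoulli trials
print(bootstrap_prob_one(k, n))    # 3/10
print(predictive_prob_one(k, n))   # 1/3
```

The two probabilities coincide only in degenerate limits of the prior, which is the flavor of the no-prior-works results the paper proves in full generality.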
In supervised learning it is commonly believed that penalizing complex functions helps one avoid ``overfitting'' functions to data, and therefore improves generalization. It is also commonly believed that cross-validation is an effective way to choose amongst algorithms for fitting functions to...
Persistent link: https://www.econbiz.de/10005790885
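The cross-validation procedure this abstract examines can be sketched as follows; the data-generating process (a noisy quadratic) and the candidate polynomial degrees are illustrative assumptions. k-fold cross-validation scores each candidate fitting algorithm on held-out folds and selects the one with the lowest average error.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative data: a noisy quadratic (the true model is an assumption).
x = rng.uniform(-1, 1, size=60)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0, 0.1, size=x.size)

def cv_error(degree, x, y, k=5):
    """Mean held-out squared error of a degree-`degree` polynomial fit."""
    idx = np.arange(x.size)
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)            # all points not in the fold
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[fold])
        errs.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errs))

scores = {d: cv_error(d, x, y) for d in (1, 2, 8)}
best = min(scores, key=scores.get)
print(scores, "chosen degree:", best)
```

With low-noise data like this, the underfitting linear model scores far worse than the quadratic, so cross-validation typically recovers the right complexity class; whether such success can be guaranteed a priori is exactly what the paper questions.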
The conventional Bayesian justification for backprop is that it finds the MAP weight vector. As this paper shows, to find the MAP i-o function instead, one must add a correction term to backprop. That term biases one towards i-o functions with small description lengths, and in particular favors...
Persistent link: https://www.econbiz.de/10005790886
Part II: Bayes Estimators for Mutual Information, Chi-Squared, Covariance, and other Statistics. This paper is the second in a series of two on the problem of estimating a function of a probability distribution from a finite set of samples of that distribution. In the first paper, the Bayes...
Persistent link: https://www.econbiz.de/10005790948