This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction. With the outsourcing of...
Persistent link: https://www.econbiz.de/10013115629
This paper presents a detailed discussion of problem formulation and data representation issues in the design, deployment, and operation of a massive-scale machine learning system for targeted display advertising. Notably, the machine learning system itself is deployed and has been in continual...
Persistent link: https://www.econbiz.de/10013086426
Traditional event models underlying naive Bayes classifiers assume probability distributions that are not appropriate for binary data generated by human behaviour. In this work, we develop a new event model, based on a somewhat forgotten distribution created by Kenneth Ted Wallenius in 1963. We...
Persistent link: https://www.econbiz.de/10013071438
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction. With the outsourcing of...
Persistent link: https://www.econbiz.de/10012766133
Persistent link: https://www.econbiz.de/10012769152
Matrix factorization is a popular technique for engineering features for use in predictive models; it is viewed as a key part of the predictive analytics process and is used in many different domain areas. The purpose of this paper is to investigate matrix-factorization-based dimensionality...
Persistent link: https://www.econbiz.de/10013022033
Many document classification applications require human understanding of the reasons for data-driven classification decisions: by managers, client-facing employees, and the technical team. Predictive models treat documents as data to be classified, and document data are characterized by very...
Persistent link: https://www.econbiz.de/10013080204
The emergence of online paid crowdsourcing platforms, such as Amazon Mechanical Turk (AMT), presents us with huge opportunities to distribute tasks to human workers around the world, on-demand and at scale. In such settings, online workers can come and complete tasks posted by a company, and work for...
Persistent link: https://www.econbiz.de/10014156544
The increasing availability of massive data on users' online behavior presents exciting opportunities for business analytics. In particular, if we could model the distributions of interests of visitors to webpages (or websites), we could apply the result to applications including site...
Persistent link: https://www.econbiz.de/10014160776
A main goal of online display advertising is to drive purchases (etc.) following ad engagement. However, there often are too few purchase conversions for campaign evaluation and optimization, due to low conversion rates, cold start periods, and long purchase cycles (e.g., with brand...
Persistent link: https://www.econbiz.de/10014164324