Traditional event models underlying naive Bayes classifiers assume probability distributions that are not appropriate for binary data generated by human behaviour. In this work, we develop a new event model, based on a somewhat forgotten distribution created by Kenneth Ted Wallenius in 1963. We...
Persistent link: https://www.econbiz.de/10013071438
The emergence of online paid micro-crowdsourcing platforms, such as Amazon Mechanical Turk (AMT), allows on-demand and at scale distribution of tasks to human workers around the world. In such settings, online workers come and complete small tasks posted by an employer, working for as long or as...
Persistent link: https://www.econbiz.de/10012937839
A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data-mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large...
Persistent link: https://www.econbiz.de/10012766078
A knowledge discovery (KD) process involves preprocessing data, choosing a data-mining algorithm, and postprocessing the mining results. There are very many choices for each of these stages, and non-trivial interactions between them. Consequently, both novices and data-mining specialists need assistance in...
Persistent link: https://www.econbiz.de/10012766079
A data mining (DM) process involves multiple stages. A simple, but typical, process might include preprocessing data, applying a data-mining algorithm, and postprocessing the mining results. There are many possible choices for each stage, and only some combinations are valid. Because of the large...
Persistent link: https://www.econbiz.de/10012766080
This paper addresses the classification of linked entities. We introduce a relational vector-space (VS) model (in analogy to the VS model used in information retrieval) that abstracts the linked structure, representing entities by vectors of weights. Given labeled data as background knowledge training...
Persistent link: https://www.econbiz.de/10012766083
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus especially on the improvement of training labels for supervised induction. With the outsourcing of...
Persistent link: https://www.econbiz.de/10012766133
This paper demonstrates that "social network collaborative filtering" (SNCF), wherein user-selected like-minded alters are used to make predictions, can rival traditional user-to-user collaborative filtering (CF) in predictive accuracy. Using a unique data set from an online community...
Persistent link: https://www.econbiz.de/10012768374
Persistent link: https://www.econbiz.de/10012769152
Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or non-existent, which makes problem formulation open-ended by forcing us to consider a large number of independent variables and thereby increasing the dimensionality of the search...
Persistent link: https://www.econbiz.de/10012769780