March 2010

Unsupervised Aggregation for Classification Problems with Large Numbers of Categories

digitado ⋅ March 31, 2010

Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal […]

Bayesian Gaussian Process Latent Variable Model

digitado ⋅ March 31, 2010

We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear […]
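
The lower bound mentioned above follows the standard variational (Jensen) argument. As a sketch only, with notation assumed here rather than taken from the paper ($Y$ for the observed data, $X$ for the latent inputs, $q(X)$ for the variational distribution), the generic form is:

```latex
\log p(Y)
  = \log \int p(Y \mid X)\, p(X)\, \mathrm{d}X
  \;\ge\; \int q(X) \log \frac{p(Y \mid X)\, p(X)}{q(X)}\, \mathrm{d}X
  = \mathbb{E}_{q(X)}\!\left[\log p(Y \mid X)\right]
    - \mathrm{KL}\!\left(q(X)\,\|\,p(X)\right)
```

Maximizing the right-hand side over $q$ and the model hyperparameters tightens the bound. How the expectation is made tractable for a Gaussian process likelihood is the paper's contribution and is not reproduced here.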

A Markov-Chain Monte Carlo Approach to Simultaneous Localization and Mapping

digitado ⋅ March 31, 2010

A Markov-Chain Monte Carlo based algorithm is provided to solve the simultaneous localization and mapping (SLAM) problem with general dynamical and observation models under open-loop control, provided that the map representation is finite-dimensional. To our knowledge this is the first provably consistent yet (close-to) practical solution to this problem. The superiority of our algorithm over alternative SLAM algorithms is demonstrated in a difficult loop-closing situation.

Learning Causal Structure from Overlapping Variable Sets

digitado ⋅ March 31, 2010

We present an algorithm named cSAT+ for learning the causal structure in a domain from datasets measuring different variable sets. The algorithm outputs a graph, named the Pairwise Causal Graph (PCG), whose edges correspond to all possible pairwise causal relations between two variables. Examples of interesting inferences include the induction of the absence or presence of some causal relation between two variables never measured together. cSAT+ converts the problem to a series of SAT problems, obtaining leverage from the […]

State-Space Inference and Learning with Gaussian Processes

digitado ⋅ March 31, 2010

State-space inference and learning with Gaussian processes (GPs) is an unsolved problem. We propose a new, general methodology for inference and learning in nonlinear state-space models that are described probabilistically by non-parametric GP models. We apply the expectation maximization algorithm to iterate between inference in the latent state-space and learning the parameters of the underlying GP dynamics model.

Sequential Monte Carlo Samplers for Dirichlet Process Mixtures

digitado ⋅ March 31, 2010

In this paper, we develop a novel online algorithm based on the Sequential Monte Carlo (SMC) samplers framework for posterior inference in Dirichlet Process Mixtures (DPM). Our method generalizes many sequential importance sampling approaches. It provides a computationally efficient improvement to particle filtering that is less prone to getting stuck in isolated modes. The proposed method is a particular SMC sampler that enables us to design sophisticated clustering update schemes, such as updating past trajectories of the particles in […]

Guarantees for Approximate Incremental SVMs

digitado ⋅ March 31, 2010

Assume a teacher provides examples one by one. An approximate incremental SVM computes a sequence of classifiers that are close to the true SVM solutions computed on the successive incremental training sets. We show that simple algorithms can satisfy an averaged accuracy criterion with a computational cost that scales as well as the best SVM algorithms with the number of examples. Finally, we exhibit some experiments highlighting the benefits of joining fast incremental optimization and curriculum and active […]

An Alternative Prior Process for Nonparametric Bayesian Clustering

digitado ⋅ March 31, 2010

Prior distributions play a crucial role in Bayesian approaches to clustering. Two commonly-used prior distributions are the Dirichlet and Pitman-Yor processes. In this paper, we investigate the predictive probabilities that underlie these processes, and the implicit “rich-get-richer” characteristic of the resulting partitions. We explore an alternative prior for nonparametric Bayesian clustering, the uniform process, for applications where the “rich-get-richer” property is undesirable. We also explore the cost of this new process: partitions are no longer exchangeable with respect […]
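
To make the "rich-get-richer" contrast concrete, the sketch below computes the predictive probabilities for the next item under each prior, given the current cluster sizes. The abstract does not state the formulas; the ones used here are the commonly cited forms, and the function names and parameters (`theta` for concentration, `discount` for the Pitman-Yor discount) are illustrative assumptions.

```python
from typing import List


def dirichlet_process(counts: List[int], theta: float) -> List[float]:
    """Existing cluster k: n_k / (N + theta); new cluster: theta / (N + theta)."""
    n = sum(counts)
    return [c / (n + theta) for c in counts] + [theta / (n + theta)]


def pitman_yor(counts: List[int], theta: float, discount: float) -> List[float]:
    """Existing cluster k: (n_k - d) / (N + theta); new: (theta + K d) / (N + theta)."""
    n, k = sum(counts), len(counts)
    existing = [(c - discount) / (n + theta) for c in counts]
    return existing + [(theta + k * discount) / (n + theta)]


def uniform_process(counts: List[int], theta: float) -> List[float]:
    """Existing cluster k: 1 / (K + theta), independent of its size; new: theta / (K + theta)."""
    k = len(counts)
    return [1.0 / (k + theta) for _ in counts] + [theta / (k + theta)]


# Three clusters of sizes 8, 1, 1; the last entry of each list is "open a new cluster".
counts = [8, 1, 1]
dp = dirichlet_process(counts, theta=1.0)
py = pitman_yor(counts, theta=1.0, discount=0.5)
up = uniform_process(counts, theta=1.0)

for name, probs in [("DP", dp), ("PY", py), ("uniform", up)]:
    print(name, [round(p, 3) for p in probs])
```

Under the Dirichlet and Pitman-Yor processes the large cluster attracts most of the probability mass, whereas the uniform process treats all existing clusters alike regardless of size.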

A Potential-based Framework for Online Multi-class Learning with Partial Feedback

digitado ⋅ March 31, 2010

We study the problem of online multi-class learning with partial feedback: in each trial of online learning, instead of providing the true class label for a given instance, the oracle will only reveal to the learner if the predicted class label is correct. We present a general framework for online multi-class learning with partial feedback that adapts the potential-based gradient descent approaches (Cesa-Bianchi & Lugosi, 2006). The generality of the proposed framework is verified by the fact that […]

Online Passive-Aggressive Algorithms on a Budget

digitado ⋅ March 31, 2010

This paper proposes a kernel-based online learning algorithm with constant space and constant update time. The approach is based on the popular online Passive-Aggressive (PA) algorithm. When used in conjunction with a kernel function, the number of support vectors in PA grows without bound when learning from noisy data streams. This implies unbounded memory and ever-increasing model update and prediction time. To address this issue, the proposed budgeted PA algorithm maintains only a fixed number […]
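
The growth problem, and the effect of a budget, can be illustrated with a minimal kernelized PA learner. This is a sketch under stated assumptions: the update is the standard hard-margin PA step (step size equal to the hinge loss divided by the squared feature norm), while the budget rule, dropping the oldest support vector, is a naive stand-in chosen for brevity and is not the budget-maintenance strategy of the paper. The class name `NaiveBudgetedPA` and the RBF kernel choice are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian RBF kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


class NaiveBudgetedPA:
    """Kernel Passive-Aggressive learner with a naive drop-oldest budget (illustrative only)."""

    def __init__(self, budget: int, kernel: Callable[[Vector, Vector], float] = rbf_kernel):
        self.budget = budget
        self.kernel = kernel
        self.support: List[Tuple[Vector, float]] = []  # (support vector, coefficient)

    def decision(self, x: Vector) -> float:
        return sum(alpha * self.kernel(sv, x) for sv, alpha in self.support)

    def update(self, x: Vector, y: int) -> None:
        loss = max(0.0, 1.0 - y * self.decision(x))  # hinge loss
        k_xx = self.kernel(x, x)
        if loss == 0.0 or k_xx <= 0.0:
            return  # passive: margin already satisfied (or degenerate input)
        tau = loss / k_xx  # aggressive: smallest step that restores a margin of 1
        self.support.append((x, tau * y))
        if len(self.support) > self.budget:
            self.support.pop(0)  # assumption: drop the oldest support vector


# A tiny stream with labels in {-1, +1}.
stream = [((0.0, 0.0), 1), ((1.0, 1.0), -1), ((0.1, 0.0), 1),
          ((0.9, 1.0), -1), ((2.0, 2.0), 1)]
model = NaiveBudgetedPA(budget=2)
for x, y in stream:
    model.update(x, y)
print(len(model.support))
```

Without the final `pop`, every example that violates the margin adds a support vector, which is the unbounded growth the abstract describes. Any removal rule caps memory, but a careless one can discard useful information, which is why the choice of budget-maintenance step matters.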
