On-Line Learning of Undirected Sparse n-grams

From Tetherless World Wiki

Jump to: navigation, search

KSL-02-02 +  redirect page

On-Line Learning of Undirected Sparse n-grams +  Has identifier

On-Line Learning of Undirected Sparse n-grams +  Ksl tr id

On-Line Learning of Undirected Sparse n-grams +  Number

On-Line Learning of Undirected Sparse n-grams

Bibtype  techreport

Has publishing details  2002

Has title  On-Line Learning of Undirected Sparse n-grams

Has where published  KSL-02-02

Has year  2002

Title  On-Line Learning of Undirected Sparse n-grams

Year  2002

Abstract  Intelligent autonomous agents in complex w Intelligent autonomous agents in complex worlds must be able to identify frequent patterns in the massive data streams of their lives. These patterns serve as general foundations for inferring unseen data and as building blocks for higher-level knowledge representation. We address this need with systems based on n-grams, simple learning models considered state-of-the-art in many sequential domains. Standard n-grams suffer from an exponential number of parameters in the model width, n. We introduce an undirected sparse n-gram model, which stores probability estimates only for some(frequent) n-tuples from the unconditional joint distribution and which outperforms narrower exhaustive n-grams with the same number of parameters.Techniques exist for pruning exhaustive n-grams after training but are inappropriate when on-line learning is required (as when data overwhelms storage capacity). We present an unsupervised, on-line learning algorithm for sparse n-grams which induces both model parameters and structure (which patterns to include) despite the challenging a priori uncertainty of which patterns are frequent. The resulting dynamic pattern inclusion complicates probability estimates and inference, but we present several solutions, including Bayesian and EM approaches. We also discuss multiwidth combinations of sparse n-grams, which are capable of hierarchically representing repeated substructure. They learn on-line with no fixed bound on pattern size, but still using less memory than that taken by the training data. Thus, they learn frequent patterns from data streams incrementally and at multiple size scales, without fitting parameters for all possible patterns and without remembering or iterating over training data. membering or iterating over training data.

Author  Karl Pfleger +

Has author  Karl Pfleger +

Has identifier  On-Line Learning of Undirected Sparse n-grams +

Institution  Knowledge Systems, AI Laboratory +

Ksl tr id  On-Line Learning of Undirected Sparse n-grams +

Number  On-Line Learning of Undirected Sparse n-grams +

Process note  NO +

Categories  KSL Technical Report +, Publication +, Technical Report +

 

Enter the name of the page to start browsing from.
Views
Personal tools