Improving sequence segmentation learning by predicting trigrams

van den Bosch, A.
Daelemans, W.
Abstract
Symbolic machine-learning classifiers are known to suffer from near-sightedness when performing sequence segmentation (chunking) tasks in natural language processing: without special architectural additions they are oblivious of the decisions they made earlier when making new ones. We introduce a new pointwise-prediction single-classifier method that predicts trigrams of class labels on the basis of windowed input sequences, and uses a simple voting mechanism to decide on the labels in the final output sequence. We apply the method to maximum-entropy, sparse winnow, and memory-based classifiers using three different sentence-level chunking tasks, and show that the method is able to boost generalization performance in most experiments, attaining error reductions of up to 51%. We compare and combine the method with two known alternative methods to combat near-sightedness, viz. a feedback-loop method and a stacking method, using the memory-based classifier. The combination with a feedback loop suffers from the label bias problem, while the combination with a stacking method produces the best overall results.
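The voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the classifier has already produced, for each token, a trigram of labels (previous, current, next), so that each token receives up to three overlapping votes. The function name `vote_labels`, the example labels, and the tie-breaking rule (fall back to the middle label of the token's own trigram when no label has at least two votes) are illustrative assumptions, not details confirmed by this record.

```python
from collections import Counter


def vote_labels(trigrams):
    """Combine overlapping label trigrams into one label per token.

    trigrams[i] is the (previous, current, next) label triple predicted
    at token position i. Hypothetical helper for illustration only.
    """
    n = len(trigrams)
    labels = []
    for i in range(n):
        # Vote 1: the middle label of the token's own trigram.
        votes = [trigrams[i][1]]
        # Vote 2: the "next" label of the trigram predicted one position left.
        if i > 0:
            votes.append(trigrams[i - 1][2])
        # Vote 3: the "previous" label of the trigram predicted one position right.
        if i < n - 1:
            votes.append(trigrams[i + 1][0])
        label, count = Counter(votes).most_common(1)[0]
        # Assumed tie-break: without a majority, keep the token's own middle label.
        labels.append(label if count >= 2 else trigrams[i][1])
    return labels


# Example: the middle label at position 1 ("O") is outvoted by its neighbours.
predicted = [("_", "B", "I"), ("B", "O", "I"), ("I", "I", "O"), ("I", "O", "_")]
print(vote_labels(predicted))
```

In this example the second token's own prediction is overruled because both neighbouring trigrams agree on a different label, which is the kind of local correction the overlapping predictions make possible.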
Description
Pagination: 8
Date
2005
Publisher
ACL
Citation
van den Bosch, A & Daelemans, W 2005, Improving sequence segmentation learning by predicting trigrams. in I Dagan & D Gildea (eds), Proceedings of the Ninth Conference on Natural Language Learning, CONLL-2005, June 29-30. ACL, Ann Arbor, MI, pp. 80-87.
License
info:eu-repo/semantics/restrictedAccess