Hybrid Data-Driven Models of Machine Translation

This paper presents an extended, harmonised account of our previous work on combining subsentential alignments from phrase-based statistical machine translation (SMT) and example-based MT (EBMT) systems to create novel hybrid data-driven systems capable of outperforming the baseline SMT and EBMT systems from which they were derived.

Main Author: Groves, Declan
Other Authors: Way, Andy (author)
Source: in Machine translation : MT Vol. 19, No. 3/4 (2005), p. 301-323
Format: Online article
Language: English
Published: 2005
Description: Online resource
Subjects: research-article
Hybrid
Example-based MT
Statistical MT
Statistical language models
Convergence
Chunk coverage
Europarl corpus
Online Access: Full text
Note: Copyright: Copyright 2006 Springer Science+Business Media B.V.
LEADER 03240nma a2200373 c 4500
001 JST055881718
003 DE-601
005 20180520051704.0
007 cr uuu---uuuuu
008 150324s2005 000 0 eng d
024 8 |a 20060486 
024 8 |a 10.1007/s10590-006-9015-5 
035 |a 20060486 
040 |b ger  |c GBVCP 
041 0 |a eng 
100 1 |a Groves, Declan 
245 1 0 |a Hybrid Data-Driven Models of Machine Translation  |h Elektronische Ressource 
300 |a Online-Ressource 
500 |a Copyright: Copyright 2006 Springer Science+Business Media B.V. 
520 |a This paper presents an extended, harmonised account of our previous work on combining subsentential alignments from phrase-based statistical machine translation (SMT) and example-based MT (EBMT) systems to create novel hybrid data-driven systems capable of outperforming the baseline SMT and EBMT systems from which they were derived. In previous work, we demonstrated that while an EBMT system is capable of outperforming a phrase-based SMT (PBSMT) system constructed from freely available resources, a hybrid 'example-based' SMT system incorporating marker chunks and SMT subsentential alignments is capable of outperforming both baseline translation models for French-English translation. In this paper, we show that similar gains are to be had from constructing a hybrid 'statistical' EBMT system. Unlike the previous research, here we use the Europarl training and test sets, which are fast becoming the standard data in the field. On these data sets, while all hybrid 'statistical' EBMT variants still fall short of the quality achieved by the baseline PBSMT system, we show that adding the marker chunks to create a hybrid 'example-based' SMT system outperforms the two baseline systems from which it is derived. Furthermore, we provide further evidence in favour of hybrid systems by adding an SMT target-language model to the EBMT system, and demonstrate that this too has a positive effect on translation quality. We also show that many of the subsentential alignments derived from the Europarl corpus are created by either the PBSMT or the EBMT system, but not by both. In sum, therefore, despite the obvious convergence of the two paradigms, the crucial differences between SMT and EBMT contribute positively to the overall translation quality. 
The central thesis of this paper is that any researcher who continues to develop an MT system using either of these approaches will benefit further from integrating the advantages of the other model; dogged adherence to one approach will lead to inferior systems being developed. 
653 |a research-article 
653 |a Hybrid 
653 |a Example-based MT 
653 |a Statistical MT 
653 |a Statistical language models 
653 |a Convergence 
653 |a Chunk coverage 
653 |a Europarl corpus 
700 1 |a Way, Andy  |e verfasserin  |4 aut 
773 0 8 |i in  |t Machine translation : MT  |d Dordrecht [u.a.] : Springer Science + Business Media B.V  |g Vol. 19, No. 3/4 (2005), p. 301-323  |q 19:3/4<301-323  |w (DE-601)JST055879357  |x 1573-0573 
856 4 1 |u https://www.jstor.org/stable/20060486  |3 Volltext 
912 |a GBV_JSTOR 
951 |a AR 
952 |d 19  |j 2005  |e 3/4  |h 301-323 
