HMM-Based Word Alignment in Statistical Translation
Stephan Vogel    Hermann Ney    Christoph Tillmann
Lehrstuhl für Informatik VI, RWTH Aachen
D-52056 Aachen, Germany
{vogel,ney,tillmann}@informatik.rwth-aachen.de
Abstract
In this paper, we describe a new model for word alignment in statistical translation and present experimental results. The idea of the model is to make the alignment probabilities dependent on the differences in the alignment positions rather than on the absolute positions. To achieve this goal, the approach uses a first-order Hidden Markov model (HMM) for the word alignment problem, in the same way that HMMs are used successfully in speech recognition for the time alignment problem. The difference to the time alignment HMM is that there is no monotonicity constraint on the possible word orderings. We describe the details of the model and test it on several bilingual corpora.
1 Introduction
In this paper, we address the problem of word alignments for a bilingual corpus. In recent years, there have been a number of papers considering this or similar problems: (Brown et al., 1990), (Dagan et al., 1993), (Kay et al., 1993), (Fung et al., 1993).
In our approach, we use a first-order Hidden Markov model (HMM) (Jelinek, 1976), which is similar, but not identical, to those used in speech recognition. The key component of this approach is to make the alignment probabilities dependent not on the absolute position of the word alignment, but on its relative position; i.e. we consider the differences in the index of the word positions rather than the index itself.
The organization of the paper is as follows. After reviewing the statistical approach to machine translation, we first describe the conventional model (mixture model). We then present our first-order HMM approach in full detail. Finally, we present some experimental results and compare our model with the conventional model.
2 Review: Translation Model
The goal is the translation of a text given in some language F into a target language E. For convenience, we choose French and English as the language pair for the following exposition, i.e. we are given a French string $f_1^J = f_1 \ldots f_j \ldots f_J$, which is to be translated into an English string $e_1^I = e_1 \ldots e_i \ldots e_I$. Among all possible English strings, we will choose the one with the highest probability, which is given by Bayes' decision rule:

$$\hat{e}_1^I = \arg\max_{e_1^I} \{\Pr(e_1^I | f_1^J)\} = \arg\max_{e_1^I} \{\Pr(e_1^I) \cdot \Pr(f_1^J | e_1^I)\}$$
$\Pr(e_1^I)$ is the language model of the target language, whereas $\Pr(f_1^J | e_1^I)$ is the string translation model. The argmax operation denotes the search problem. In this paper, we address the problem of introducing structures into the probabilistic dependencies in order to model the string translation probability $\Pr(f_1^J | e_1^I)$.
3 Alignment Models
A key issue in modeling the string translation probability $\Pr(f_1^J | e_1^I)$ is the question of how we define the correspondence between the words of the English sentence and the words of the French sentence. In typical cases, we can assume a sort of pairwise dependence by considering all word pairs $(f_j, e_i)$ for a given sentence pair $[f_1^J; e_1^I]$. We further constrain this model by assigning each French word to exactly one English word. Models describing these types of dependencies are referred to as alignment models.
In this section, we describe two models for word alignment in detail:
• a mixture-based alignment model, which was introduced in (Brown et al., 1990);
• an HMM-based alignment model.
In this paper, we address the question of how to define specific models for the alignment probabilities. The notational convention will be as follows. We use the symbol $\Pr(\cdot)$ to denote general probability distributions with (nearly) no specific assumptions. In contrast, for model-based probability distributions, we use the generic symbol $p(\cdot)$.
3.1 Alignment with Mixture Distribution
Here, we describe the mixture-based alignment model in a formulation which is different from the original formulation in (Brown et al., 1990). We will use this model as reference for the HMM-based alignments to be presented later.
The model is based on a decomposition of the joint probability for $f_1^J$ into a product over the probabilities for each word $f_j$:

$$\Pr(f_1^J | e_1^I) = p(J|I) \cdot \prod_{j=1}^{J} p(f_j | e_1^I)$$
where, for normalization reasons, the sentence length probability $p(J|I)$ has been included. The next step now is to assume a sort of pairwise interaction between the French word $f_j$ and each English word $e_i$, $i = 1, \ldots, I$. These dependencies are captured in the form of a mixture distribution:

$$p(f_j | e_1^I) = \sum_{i=1}^{I} p(i, f_j | e_1^I) = \sum_{i=1}^{I} p(i|j, I) \cdot p(f_j | e_i)$$
Putting everything together, we have the following mixture-based model:

$$\Pr(f_1^J | e_1^I) = p(J|I) \cdot \prod_{j=1}^{J} \sum_{i=1}^{I} \left[ p(i|j, I) \cdot p(f_j | e_i) \right] \quad (1)$$

with the following ingredients:
• sentence length probability: $p(J|I)$;
• mixture alignment probability: $p(i|j, I)$;
• translation probability: $p(f|e)$.
Assuming a uniform alignment probability

$$p(i|j, I) = \frac{1}{I},$$

we arrive at the first model proposed by (Brown et al., 1990). This model will be referred to as the IBM1 model.
To train the translation probabilities $p(f|e)$, we use a bilingual corpus consisting of sentence pairs $[f_s; e_s]$, $s = 1, \ldots, S$. Using the maximum likelihood criterion, we obtain the following iterative equation (Brown et al., 1990):

$$p(f|e) = \frac{A(f, e)}{\sum_{f'} A(f', e)}, \quad \text{with} \quad A(f, e) = \sum_s \sum_j \sum_i \delta(f, f_{sj}) \, \delta(e, e_{si}) \, p_s(i|j),$$

where $p_s(i|j)$ denotes the posterior alignment probability in sentence pair $s$. For uniform alignment probabilities, it can be shown (Brown et al., 1990) that there is only one optimum, and therefore the EM algorithm (Baum, 1972) always finds the global optimum.
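To make this estimation step concrete, the following sketch implements one such iterative update for the IBM1 case (uniform alignment probabilities), accumulating the counts $A(f, e)$ and renormalizing them into relative frequencies. The corpus format and all names are our own illustration, not part of the paper.

```python
from collections import defaultdict

def ibm1_em_iteration(corpus, t):
    """One re-estimation step for the IBM1 translation probabilities.

    corpus: list of sentence pairs (french_words, english_words);
    t[(f, e)]: current estimate of p(f|e).
    Returns the updated p(f|e) as relative frequencies of the
    accumulated fractional counts A(f, e).
    """
    counts = defaultdict(float)   # A(f, e)
    totals = defaultdict(float)   # sum over f' of A(f', e)
    for f_sent, e_sent in corpus:
        for f in f_sent:
            # Posterior weight of each English word under uniform p(i|j, I).
            norm = sum(t[(f, e)] for e in e_sent)
            for e in e_sent:
                c = t[(f, e)] / norm
                counts[(f, e)] += c
                totals[e] += c
    return {(f, e): c / totals[e] for (f, e), c in counts.items()}

# Toy usage: initialize uniformly over co-occurring pairs, then iterate.
corpus = [(["la", "maison"], ["the", "house"]),
          (["la", "fleur"], ["the", "flower"])]
pairs = {(f, e) for fs, es in corpus for f in fs for e in es}
t = {fe: 1.0 / len(pairs) for fe in pairs}
for _ in range(5):
    t = ibm1_em_iteration(corpus, t)
```

Since the IBM1 likelihood has a single optimum, repeated application of this update converges to the global maximum from any positive initialization.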
For the mixture alignment model with nonuniform alignment probabilities (subsequently referred to as the IBM2 model), there are too many alignment parameters $p(i|j, I)$ to be estimated for small corpora. Therefore, a specific model for the alignment probabilities is used:

$$p(i|j, I) = \frac{r\left(i - j\frac{I}{J}\right)}{\sum_{i'=1}^{I} r\left(i' - j\frac{I}{J}\right)} \quad (2)$$

This model assumes that the position distance relative to the diagonal line of the $(j, i)$ plane is the dominating factor (see Fig. 1). To train this model, we use the maximum likelihood criterion in the so-called maximum approximation, i.e. the likelihood criterion covers only the most likely alignment rather than the set of all alignments:

$$\Pr(f_1^J | e_1^I) \cong \prod_{j=1}^{J} \max_{i} \left[ p(i|j, I) \cdot p(f_j | e_i) \right] \quad (3)$$
In training, this criterion amounts to a sequence of iterations, each of which consists of two steps:
• position alignment: Given the model parameters, determine the most likely position alignment.
• parameter estimation: Given the position alignment, i.e. going along the alignment paths for all sentence pairs, perform maximum likelihood estimation of the model parameters; for model-free distributions, these estimates result in relative frequencies.
Due to the nature of the mixture model, there is no interaction between adjacent word positions. Therefore, the optimal position i for each position j can be determined independently of the neighbouring positions, and the resulting training procedure is straightforward; a sketch is given below.
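Because of this independence, the position alignment step of the maximum approximation reduces to a per-position maximization. The sketch below illustrates both ingredients, with Eq. (2) computed from a distance profile r; the default profile and all names are hypothetical stand-ins for the trained tables.

```python
def diagonal_align_prob(i, j, I, J, r=lambda d: 1.0 / (1.0 + abs(d))):
    """p(i|j, I) as in Eq. (2): a normalized profile r of the distance
    of i to the diagonal i = j * I / J. The default r is a placeholder;
    in the paper r is estimated from relative frequencies."""
    return r(i - j * I / J) / sum(r(ip - j * I / J) for ip in range(1, I + 1))

def align_positions(f_sent, e_sent, t):
    """Most likely position alignment under the zeroth-order model, Eq. (3).

    Each French position j is treated independently, since the mixture
    model has no interaction between adjacent word positions."""
    I, J = len(e_sent), len(f_sent)
    return [max(range(1, I + 1),
                key=lambda i: diagonal_align_prob(i, j, I, J)
                * t[(f_sent[j - 1], e_sent[i - 1])])
            for j in range(1, J + 1)]
```

The parameter-estimation step then simply collects relative frequencies along the alignment paths returned by align_positions.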
3.2 Alignment with HMM
We now propose an HMM-based alignment model. The motivation is that typically we have a strong localization effect in aligning the words in parallel texts (for language pairs from Indo-European languages): the words are not distributed arbitrarily over the sentence positions, but tend to form clusters. Fig. 1 illustrates this effect for the language pair German-English.
Each word of the German sentence is assigned to a word of the English sentence. The alignments have a strong tendency to preserve the local neighborhood when going from the one language to the other language. In many cases, although not always, there is an even stronger restriction: the difference in the position index is smaller than 3.
[Figure 1 shows the alignment matrix for the English words WELL, I, THINK, IF, WE, CAN, MAKE, IT, AT, EIGHT, ON, BOTH, DAYS against the positions of the corresponding German sentence.]
Figure 1: Word alignment for a German-English sentence pair.
To describe these word-by-word alignments, we introduce the mapping $j \to a_j$, which assigns a word $f_j$ in position $j$ to a word $e_i$ in position $i = a_j$. The concept of these alignments is similar to the ones introduced by (Brown et al., 1990), but we will use another type of dependence in the probability distributions. Looking at such alignments produced by a human expert, it is evident that the mathematical model should try to capture the strong dependence of $a_j$ on the previous alignment. Therefore the probability of alignment $a_j$ for position $j$ should have a dependence on the previous alignment $a_{j-1}$:

$$p(a_j | a_{j-1}, I),$$

where we have included the conditioning on the total length $I$ of the English sentence for normalization reasons. A similar approach has been chosen by (Dagan et al., 1993). Thus the problem formulation is similar to that of the time alignment problem in speech recognition, where the so-called Hidden Markov models have been used successfully for a long time (Jelinek, 1976). Using the same basic principles, we can rewrite the probability by introducing the 'hidden' alignments $a_1^J := a_1 \ldots a_j \ldots a_J$ for a sentence pair $[f_1^J; e_1^I]$:

$$\Pr(f_1^J | e_1^I) = \sum_{a_1^J} \Pr(f_1^J, a_1^J | e_1^I) = \sum_{a_1^J} \prod_{j=1}^{J} \Pr(f_j, a_j | f_1^{j-1}, a_1^{j-1}, e_1^I)$$
So far there has been no basic restriction of the approach. We now assume a first-order dependence on the alignments $a_j$ only:

$$\Pr(f_j, a_j | f_1^{j-1}, a_1^{j-1}, e_1^I) = p(f_j, a_j | a_{j-1}, e_1^I) = p(a_j | a_{j-1}, I) \cdot p(f_j | e_{a_j}),$$

where, in addition, we have assumed that the translation probability depends only on $a_j$ and not on $a_{j-1}$.
Putting everything together, we have the following HMM-based model:

$$\Pr(f_1^J | e_1^I) = \sum_{a_1^J} \prod_{j=1}^{J} \left[ p(a_j | a_{j-1}, I) \cdot p(f_j | e_{a_j}) \right] \quad (4)$$

with the following ingredients:
• HMM alignment probability: $p(i|i', I)$ or $p(a_j | a_{j-1}, I)$;
• translation probability: $p(f|e)$.
In addition, we assume that the HMM alignment probabilities $p(i|i', I)$ depend only on the jump width $(i - i')$. Using a set of non-negative parameters $\{s(i - i')\}$, we can write the HMM alignment probabilities in the form:

$$p(i|i', I) = \frac{s(i - i')}{\sum_{l=1}^{I} s(l - i')} \quad (5)$$

This form ensures that for each word position $i'$, $i' = 1, \ldots, I$, the HMM alignment probabilities satisfy the normalization constraint.
Note the similarity between Equations (2) and (5). The mixture model can be interpreted as a zeroth-order model in contrast to the first-order HMM model.
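Computed directly, Equation (5) amounts to row-normalizing the jump-width scores, with a single parameter per jump width regardless of the position pair. A minimal sketch (the function and variable names are ours):

```python
def hmm_transitions(s, I):
    """Transition table p(i|i', I) from non-negative jump-width scores.

    s: dict mapping a jump width d = i - i' to its score s(d);
    missing widths count as zero. Each row i' is normalized over
    i = 1, ..., I as in Eq. (5)."""
    p = {}
    for i_prev in range(1, I + 1):
        denom = sum(s.get(l - i_prev, 0.0) for l in range(1, I + 1))
        p[i_prev] = {i: s.get(i - i_prev, 0.0) / denom
                     for i in range(1, I + 1)}
    return p
```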
As with the IBM2 model, we again use the maximum approximation:

$$\Pr(f_1^J | e_1^I) \cong \max_{a_1^J} \prod_{j=1}^{J} \left[ p(a_j | a_{j-1}, I) \cdot p(f_j | e_{a_j}) \right] \quad (6)$$

In this case, the task of finding the optimal alignment is more involved than in the case of the mixture model (IBM2). Therefore, we have to resort to dynamic programming, for which we have the following typical recursion formula:

$$Q(i, j) = p(f_j | e_i) \cdot \max_{i' = 1, \ldots, I} \left[ p(i | i', I) \cdot Q(i', j - 1) \right]$$

Here, $Q(i, j)$ is a sort of partial probability, as in time alignment for speech recognition (Jelinek, 1976).
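The recursion translates directly into dynamic programming over an I x J trellis. The following sketch works in log space to avoid numerical underflow (an implementation detail the paper does not discuss) and assumes strictly positive probabilities; all names are ours.

```python
import math

def viterbi_alignment(f_sent, e_sent, t, trans):
    """Most likely alignment a_1 ... a_J under the HMM model, Eq. (6).

    Implements Q(i, j) = p(f_j|e_i) * max_i' [ p(i|i', I) * Q(i', j-1) ]
    with backpointers. trans[i_prev][i] holds p(i|i', I) and t[(f, e)]
    holds p(f|e), both 0-based here and assumed strictly positive."""
    I, J = len(e_sent), len(f_sent)
    # Initialization: a uniform distribution over the first alignment
    # position (an assumption; the paper leaves this unspecified).
    Q = [math.log(t[(f_sent[0], e_sent[i])]) - math.log(I) for i in range(I)]
    backpointers = []
    for j in range(1, J):
        new_Q, bp = [], []
        for i in range(I):
            best = max(range(I), key=lambda ip: Q[ip] + math.log(trans[ip][i]))
            new_Q.append(math.log(t[(f_sent[j], e_sent[i])])
                         + Q[best] + math.log(trans[best][i]))
            bp.append(best)
        Q = new_Q
        backpointers.append(bp)
    # Trace back the best path from the final column.
    path = [max(range(I), key=lambda i: Q[i])]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    return [i + 1 for i in reversed(path)]  # 1-based positions a_1 ... a_J
```

The runtime is O(J * I^2), in contrast to the O(J * I) per-position maximization that suffices for the zeroth-order mixture model.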
4 Experimental Results
4.1 The Task and the Corpus
The models were tested on several tasks:
• the Avalanche Bulletins published by the Swiss Federal Institute for Snow and Avalanche Research in Davos, Switzerland, and made available by the European Corpus Initiative (ECI/MCI, 1994);
• the Verbmobil Corpus, consisting of spontaneously spoken dialogs in the domain of appointment scheduling (Wahlster, 1993);
• the EuTrans Corpus, which contains typical phrases from the tourist and travel domain (EuTrans, 1996).
Table 1 gives the details on the size of the corpora and their vocabulary. It should be noted that in all three cases the ratio of vocabulary size to number of running words is not very favorable.
Table 1: Corpora.

Corpus      Language   Words    Voc. Size
Avalanche   French      62849       1993
            German      44805       2265
EuTrans     Spanish     14770       2008
            English     15888       1630
Verbmobil   German     150279       4017
            English     25427       2443
For several years between 83 and 92, the Avalanche Bulletins are available for both German and French. The following is a typical sentence pair from the corpus:

Bei zuerst recht hohen, später tieferen Temperaturen sind von Samstag bis Dienstag morgen auf dem Alpennordhang und am Alpenhauptkamm oberhalb 2000 m 60 bis 80 cm Neuschnee gefallen.

Par des températures d'abord élevées, puis plus basses, 60 à 80 cm de neige sont tombés de samedi à mardi matin sur le versant nord et la crête des Alpes au-dessus de 2000 m.
An example from the Verbmobil corpus is given in Figure 1.
4.2 Training and Results
Each of the three corpora was used to train both alignment models, the mixture-based alignment model in Eq. (1) and the HMM-based model in Eq. (4). Tables 2 and 3 show the training results for the Avalanche corpus. In addition to the total perplexity, which is the global optimization criterion, the tables also show the perplexities of the translation probabilities and of the alignment probabilities. The last line in Table 2 gives the perplexity measures when applying the maximum approximation and computing the perplexity in this approximation. These values are equal to the ones after initializing the IBM2 and HMM models, as they should be.
From Table 3, we can see that the mixture alignment gives slightly better perplexity values for the translation probabilities, whereas the HMM model produces a smaller perplexity for the alignment probabilities. In the calculation of the perplexities, the sentence length probability was not included.
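For concreteness, the perplexities in Tables 2, 3 and 5 can be read as the exponent of the negative average log-probability per source word. A minimal sketch, assuming per-sentence log-probabilities have already been computed and, as stated above, excluding the sentence length term:

```python
import math

def perplexity(sentence_log_probs, num_source_words):
    """Corpus perplexity: inverse geometric mean of the per-word probability.

    sentence_log_probs: log Pr(f_1^J | e_1^I) for each sentence pair,
    without the sentence length probability p(J|I)."""
    return math.exp(-sum(sentence_log_probs) / num_source_words)
```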
Table 2: IBM1: Translation, alignment and total perplexity as a function of the iteration.

Iteration   Translation   Alignment   Total
0           99.36         20.07       1994.00
1           3.72          20.07       74.57
2           2.67          20.07       53.62
9           1.87          20.07       37.55
10          1.86          20.07       37.36
Max.        3.88          20.07       77.95
Table 3: Translation, alignment and total perplexity as a function of the iteration for the IBM2 (A) and the HMM model (B).

Model   Iter.   Translation   Alignment   Total
A       0       3.88          20.07       77.95
A       1       3.17          10.82       34.27
A       2       3.25          10.15       33.03
A       3       3.22          10.10       32.48
A       4       3.20          10.06       32.18
A       5       3.18          10.05       32.00
B       0       3.88          20.07       77.95
B       1       3.37          7.99        26.98
B       2       3.46          6.17        21.36
B       3       3.47          5.90        20.48
B       4       3.46          5.85        20.24
B       5       3.45          5.84        20.18
Another interesting question is whether the HMM alignment model helps in finding good and sharply focussed word-to-word correspondences. As an example, Table 4 gives a comparison of the translation probabilities p(f|e) between the mixture and the HMM alignment model for the German word Alpensüdhang. The counts of the words are given in brackets. There is virtually no difference between the translation tables for the two models (IBM2 and HMM). But, in general, the HMM model seems to give slightly better results in the cases of German compound words like Alpensüdhang (versant sud des Alpes), which require function words in the translation.
Table 4: Translation probabilities for Alpensüdhang (word counts in brackets).

IBM1   Alpes (684)      0.171
       des (1968)       0.035
       le (1419)        0.039
       sud (416)        0.427
       sur (769)        0.040
       versant (431)    0.284
IBM2   Alpes (684)      0.276
       sud (416)        0.371
       versant (431)    0.356
HMM    Alpes (684)      0.284
       des (1968)       0.028
       sud (416)        0.354
       versant (431)    0.333
This is a result of the smoother position alignments produced by the HMM model. A pronounced example is given in Figure 2. The problem of the absolute position alignment can be demonstrated at the positions (a) and (c): both Schneebrettgefahr and Schneeverfrachtungen have a high probability on neige. The IBM2 model chooses the position near the diagonal, as this is the one with the higher probability. Again, Schneebrettgefahr generates de, which explains the wrong alignment near the diagonal in (c).
However, this strength of the HMM model can also be a weakness, as in the case of est développé / ist ... entstanden (see (b) in Figure 2). The required two large jumps are correctly found by the mixture model, but not by the HMM model. These cases suggest an extension to the HMM model. In general, there are only a small number of big jumps in the position alignments in a given sentence pair. Therefore a model could be useful that distinguishes between local and big jumps.
The models have also been tested on the Verbmobil Translation Corpus as well as on a small corpus used in the EuTrans project. The sentences in the EuTrans corpus are in general short phrases with simple grammatical structures. However, the training corpus is very small, and the produced alignments are generally of poor quality. There is no marked difference between the two alignment models.
Table 5: Perplexity results for (a) EuTrans and (b) Verbmobil Corpus.

      Model   Iter.   Transl.   Align.    Total
(a)   IBM1    10      2.610     6.233     16.267
      IBM2    5       2.443     4.003     9.781
      HMM     5       2.461     3.934     9.686
(b)   IBM1    10      4.373     10.674    46.672
      IBM2    5       4.696     6.538     30.706
      HMM     5       4.859     5.452     26.495
The Verbmobil Corpus consists of spontaneously spoken dialogs in the domain of appointment scheduling. The assumption that every word in the source language is aligned to a word in the target language breaks down for many sentence pairs, resulting in poor alignments. This in turn affects the quality of the translation probabilities.
Several extensions to the current HMM-based model could be used to tackle these problems:
• The results presented here did not use the concept of the empty word. For the HMM-based model this, however, requires a second-order rather than a first-order model.
• We could allow for multi-word phrases in both languages.
• In addition to the absolute or relative alignment positions, the alignment probabilities can be assumed to depend on part-of-speech tags or on the words themselves (confer model 4 in (Brown et al., 1990)).
5 Conclusion
In this paper, we have presented an HMM-based approach for modelling word alignments in parallel texts. The characteristic feature of this approach is to make the alignment probabilities explicitly dependent on the alignment position of the previous word. We have tested the model successfully on real data. The HMM-based approach produces translation probabilities comparable to the mixture alignment model. When looking at the position alignments, those generated by the HMM model are in general much smoother. This could be especially helpful for languages such as German, where compound words are matched to several words in the source language. On the other hand, large jumps due to different word orderings in the two languages are successfully modeled. We are presently studying and testing a multilevel HMM model that allows only a small number of large jumps. The ultimate test of the different alignment and translation models can only be carried out in the framework of a fully operational translation system.
6 Acknowledgement
This research was partly supported by the German Federal Ministry of Education, Science, Research and Technology under the Contract Number 01 IV 601 A (Verbmobil) and under the Esprit Research Project 20268 (EuTrans).
References
L. E. Baum. 1972. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process. Inequalities, 3:1-8.
[Figure 2 shows two alignment matrices, labeled Mixture and HMM, for the German sentence IM WALLIS UND IM GOTTHARDGEBIET IST DURCH SCHNEEVERFRACHTUNGEN OBERHALB 2000 M EINE ERHEBLICHE LOKALE SCHNEEBRETTGEFAHR ENTSTANDEN; the regions (a), (b) and (c) discussed in the text mark where the two models differ.]
Figure 2: Alignments generated by the IBM2 and the HMM model.
Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2):263-311.

Ido Dagan, Ken Church, and William A. Gale. 1993. Robust Bilingual Word Alignment for Machine Aided Translation. Proceedings of the Workshop on Very Large Corpora, Columbus, Ohio, 1-8.

ECI/MCI: The European Corpus Initiative Multilingual Corpus 1. 1994. Association for Computational Linguistics.

EuTrans. 1996 (forthcoming). The Definition of a MT Task. Technical Report, EuTrans Project. Depto. de Sistemas Informaticos y Computacion (DSIC), Universidad Politecnica de Valencia.

Pascale Fung, and Kenneth Ward Church. 1994. K-vec: A new approach for aligning parallel texts. Proceedings of COLING 94, 1096-1102, Kyoto, Japan.

Frederick Jelinek. 1976. Speech Recognition by Statistical Methods. Proceedings of the IEEE, Vol. 64, 532-556, April 1976.

Martin Kay, and Martin Röscheisen. 1993. Text-Translation Alignment. Computational Linguistics, 19(1):121-142.

Wolfgang Wahlster. 1993. Verbmobil: Translation of Face-to-Face Dialogs. Proceedings of the MT Summit IV, 127-135, Kobe, Japan.