p(e|f) ∝ exp(∑_{m=1}^{M} λ_m h_m(f,e))

where h_m are the M feature functions (models) combined by the system and λ_m the corresponding weights.

The tuning stage of the system is used to set the weights λ_m. Next, we detail the component models implemented in our system.
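As a toy illustration, the log-linear combination reduces to a weighted sum of log-domain feature scores; all feature values and weights below are hypothetical:

```python
def loglinear_score(h, lam):
    # Weighted sum of feature values h_m with tuned weights lambda_m;
    # the decoder searches for the hypothesis maximizing this score.
    assert len(h) == len(lam)
    return sum(l * v for l, v in zip(lam, h))

# Hypothetical log-domain feature values for one hypothesis
# (e.g. bilingual LM, tuple bonus, target LM):
score = loglinear_score([-12.3, 3.0, -7.1], [1.0, 0.3, 0.6])
```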

p(f,e) = Π_{k=1}^{K} p((f,e)_k | (f,e)_{k-n+1}, …, (f,e)_{k-1})

where e refers to the target sentence, f to the input sentence, and (f,e)_k to the k-th tuple of the sentence pair.

Multiple bilingual n-gram language models can be used at decoding time. Bilingual LMs can be computed over any factored form of the tuples available to the decoder. Typically, tuples are built from raw words, but many other factored forms can be used (POS tags, lemmas, stems, etc.).
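A minimal sketch of the chain-rule decomposition over tuples, with a toy log-probability table standing in for a trained bilingual LM (all names and values are illustrative; a real LM would back off rather than apply a flat unseen penalty):

```python
def bilingual_lm_logprob(tuples, logprob, n=3, unseen=-10.0):
    # Chain rule over tuple units: each tuple is conditioned on up to
    # n-1 preceding tuples; unseen n-grams get a crude fixed penalty here.
    total = 0.0
    for k, t in enumerate(tuples):
        history = tuple(tuples[max(0, k - n + 1):k])
        total += logprob.get((history, t), unseen)
    return total

tuples = [("les", "NULL"), ("opérations", "operations")]
table = {
    ((), ("les", "NULL")): -1.0,
    ((("les", "NULL"),), ("opérations", "operations")): -0.5,
}
score = bilingual_lm_logprob(tuples, table)
```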

The tuple bonus model is used to compensate for the system's preference for sentences composed of fewer tuples (tuples with larger source sides). It is implemented following the equation:

p(f,e) = exp(K)

where K is the number of tuples of the translation hypothesis. The target n-gram language model is computed as:

p(f,e) ≈ Π_{k=1}^{|e|} p(e_k | e_{k-n+1}, …, e_{k-1})

where e refers to the target sentence, f to the input sentence, and e_k to the k-th target word.

Multiple target n-gram language models can be used at decoding time. Each model can be computed over any factored form of the tuple target sides available to the decoder.

p(f,e) = exp(|e|)

where |e| is the number of target words; this word bonus similarly compensates for the preference for shorter translation hypotheses. The source n-gram language model is computed as:

p(f,e) ≈ Π_{k=1}^{|f|} p(f_k | f_{k-n+1}, …, f_{k-1})

where f refers to the (reordered) input sentence and f_k to its k-th word.

Multiple source n-gram language models can be used at decoding time. Each model can be computed over any factored form of the tuple source sides available to the decoder.

The translation model is implemented as a product of tuple probabilities:

p(f,e) = Π_{k=1}^{K} p((f,e)_k)

Our current system implementation employs four different estimations for p((f,e)_k):

- count(f,e) / Σ_{e'} count(f,e')
- count(f,e) / Σ_{f'} count(f',e)
- 1/(I+1)^{J} · Π_{j=1}^{J} ∑_{i=0}^{I} p_{lex}(e_i, f_j)
- 1/(J+1)^{I} · Π_{i=1}^{I} ∑_{j=0}^{J} p_{lex}(f_j, e_i)

where I and J denote the lengths of the target and source sides of the tuple, and e_0 and f_0 stand for the NULL word.
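The four estimations above can be sketched as follows; `count` and `p_lex` are toy tables (real systems read them from the extracted tuples and from IBM-1 style lexical models), and the two lexical weights correspond to calling `lex_weight` with the source and target roles swapped, using the inverse lexical table:

```python
from math import prod

def rel_freqs(count, f, e):
    # count(f,e) / sum_e' count(f,e')  and  count(f,e) / sum_f' count(f',e)
    p_given_f = count[(f, e)] / sum(c for (f2, _), c in count.items() if f2 == f)
    p_given_e = count[(f, e)] / sum(c for (_, e2), c in count.items() if e2 == e)
    return p_given_f, p_given_e

def lex_weight(over, given, p_lex):
    # 1/(|over|+1)^|given| * prod_j sum_i p_lex(over_i, given_j),
    # where index 0 of `over` is the NULL word.
    padded = ["NULL"] + over
    return (1.0 / len(padded) ** len(given)) * prod(
        sum(p_lex.get((i, j), 0.0) for i in padded) for j in given)

# Toy tables:
count = {("maison", "house"): 3, ("maison", "home"): 1}
p_lex = {("house", "maison"): 0.7, ("NULL", "maison"): 0.1}
direct, inverse = rel_freqs(count, "maison", "house")
ibm1 = lex_weight(["house"], ["maison"], p_lex)
```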

A lexicalized reordering model estimates the probability of the orientation of each tuple with respect to the previously translated one. Four orientation types are considered:

- (m) monotone order,
- (s) switch with the previous tuple,
- (f) forward jump,
- (b) backward jump.

Alternatively, two coarser orientation types can be used:

- (d) discontinuous (overlapping with (b) and (f)),
- (c) continuous (overlapping with (m) and (s)).

In order to learn this reordering model, we count how often each extracted tuple is found with each of the four orientation types. The probability distribution is estimated by relative frequency:

p(o | (f,e)) = count(o, (f,e)) / Σ_{o'} count(o', (f,e))

Given the sparse statistics of the orientation types, we may want to smooth the probability distribution with some factor σ:

p(o | (f,e)) = (σ p(o) + count(o, (f,e))) / (σ + Σ_{o'} count(o', (f,e)))

where p(o) is the unconditioned orientation distribution and σ = 1 / Σ_{o,(f,e)} count(o, (f,e)).
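A sketch of the smoothed estimate, with `sigma` left as an explicit parameter since its exact setting is system-dependent:

```python
def orientation_probs(counts, p_global, sigma):
    # Smoothed p(o | (f,e)): interpolate the tuple-specific counts with
    # the unconditioned orientation distribution p_global, weighted by sigma.
    total = sum(counts.values())
    return {o: (sigma * p_global[o] + counts.get(o, 0)) / (sigma + total)
            for o in p_global}

# Toy counts for one tuple over the four orientation types:
probs = orientation_probs({"m": 3, "f": 1},
                          {"m": 0.6, "s": 0.1, "f": 0.2, "b": 0.1},
                          sigma=1.0)
```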

The next example shows the reordering orientations (last column) computed over a sequence of tuples:

les ||| NULL ||| 0 ||| (m)
opérations ||| operations ||| 1 ||| (m)
contre ||| there against ||| 2 ||| (m)
les ||| NULL ||| 3 ||| (m)
des Talibans ||| Taliban ||| 5 6 ||| (f)
et ||| and ||| 7 ||| (m)
d' Al-Qaeda ||| al-Qaeda ||| 8 9 ||| (m)
forces ||| forces ||| 4 ||| (b)
ont obtenu ||| brought ||| 10 11 ||| (f)
des ||| NULL ||| 12 ||| (m)
mitigés ||| mixed ||| 14 ||| (f)
résultats ||| result ||| 13 ||| (s)
. ||| . ||| 15 ||| (f)

A particular case concerns tuples covering non-consecutive input words (see the third tuple of the next example). In such cases we arbitrarily set the orientation of the unit to switch (s):

qui ||| that ||| 40 ||| (m)
vraiment ||| truly ||| 42 ||| (f)
ont été ||| were ||| 41 43 ||| (s)
terribles ||| dreadful ||| 44 ||| (m)
. ||| . ||| 45 ||| (m)

The reordering model may also be conditioned on the next tuple (not only the previous one); the same estimation method is used.
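The orientation assignment illustrated above can be reproduced with a few position comparisons (a sketch; the virtual start position -1 makes the first tuple of a sentence monotone when it starts at position 0):

```python
def orientation(prev, cur):
    # `prev` and `cur` are the input-word positions covered by two
    # consecutive tuples. Tuples covering non-consecutive input words
    # are arbitrarily assigned switch (s), as in the examples above.
    if max(cur) - min(cur) + 1 != len(cur):
        return "s"
    if min(cur) == max(prev) + 1:
        return "m"
    if max(cur) == min(prev) - 1:
        return "s"
    return "f" if min(cur) > max(prev) + 1 else "b"

def orientations(position_lists):
    prev, out = [-1], []          # -1: virtual sentence-start position
    for cur in position_lists:
        out.append(orientation(prev, cur))
        prev = cur
    return out

# Input positions of the tuples in the first example above:
positions = [[0], [1], [2], [3], [5, 6], [7], [8, 9], [4],
             [10, 11], [12], [14], [13], [15]]
labels = orientations(positions)
```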

Finally, a distance-based distortion model scores the reorderings applied to the input sentence, where f is the original sequence of input words (or POS tags) and f* is a permutation (reordering) of f. It is computed as:

p(f,e) = exp(− ∑_{j=1}^{J} |position(j) − position(j−1) − 1|)

where position(j) refers to the original position of the j-th input word of the hypothesis.
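Assuming the usual distance-based formulation, the distortion penalty can be computed as follows (a sketch; position indices are 0-based):

```python
def distortion_cost(positions):
    # Sum of |position(j) - position(j-1) - 1| over the input words of a
    # hypothesis, read in translation order; a monotone hypothesis costs 0.
    prev, cost = -1, 0
    for p in positions:
        cost += abs(p - prev - 1)
        prev = p
    return cost

monotone = distortion_cost([0, 1, 2, 3])
jumpy = distortion_cost([1, 0, 2])
```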