Given a source sentence and a target sentence, textual alignment is the process of matching each segment of the source sentence with a segment of the target sentence. When the two sentences are in the same language, it is called monolingual alignment.
- Word-level alignment: individual words of the source sentence are aligned with individual words of the target sentence.
We are ready to begin the program
We are set to start the program
- Chunk-level alignment: chunks in the source sentence are aligned with chunks in the target sentence. The first step is to segment each sentence into meaningful chunks.
[A man] [reclines] [with a baby in his lap]
[A man] [sits in a chair] [holding a baby]
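One way to make the two granularities concrete is to store an alignment as a list of index pairs over the segmented sentences. The sketch below is illustrative only: the `Alignment` class is a hypothetical helper, and the pairings assume the examples above are meant to be read positionally (first segment with first segment, and so on).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alignment:
    """Hypothetical container: segments plus (source_index, target_index) pairs."""
    source: List[str]
    target: List[str]
    pairs: List[Tuple[int, int]]

    def aligned_segments(self) -> List[Tuple[str, str]]:
        """Return the aligned segments as (source_text, target_text) tuples."""
        return [(self.source[i], self.target[j]) for i, j in self.pairs]


# Word-level: every segment is a single word.
word_source = "We are ready to begin the program".split()
word_target = "We are set to start the program".split()
word_alignment = Alignment(
    word_source,
    word_target,
    [(i, i) for i in range(len(word_source))],  # assumed positional pairing
)

# Chunk-level: segments are multi-word chunks produced by a prior chunking step.
chunk_alignment = Alignment(
    ["A man", "reclines", "with a baby in his lap"],
    ["A man", "sits in a chair", "holding a baby"],
    [(0, 0), (1, 1), (2, 2)],  # assumed positional pairing
)

for src, tgt in chunk_alignment.aligned_segments():
    print(f"{src!r} <-> {tgt!r}")
```

The same structure serves both levels; only the segmentation step differs.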
Some common techniques for computing these alignments:
- Brute-force, heuristic-based techniques: compute the similarity between each chunk (or word) in the source sentence and each chunk (or word) in the target sentence, then greedily assign pairings.
- Conditional random field (CRF) formulations are commonly used to model alignment problems; the actual alignment is then computed with the Viterbi algorithm.
- More recently, attention-based RNNs and CNNs have been used to solve this problem.
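The brute-force greedy heuristic from the first bullet can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the names `similarity` and `greedy_align` and the `threshold` parameter are assumptions, and the character-level score from Python's `difflib` is only a stand-in for whatever similarity measure (embeddings, lexical resources, and so on) a real system would use.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a placeholder for a real scorer."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_align(source, target, sim=similarity, threshold=0.0):
    """Pair segments one-to-one, taking the highest-scoring pairs first."""
    # Score every (source, target) combination: the brute-force step.
    scored = sorted(
        ((sim(s, t), i, j) for i, s in enumerate(source) for j, t in enumerate(target)),
        key=lambda item: -item[0],
    )
    used_source, used_target, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # all remaining pairs score lower; leave them unaligned
        if i in used_source or j in used_target:
            continue  # each segment may take part in at most one pair
        used_source.add(i)
        used_target.add(j)
        pairs.append((i, j))
    return sorted(pairs)


source = "We are ready to begin the program".split()
target = "We are set to start the program".split()


def exact(a, b):
    """Hypothetical scorer that only recognises identical words."""
    return 1.0 if a == b else 0.0


# With exact matching, only the words shared by both sentences are aligned.
exact_pairs = greedy_align(source, target, sim=exact, threshold=0.5)
print(exact_pairs)

# With the fuzzy scorer, the remaining words are paired by their best available score.
print(greedy_align(source, target))
```

Greedy assignment is simple and fast, but an early high-scoring pair can block a better overall pairing; this is one reason structured models such as CRFs decoded with Viterbi are used instead.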