The Binary String-to-String Correction Problem

Date

2013-08-30

Authors

Spreen, Thomas D.

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

String-to-String Correction is the process of transforming some mutable string M into an exact copy of some other string (the target string T), using a shortest sequence of well-defined edit operations. The formal STRING-TO-STRING CORRECTION problem asks for the optimal solution using just two operations: symbol deletion, and swap of adjacent symbols. String correction problems using only swaps and deletions are computationally interesting; in his paper On the Complexity of the Extended String-to-String Correction Problem (1975), Robert Wagner proved that the String-to-String Correction problem under swap and deletion operations only is NP-complete for unbounded alphabets. In this thesis, we present the first careful examination of the binary-alphabet case, which we call Binary String-to-String Correction (BSSC). We present several special cases of BSSC for which an optimal solution can be found in polynomial time; in particular, the case where T and M have an equal number of occurrences of a given symbol has a polynomial-time solution. As well, we demonstrate and prove several properties of BSSC, some of which do not necessarily hold in the case of String-to-String Correction. For instance: that the order of operations is irrelevant; that symbols in the mutable string, if swapped, will only ever swap in one direction; that the length of the Longest Common Subsequence (LCS) of the two strings is monotone nondecreasing during the execution of an optimal solution; and that there exists no correlation between the effect of a swap or delete operation on LCS, and the optimality of that operation. About a dozen other results that are applicable to Binary String-to-String Correction will also be presented.

Description

Keywords

transportation problem, np-hard, swap, deletion, binary string, Iversonian, target string, mutable string, longest common subsequence, lcs, binary string-to-string correction

Citation