Add another implementation of Levenshtein distance (#702)
* chore: add `edit_distance.rs` to DIRECTORY.md
* feat: implement Edit Distance algorithm
* chore(tests): add a few more checks
* chore: rename files
- rename `src/string/levenshtein_distance.rs` to `src/string/levenshtein_distance/optimized_dp.rs`
- move and rename `src/dynamic_programming/edit_distance.rs` to `src/string/levenshtein_distance/naive_dp.rs`
* chore: rename `levenshtein_distance` function
* chore: update DIRECTORY.md
* chore: update `mod.rs` files
* chore: format code with `fmt`
* chore: update DIRECTORY.md
* feat: implement Levenshtein distance in both naive and optimized versions using DP
* chore: update DIRECTORY.md
* chore(tests): update tests
* ref: Refactor tests for Levenshtein distance calculation
- Consolidated test cases into a constant array for improved readability and maintainability
- Simplified test structure by removing macro-based test generation, enhancing code clarity
- Introduced a `run_test_case` function to encapsulate test logic, enhancing test function conciseness.
- Organized test suite into separate modules for naive and optimized implementations, promoting code organization.
---------
Co-authored-by: Piotr Idzik <[email protected]>
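For orientation, the naive DP variant and the table-driven test layout described in the commit message could look roughly like the sketch below. This is not the code from the commit: only `run_test_case` is named in the message, while `naive_levenshtein_distance`, `LEVENSHTEIN_DISTANCE_TEST_CASES`, and the listed cases are assumptions made for illustration.

```rust
/// Naive DP sketch (hypothetical name): fills the full matrix.
/// Compares bytes, so non-ASCII input is measured in byte edits.
pub fn naive_levenshtein_distance(string1: &str, string2: &str) -> usize {
    let a = string1.as_bytes();
    let b = string2.as_bytes();

    // cell[i][j] = distance between the first i bytes of `a`
    // and the first j bytes of `b`.
    let mut cell = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for (i, row) in cell.iter_mut().enumerate() {
        row[0] = i; // delete i bytes to reach the empty prefix
    }
    for (j, value) in cell[0].iter_mut().enumerate() {
        *value = j; // insert j bytes starting from the empty prefix
    }

    for i in 1..=a.len() {
        for j in 1..=b.len() {
            let substitution_cost = usize::from(a[i - 1] != b[j - 1]);
            cell[i][j] = (cell[i - 1][j] + 1) // deletion
                .min(cell[i][j - 1] + 1) // insertion
                .min(cell[i - 1][j - 1] + substitution_cost); // substitution or match
        }
    }
    cell[a.len()][b.len()]
}

/// Hypothetical shared table: (string1, string2, expected distance).
pub const LEVENSHTEIN_DISTANCE_TEST_CASES: &[(&str, &str, usize)] = &[
    ("", "", 0),
    ("", "abc", 3),
    ("same", "same", 0),
    ("flaw", "lawn", 2),
    ("kitten", "sitting", 3),
];

/// Runs every case in the table against one implementation.
pub fn run_test_case(distance: fn(&str, &str) -> usize) {
    for &(string1, string2, expected) in LEVENSHTEIN_DISTANCE_TEST_CASES {
        assert_eq!(
            distance(string1, string2),
            expected,
            "{string1:?} vs {string2:?}"
        );
    }
}
```

With this layout, each implementation's test module only needs to call `run_test_case` with its own function, for example `run_test_case(naive_levenshtein_distance)`.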
 /// Calculates the Levenshtein distance between two strings using an optimized dynamic programming approach.
 ///
-/// For a detailed explanation, check the example on Wikipedia: <https://en.wikipedia.org/wiki/Levenshtein_distance>\
-/// (see the examples with the matrices, for instance between KITTEN and SITTING)
+/// This edit distance is defined as 1 point per insertion, substitution, or deletion required to make the strings equal.
 ///
-/// Note that although we compute a matrix, left-to-right, top-to-bottom, at each step all we need to compute `cell[i][j]` is:
-/// - `cell[i][j-1]`
-/// - `cell[i-1][j]`
-/// - `cell[i-1][j-1]`
+/// # Arguments
 ///
-/// This can be achieved by only using one "rolling" row and one additional variable, when computing `cell[i][j]` (or `row[i]`):
-/// - `cell[i][j-1]` is the value to the left, on the same row (the one we just computed, `row[i-1]`)
-/// - `cell[i-1][j]` is the value at `row[i]`, the one we're changing
-/// - `cell[i-1][j-1]` was the value at `row[i-1]` before we changed it, for that we'll use a variable
+/// * `string1` - The first string.
+/// * `string2` - The second string.
+///
+/// # Returns
+///
+/// The Levenshtein distance between the two input strings.
+/// For a detailed explanation, check the example on [Wikipedia](https://en.wikipedia.org/wiki/Levenshtein_distance).
+/// This function iterates over the bytes in the string, so it may not behave entirely as expected for non-ASCII strings.
 ///
-/// Doing this reduces space complexity from O(nm) to O(n)
+/// Note that this implementation utilizes an optimized dynamic programming approach, significantly reducing the space complexity from O(nm) to O(n), where n and m are the lengths of `string1` and `string2`.
 ///
-/// Second note: if we want to minimize space, since we're now O(n) make sure you use the shortest string horizontally, and the longest vertically
+/// Additionally, it minimizes space usage by leveraging the shortest string horizontally and the longest string vertically in the computation matrix.
 ///
 /// # Complexity
-/// - time complexity: O(nm),
-/// - space complexity: O(n),
 ///
-/// where n and m are lengths of `str_a` and `str_b`
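The rolling-row idea described in the doc comments above can be sketched as follows. This is a minimal illustration written for this page, not the function body from the commit; the name `optimized_levenshtein_distance` and the local variable names are assumptions. It keeps one row plus one variable for the overwritten diagonal value, and it places the shorter string horizontally so the row has length min(n, m) + 1.

```rust
/// Rolling-row sketch (hypothetical name): one row plus one saved diagonal value.
/// Compares bytes, so non-ASCII input is measured in byte edits.
pub fn optimized_levenshtein_distance(string1: &str, string2: &str) -> usize {
    // Lay the shorter string out horizontally so the row is as small as possible.
    let (horizontal, vertical) = if string1.len() <= string2.len() {
        (string1.as_bytes(), string2.as_bytes())
    } else {
        (string2.as_bytes(), string1.as_bytes())
    };

    // First matrix row: distance from the empty prefix to each horizontal prefix.
    let mut row: Vec<usize> = (0..=horizontal.len()).collect();

    for (i, &vertical_byte) in vertical.iter().enumerate() {
        // `diagonal` is the previous row's value one column to the left,
        // saved before that slot is overwritten.
        let mut diagonal = row[0];
        // First column: distance from a vertical prefix to the empty prefix.
        row[0] = i + 1;

        for (j, &horizontal_byte) in horizontal.iter().enumerate() {
            let above = row[j + 1]; // previous row, same column (not yet overwritten)
            let left = row[j]; // current row, previous column (just computed)
            let substitution_cost = usize::from(vertical_byte != horizontal_byte);

            row[j + 1] = (above + 1)
                .min(left + 1)
                .min(diagonal + substitution_cost);

            // The old value of this slot is the diagonal for the next column.
            diagonal = above;
        }
    }
    row[horizontal.len()]
}
```

Because the comparison is byte-wise, `optimized_levenshtein_distance("é", "e")` yields 2 rather than 1: `é` occupies two bytes in UTF-8. A variant that collects `chars()` into vectors first would count Unicode scalar values instead, at the cost of extra allocation.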