Optimize regex in trimAdjacentBlankLines() method of ExtractedTextFormatter to prevent stack overflow

Closes 2247 issue

Signed-off-by: Iryna Kopchak <[email protected]>
solenyk authored and i-kopchak committed Feb 14, 2025
1 parent 7bc79f5 commit 2572e69
Showing 1 changed file with 3 additions and 2 deletions.
@@ -32,8 +32,9 @@
* An instance of this formatter can be customized using the {@link Builder} nested class.
*
* @author Christian Tzolov
+* @author Iryna Kopchak
*/
-public final class ExtractedTextFormatter {
+public class ExtractedTextFormatter {

/** Flag indicating if the text should be left-aligned */
private final boolean leftAlignment;
@@ -84,7 +85,7 @@ public static ExtractedTextFormatter defaults() {
* @return Returns the same text but with blank lines trimmed.
*/
public static String trimAdjacentBlankLines(String pageText) {
-return pageText.replaceAll("(?m)(^ *\n)", "\n").replaceAll("(?m)^$([\r\n]+?)(^$[\r\n]+?^)+", "$1");
+return pageText.replaceAll("(?m)^(?:\\s*\\r?\\n)+", "\n");
}

/**
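To illustrate the replacement expression introduced in this commit, here is a minimal standalone sketch. The class name `BlankLineTrimSketch` and the sample inputs are hypothetical and not part of the commit; only the regular expression is copied from the added line. The sketch assumes Java 11 or later (for `String.repeat`) and exercises only the new expression. It does not try to reproduce the stack overflow mentioned in the commit message.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regular expression comes from the commit.
final class BlankLineTrimSketch {

    // Same expression as the added line in the diff, compiled once for reuse.
    private static final Pattern BLANK_LINE_RUN =
            Pattern.compile("(?m)^(?:\\s*\\r?\\n)+");

    // Replaces each run of whitespace-only lines with a single newline.
    static String trimAdjacentBlankLines(String pageText) {
        return BLANK_LINE_RUN.matcher(pageText).replaceAll("\n");
    }

    // Makes newlines visible when printing results.
    private static String visible(String s) {
        return s.replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Two consecutive blank lines collapse into one.
        System.out.println(visible(trimAdjacentBlankLines("a\n\n\nb")));      // a\n\nb

        // Lines containing only spaces or tabs count as blank.
        System.out.println(visible(trimAdjacentBlankLines("a\n   \n\t\nb"))); // a\n\nb

        // Indentation on a non-blank line is left alone.
        System.out.println(visible(trimAdjacentBlankLines("a\n  b")));        // a\n  b

        // A long run of blank lines, as a rough check on large inputs.
        String large = "a" + "\n".repeat(50_000) + "b";
        System.out.println(visible(trimAdjacentBlankLines(large)));           // a\n\nb
    }
}
```

A single blank line between two paragraphs is preserved, because the match starts at the beginning of the first blank line and the newline that ends the preceding text line is never consumed.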
