Java matcher does not find match, even though the regex works separately

I’m trying to get a ‘teaser’ of a given String and put it as a value into a HashMap. By ‘teaser’ I mean a substring (max length 50 characters) ending at a word boundary.
Here’s a code sample showing how I’m trying to do it:
import java.util.regex.*;

public class Test {
    public static void main(String[] args) throws Exception {
        final Pattern pattern = Pattern.compile("(^.{0,50}\b)");
        final Matcher m = pattern.matcher(
                "This is a long string that I want to find a shorter teaser for.");
        if (m.find()) {
            System.out.println("Found: " + m.group(1));
        } else {
            System.out.println("No match");
        }
    }
}

I expected it to print:
Found: This is a long string that I want to find a

But instead it prints:
No match

If I test this regex separately, it does what it should: it finds a substring with a maximum length of 50 characters that ends on a word boundary. But when I debug it, m.find() always returns false.
Any ideas how to solve this? (I’m focused on getting the teaser, not on using Matcher.find() 😉 )

Solutions/Answers:

Solution 1:

According to the Oracle documentation on Characters, \b inside a Java String literal is the escape sequence for backspace, so the pattern you compile looks for a literal backspace character, which your input does not contain. You want \b as the regex word boundary, so you need to escape the backslash itself, i.e. write \\b, so that Pattern.compile sees \b:

Pattern.compile("(^.{0,50}\\b)")
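Applied to the program from the question, the fix might look like the sketch below. The class name TeaserDemo and the helper teaser() are made up for illustration and are not part of the original code. Note that the raw match ends at the word boundary just before "shorter", so it includes a trailing space; the helper trims it, which is an assumption about what you want in the HashMap.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TeaserDemo {
    // "\\b" in Java source is the two characters '\' and 'b',
    // which the regex engine reads as a word boundary.
    private static final Pattern TEASER = Pattern.compile("^.{0,50}\\b");

    // Hypothetical helper: returns the trimmed teaser,
    // or an empty string if the pattern does not match.
    static String teaser(String text) {
        Matcher m = TEASER.matcher(text);
        return m.find() ? m.group().trim() : "";
    }

    public static void main(String[] args) {
        String text = "This is a long string that I want to find a shorter teaser for.";
        System.out.println("Found: " + teaser(text));
    }
}
```

With the sample sentence this should print the expected line, "Found: This is a long string that I want to find a".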

You can see this effect by calling .toCharArray() on a String:

Single backslash

System.out.println(Arrays.toString("\b".toCharArray()));
=> [] (the array holds one non-printing backspace character)

Double backslash

System.out.println(Arrays.toString("\\b".toCharArray()));
=> [\, b]
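Because the backspace character is invisible when printed, checking the length and the character code can make the difference easier to see. This is only a small sketch; the class name EscapeDemo and the constants SINGLE and DOUBLED are made up for illustration.

```java
public class EscapeDemo {
    // One character: the backspace control character (U+0008).
    static final String SINGLE = "\b";
    // Two characters: a backslash followed by the letter 'b'.
    static final String DOUBLED = "\\b";

    public static void main(String[] args) {
        System.out.println(SINGLE.length());        // 1
        System.out.println((int) SINGLE.charAt(0)); // 8
        System.out.println(DOUBLED.length());       // 2
        System.out.println(DOUBLED);                // \b
    }
}
```

Only the second string hands the two characters \b to Pattern.compile, which is what the regex engine needs to recognise a word boundary.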
