Sift is a lightweight, type-safe Java library designed to construct complex Regular Expressions through a readable, object-oriented API. It eliminates the maintenance burden of cryptic string-based regex by applying SOLID principles to pattern construction.
Regular expressions are powerful but often become "write-only" code. Sift transforms regex creation into a structured coding process, ensuring that patterns are:
- Readable: The syntax mirrors natural language, making the intent of the pattern obvious.
- Type-Safe: A Type-State machine (QuantifierStep -> TypeStep -> ConnectorStep) enforces grammatical correctness at compile time.
- Composable: Complex patterns are built by combining smaller, reusable SiftPattern objects.
- Extensible: Adheres strictly to the Open/Closed Principle, allowing you to define custom domain grammars without modifying the core library.
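To make the Type-State idea concrete, here is a minimal sketch of how such a step chain can be expressed with plain Java interfaces. It is not Sift's source code: the class `TypeStateSketch`, its `Builder`, and the reduced method set are hypothetical and only borrow the step names used in this README. Each method returns the interface for the next legal step, so an out-of-order call simply has no method to resolve to.

```java
import java.util.regex.Pattern;

public class TypeStateSketch {

    // Step 1: optionally choose how many times the next element occurs.
    // Extending TypeStep lets callers skip the quantifier (meaning "exactly once").
    public interface QuantifierStep extends TypeStep {
        TypeStep exactly(int n);
        TypeStep atLeast(int n);
    }

    // Step 2: choose what to match.
    public interface TypeStep {
        ConnectorStep digits();
        ConnectorStep letters();
    }

    // Step 3: either continue with another element or produce the regex.
    public interface ConnectorStep {
        QuantifierStep followedBy();
        String shake();
    }

    // One hypothetical builder implements every step; the interfaces restrict
    // which methods are visible at each point in the chain.
    static final class Builder implements QuantifierStep, ConnectorStep {
        private final StringBuilder regex = new StringBuilder();
        private String quantifier = "";

        @Override public TypeStep exactly(int n) { quantifier = "{" + n + "}"; return this; }
        @Override public TypeStep atLeast(int n) { quantifier = "{" + n + ",}"; return this; }
        @Override public ConnectorStep digits() { return append("[0-9]"); }
        @Override public ConnectorStep letters() { return append("[a-zA-Z]"); }
        @Override public QuantifierStep followedBy() { return this; }
        @Override public String shake() { return regex.toString(); }

        private ConnectorStep append(String charClass) {
            regex.append(charClass).append(quantifier);
            quantifier = ""; // the quantifier applies to one element only
            return this;
        }
    }

    public static QuantifierStep anywhere() {
        return new Builder();
    }

    public static void main(String[] args) {
        String regex = anywhere()
            .exactly(3).letters()
            .followedBy()
            .atLeast(2).digits()
            .shake();
        System.out.println(regex);                          // [a-zA-Z]{3}[0-9]{2,}
        System.out.println(Pattern.matches(regex, "abc42")); // true

        // Neither of these compiles, which is the point of the technique:
        // anywhere().exactly(3).exactly(2);  // TypeStep has no exactly()
        // anywhere().digits().digits();      // ConnectorStep has no digits()
    }
}
```

The design choice is that invalid orderings are rejected by the compiler instead of surfacing as a malformed pattern at runtime.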
Sift is modularized to keep your dependency graph clean.
dependencies {
    // Core Engine: Fluent API for Regex generation (Zero external dependencies)
    implementation 'com.mirkoddd:sift-core:1.1.0'
    // Optional: Integration with Jakarta Validation / Hibernate Validator
    implementation 'com.mirkoddd:sift-annotations:1.1.0'
}

To strictly validate an entire string (e.g., a simple Username rule), use Sift.fromStart() and untilEnd():
import static com.mirkoddd.sift.Sift.*;
String regex = fromStart()
    .letters()
    .followedBy()
    .atLeast(3).alphanumeric()
    .untilEnd()
    .shake();
// Result: ^[a-zA-Z][a-zA-Z0-9]{3,}$

If you need to find a pattern inside a larger text (like scanning logs), use Sift.anywhere():
import static com.mirkoddd.sift.Sift.*;
import static com.mirkoddd.sift.SiftPatterns.*;
String priceRegex = anywhere()
    .followedBy(literal("Cost: $"))
    .followedBy().oneOrMore().digits()
    .withOptional(
        anywhere().followedBy('.').followedBy().exactly(2).digits()
    )
    .shake();
// Result: Cost: \$[0-9]+(?:\.[0-9]{2})?

- Entry Points (Sift)
  - fromStart(): Anchors the regex to the beginning of the string (^).
  - anywhere(): Creates a free-floating pattern.
  - wordBoundary(): Anchors to a word boundary (\b).
- Quantifiers (QuantifierStep)
  Define how many times an element should occur:
  - .exactly(n)
  - .atLeast(n)
  - .oneOrMore()
  - .zeroOrMore()
  - .optional()
  - .withOptional(char | SiftPattern): Syntactic sugar to make an entire block or character optional.
- Character Types (TypeStep)
  Define what to match:
  - .digits(): [0-9]
  - .letters(): [a-zA-Z]
  - .lettersLowercaseOnly(): [a-z]
  - .lettersUppercaseOnly(): [A-Z]
  - .alphanumeric(): [a-zA-Z0-9]
  - .any(): . (dot)
  - .followedBy(char | SiftPattern): Match literals or complex sub-patterns.
- Refinements (ConnectorStep)
  Modify existing character classes using logical intersection/subtraction:
// Lowercase letters excluding vowels
char[] vowels = {'a', 'e', 'i', 'o', 'u'};
String regex = anywhere()
    .oneOrMore().lettersLowercaseOnly()
    .excluding(vowels)
    .shake();
// Result: [a-z&&[^aeiou]]+

- Logic and Groups (SiftPatterns)
  Use static imports from SiftPatterns for advanced composition:
  - anyOf(SiftPattern...): Logical OR (?:A|B).
  - capture(SiftPattern): Anonymous capturing group (...).
  - capture(String, SiftPattern): Named capturing group (?<name>...).
  - literal(String): Safely escapes plain text.
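The "Result" strings quoted in the examples above are ordinary `java.util.regex` patterns, so their behavior can be checked without Sift on the classpath. The sketch below copies those three strings verbatim and runs them against sample inputs; the class name `SiftOutputCheck`, the helper `firstMatch`, and the sample inputs are illustrative assumptions, not part of the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SiftOutputCheck {

    // Regex strings copied from the "Result" comments in this README.
    public static final String USERNAME = "^[a-zA-Z][a-zA-Z0-9]{3,}$";
    public static final String PRICE = "Cost: \\$[0-9]+(?:\\.[0-9]{2})?";
    public static final String CONSONANTS = "[a-z&&[^aeiou]]+";

    // Returns the first substring of text matched by regex, or null if none.
    public static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Anchored pattern: the whole string must match.
        System.out.println(Pattern.matches(USERNAME, "neo42")); // true
        System.out.println(Pattern.matches(USERNAME, "4neo"));  // false (starts with a digit)

        // Free-floating pattern: found inside a longer line.
        System.out.println(firstMatch(PRICE, "Order 17. Cost: $12.50 (paid)")); // Cost: $12.50

        // Character class intersection: lowercase letters minus vowels.
        System.out.println(firstMatch(CONSONANTS, "rhythm and blues")); // rhythm

        // A named capturing group combined with a non-capturing alternation,
        // the constructs listed for capture(String, SiftPattern) and anyOf(...).
        Matcher m = Pattern.compile("(?<year>[0-9]{4})-(?:01|02)").matcher("2024-02");
        if (m.matches()) {
            System.out.println(m.group("year")); // 2024
        }
    }
}
```

Note the doubled backslashes: `\$` and `\.` in the generated regex become `\\$` and `\\.` inside a Java string literal.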
Sift natively integrates with the Jakarta Validation API (JSR-380). This prevents regex duplication across your codebase by allowing you to define rules as reusable providers.
- Define a Rule
  Implement the SiftRegexProvider interface:
public class NumericPinRule implements SiftRegexProvider {
    @Override
    public String getRegex() {
        return Sift.fromStart()
            .exactly(5).digits()
            .untilEnd()
            .shake();
    }
}

- Annotate your DTO
  Use @SiftMatch directly on your fields. The validator engine will compile the pattern automatically.
import com.mirkoddd.sift.SiftMatch;

public class RegistrationDto {
    @SiftMatch(
        value = NumericPinRule.class,
        message = "PIN must be exactly 5 digits"
    )
    private String pinCode;
}

Sift is designed to be extended. You don't need to modify the library to add new domain-specific grammar. Just create methods that return a SiftPattern (a functional interface with a single .shake() method) and inject them using .followedBy() or .withOptional().
// 1. Create your custom extension
SiftPattern datePattern = Sift.anywhere()
    .exactly(4).digits().followedBy('-')
    .followedBy().exactly(2).digits().followedBy('-')
    .followedBy().exactly(2).digits();
// 2. Use it in the fluent chain
String logRegex = Sift.fromStart()
    .followedBy(SiftPatterns.literal("[INFO] "))
    .followedBy(datePattern)
    .untilEnd()
    .shake();

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.