Fix regex group counting for get_matches

Add * as a character to match after an opening parenthesis, since it
may be used with backtracking verbs (e.g. (*FAIL), (*SKIP)). This is
probably a very rare use case, but since the fix is easy, let's include
it. Also add a ToDo, since we should probably find a better way to count
capturing groups, although I cannot think of one.

Change-Id: Idcb303b4740530af9d3f009414d35d68f59effd0
Daimona Eaytoy 2018-11-01 10:21:36 +01:00
parent 917895b92c
commit 16475c0266

@@ -1062,11 +1062,14 @@ class AbuseFilterParser {
 // Count the amount of capturing groups in the submitted pattern.
 // This way we can return a fixed-dimension array, much easier to manage.
+// ToDo: Find a better way to do this.
 // First, strip away escaped parentheses
 $sanitized = preg_replace( '/(\\\\\\\\)*\\\\\(/', '', $needle );
-// Then strip starting parentheses of non-capturing groups
-// (also atomics, lookahead and so on, even if not every of them is supported)
-$sanitized = preg_replace( '/\(\?/', '', $sanitized );
+// Then strip starting parentheses of non-capturing groups, including
+// atomics, lookaheads and so on, even if not every of them is supported.
+$sanitized = str_replace( '(?', '', $sanitized );
+// And also strip "(*", used with backtracking verbs like (*FAIL)
+$sanitized = str_replace( '(*', '', $sanitized );
 // Finally create an array of falses with dimension = # of capturing groups
 $groupscount = substr_count( $sanitized, '(' ) + 1;
 $falsy = array_fill( 0, $groupscount, false );
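
The counting steps in this hunk can be tried in isolation with a small sketch like the one below. The function name countCapturingGroupsSketch and the sample patterns are hypothetical and are not part of AbuseFilterParser; only the three stripping steps and the substr_count call are taken from the diff.

```php
<?php
// Hypothetical standalone helper (an assumption for illustration only):
// it repeats the stripping steps from the diff and returns the number of
// opening parentheses that are left, i.e. the estimated capturing groups.
function countCapturingGroupsSketch( string $needle ): int {
	// Strip escaped opening parentheses such as "\(".
	$sanitized = preg_replace( '/(\\\\\\\\)*\\\\\(/', '', $needle );
	// Strip "(?", which opens non-capturing groups, lookaheads, atomics, etc.
	$sanitized = str_replace( '(?', '', $sanitized );
	// Strip "(*", which opens backtracking verbs like (*FAIL) or (*SKIP).
	$sanitized = str_replace( '(*', '', $sanitized );
	// Every remaining "(" is assumed to open a capturing group.
	return substr_count( $sanitized, '(' );
}

// As in the diff, one extra slot is reserved for the whole match (index 0).
$groupscount = countCapturingGroupsSketch( '(*FAIL)(a)(?:b)' ) + 1;
$falsy = array_fill( 0, $groupscount, false );
```

With the sample pattern above, only "(a)" is left as a capturing group, so $falsy has two elements. The approach stays a heuristic: a literal parenthesis inside a character class, as in '[(]', is still counted as a group, which is one reason the ToDo seems justified.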