Can you please help me match a pattern that may contain line breaks? On [PHPRegexLive](https://www.phpliveregex.com/), I use the regex pattern {{\s*IF(.+)}}(.+){{\s*ENDIF}} on the search string: before if....{{IF !empty('')}} <div class='h6 mt-4 mb-2 edit-btn-container'>About</div> {{ENDIF}} after if.... The result is fine: array[0] = the entire {{IF <condition>}}...{{ENDIF}} string, array[1] = <condition>, and array[2] = whatever is between {{IF <condition>}} and {{ENDIF}}. The problem is when the entire {{IF <condition>}}...{{ENDIF}} block spans more than one line, such as before if....{{IF !empty('')}} <div class='h6 mt-4 mb-2 edit-btn-container'>About</div> {{ENDIF}} after if.... (here with line breaks inside the block). I tried different combinations of \n*, \n*\r*, etc., and the s and m modifiers, but cannot get it to work.
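Look into the preg_match pattern modifiers 's' (single line, PCRE_DOTALL) and 'm' (multiline). The 's' modifier is the one that makes . match newlines; 'm' only changes where ^ and $ match. Also, instead of IF(.+)}} I would try IF(.*?)}}. The reason is that with the first form, the greedy .+ can run past the first }} and stop at a later one, such as the }} that follows ENDIF. .*? is a non-greedy match, i.e. it stops at the first }} it finds.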
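To make the advice above concrete, here is a minimal sketch using the sample string from the question. The placement of the line breaks in $text is an assumption, and the braces are escaped only as a precaution.

```php
<?php
// Sample input from the question; the position of the line breaks is assumed.
$text = "before if....{{IF !empty('')}}\n"
      . "<div class='h6 mt-4 mb-2 edit-btn-container'>About</div>\n"
      . "{{ENDIF}} after if....";

// 's' lets '.' match newlines; '.+?' and '.*?' are lazy, so each group
// stops at the first place where the rest of the pattern can match.
$pattern = '/\{\{\s*IF(.+?)\}\}(.*?)\{\{\s*ENDIF\}\}/s';

$matched   = preg_match($pattern, $text, $m) === 1;
$condition = $matched ? trim($m[1]) : null;  // text between IF and }}
$body      = $matched ? trim($m[2]) : null;  // text between }} and {{ENDIF}}

var_dump($matched, $condition, $body);
```

Without the s modifier the same pattern fails on this input, because . does not match the newline characters inside the block.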
By the way, try to avoid unbounded quantifiers like * and +. Use {1,n} instead of + and {0,n} instead of *, where n is the maximum number of characters you expect in a valid match. The reason is that an unbounded search can lead to very slow regex matches on large input, where a match is not guaranteed. Also, instead of . you can use a definite set of characters or a stopper. It seems like you never want to go further than a { and/or } character, so you can write [^}] and [^{] instead of the dot. And finally, it is better to escape { and }, as they are normally used for quantifier ranges (see above). So the final regex might look a bit more obfuscated than yours, but it is much safer and faster to use:
Code: \{\{\s*IF([^}]{1,100})\}\}([^{]{1,100})\{\{\s{0,100}ENDIF\}\}
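A quick sketch of how that bounded pattern behaves, again on the sample string from the question (the line-break positions and the second test string are assumptions made for illustration). Note that no s modifier is needed here, since a negated class such as [^{] already matches newlines.

```php
<?php
// Sample input from the question; the position of the line breaks is assumed.
$text = "before if....{{IF !empty('')}}\n"
      . "<div class='h6 mt-4 mb-2 edit-btn-container'>About</div>\n"
      . "{{ENDIF}} after if....";

// Bounded quantifiers and negated classes instead of '.', '*' and '+'.
$pattern = '/\{\{\s*IF([^}]{1,100})\}\}([^{]{1,100})\{\{\s{0,100}ENDIF\}\}/';

$matched   = preg_match($pattern, $text, $m) === 1;
$condition = $matched ? trim($m[1]) : null;
$body      = $matched ? trim($m[2]) : null;

// Hypothetical input whose body is longer than the 100-character limit.
$long        = '{{IF x}}' . str_repeat('a', 101) . '{{ENDIF}}';
$longMatched = preg_match($pattern, $long) === 1;

var_dump($matched, $condition, $body, $longMatched);
```

The trade-off is that the limits become part of the grammar: a body longer than 100 characters, or one that contains a single { anywhere, will no longer match, so pick n to fit your real templates.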
Since IF statements can be nested (as they can in PHP), a single regular expression that searches for an IF followed by an ENDIF will not be able to match arbitrarily nested IFs properly. See https://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns
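A small sketch of the nesting problem, assuming a made-up template with one IF inside another. The lazy pattern pairs the outer IF with the first ENDIF it meets, which belongs to the inner block.

```php
<?php
// Hypothetical nested template: the outer block should run to the last ENDIF.
$nested = '{{IF a}}A{{IF b}}B{{ENDIF}}C{{ENDIF}}';

$pattern = '/\{\{\s*IF(.+?)\}\}(.*?)\{\{\s*ENDIF\}\}/s';

preg_match($pattern, $nested, $m);

$whole     = $m[0];        // what the regex treats as one complete block
$condition = trim($m[1]);  // condition of the outer IF
$body      = $m[2];        // cut off at the inner ENDIF

var_dump($whole, $condition, $body);
```

The match ends at the inner {{ENDIF}}, so the trailing C{{ENDIF}} is left outside the block and the body contains an unclosed {{IF b}}.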
You might want to look into compiler theory, with tools like lex and yacc. A lexer identifies the individual grammar tokens (in your case IF, ENDIF, "{", ...), and a parser then uses a syntax specification to turn them into something meaningful (an abstract syntax tree in the compiler/interpreter case). Not sure what you're trying to achieve, but parsing grammars is not trivial, so prepare for pain.
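For a rough idea of the lexer-plus-parser approach without lex/yacc, here is a minimal hand-written sketch. The function name parseTemplate and the array-based tree layout are hypothetical, and it only understands {{IF ...}} and {{ENDIF}}; any other {{...}} tag is kept as plain text.

```php
<?php
// Hypothetical helper: turn a template into a nested tree of text and IF nodes.
function parseTemplate(string $template): array
{
    // Lexer step: split on {{...}} tags and keep the tags as tokens.
    $tokens = preg_split('/(\{\{.*?\}\})/s', $template, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // Parser step: a stack of open nodes; the root node is always at index 0.
    $stack = [['type' => 'root', 'cond' => null, 'children' => []]];

    foreach ($tokens as $token) {
        if (preg_match('/^\{\{\s*IF\b(.*)\}\}$/s', $token, $m)) {
            // Opening tag: start a new IF node and descend into it.
            $stack[] = ['type' => 'if', 'cond' => trim($m[1]), 'children' => []];
        } elseif (preg_match('/^\{\{\s*ENDIF\s*\}\}$/', $token)) {
            if (count($stack) < 2) {
                throw new RuntimeException('ENDIF without matching IF');
            }
            // Closing tag: attach the finished IF node to its parent.
            $node = array_pop($stack);
            $stack[array_key_last($stack)]['children'][] = $node;
        } else {
            // Anything else is literal text inside the current node.
            $stack[array_key_last($stack)]['children'][] =
                ['type' => 'text', 'value' => $token];
        }
    }

    if (count($stack) !== 1) {
        throw new RuntimeException('IF without matching ENDIF');
    }
    return $stack[0];
}

$tree = parseTemplate('x{{IF a}}A{{IF b}}B{{ENDIF}}C{{ENDIF}}y');
print_r($tree);
```

Because the stack tracks which IF is currently open, each ENDIF closes the innermost block, which is the part a single IF...ENDIF regex cannot do.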