Hi,
I found out about this package because Magento 2 uses it in its build process. The build process is very time consuming, and part of this time is spent compressing JavaScript files with JShrink. When I looked at the code, it occurred to me that a different approach might make it much faster. This approach would be built around the PHP function preg_replace_callback, which would process the entire JavaScript file in one call.
Here's a bit of (very incomplete) code I used to test if this would work:
$exp = "~(
/\*!.*?\*/ | # /* license */
/\*.*?\*/ | # /* comment */
//[^\n]* | # // comment
/(?:\\\\/|[^/])+/[dgimsuy]*[ ]*[,;\n)] | # reg exp: /(ape|monkey)\/banana/mi;
\"(?:\\\\\"|[^\"])*\" | # double quoted string
'(?:\\\\'|[^'])*' | # single quoted string
(?P<negatives1>--?)\s+(?P<negatives2>--?) | # a - --b a-- - b
(?P<positives1>\+\+?)\s+(?P<positives2>\+\+?) | # a + ++b a++ + b
(?:return|var) | # operator keyword
[ \t\n]+ # whitespace
)~xs";
$normalized = str_replace(["\r\n", "\r"], ["\n", "\n"], $js);
$result = preg_replace_callback($exp, function ($matches) {
    $match = $matches[1];
    switch ($match[0]) {
        case '"':
            // remove line continuation
            return str_replace("\\\n", "", $match);
        case ' ':
        case "\n":
        case "\t":
            // drop whitespace runs
            return '';
    }
    switch (substr($match, 0, 2)) {
        case '//':
            return '';
        case '/*':
            // keep /*! license */ comments, drop all other block comments
            return strncmp($match, '/*!', 3) === 0 ? $match : '';
    }
    switch ($match) {
        case 'var':
            return 'var ';
        case 'return':
            return 'return ';
    }
    // keep one space between adjacent operators so "a - --b" does not become "a---b"
    if (isset($matches['negatives1']) && $matches['negatives1'] !== "") {
        return $matches['negatives1'] . " " . $matches['negatives2'];
    }
    if (isset($matches['positives1']) && $matches['positives1'] !== "") {
        return $matches['positives1'] . " " . $matches['positives2'];
    }
    // strings and regular expressions are copied unchanged
    return $match;
}, $normalized);
So, basically, the outermost loop is replaced by a single preg_replace_callback call. It is much faster because the inner loop then runs in C code (preg_replace_callback is implemented in C) rather than in PHP code.
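To show the idea in isolation, here is a minimal sketch with a much smaller rule set. It is not JShrink code: the function name squeezeJs is made up for this example, and it assumes the input already uses "\n" line endings. It only drops // comments and collapses whitespace runs to one space, while string literals are matched first so their contents are never touched.

```php
<?php
// Hypothetical helper for illustration only; not part of JShrink.
// One regex pass over the whole input: each alternative is one token type.
function squeezeJs(string $js): string
{
    $pattern = '~
        "(?:\\\\.|[^"\\\\])*"        # double-quoted string, kept as is
      | \'(?:\\\\.|[^\'\\\\])*\'     # single-quoted string, kept as is
      | [ \t]*//[^\n]*              # line comment with leading blanks, dropped
      | [ \t\n]+                    # whitespace run, collapsed to one space
    ~xs';

    $out = preg_replace_callback($pattern, function (array $m): string {
        $token = $m[0];
        if ($token[0] === '"' || $token[0] === "'") {
            return $token;          // string literal: copy unchanged
        }
        if (strpos($token, '//') !== false) {
            return '';              // line comment: remove
        }
        return ' ';                 // whitespace run: collapse
    }, $js);

    if ($out === null) {
        // preg_replace_callback returns null when PCRE fails (e.g. backtrack limit)
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }
    return $out;
}
```

With this sketch, squeezeJs("var s = \"a  // b\";   // trailing comment\nvar t = 'x';") returns var s = "a  // b"; var t = 'x'; so the // inside the string survives while the real comment is removed. Everything that matches no alternative (identifiers, operators, punctuation) is copied through by the regex engine without ever reaching PHP code.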
Before I work this out in detail, I was wondering whether you are open to this complete rewrite. I am willing to do the rewrite myself, but in some respects the new code will not be backwards compatible with the existing code. For example, the existing code checks whether a regular expression is well formed and throws a RuntimeException if it is not; the new code will not do that. There will undoubtedly be some other minor points that are not completely backwards compatible. So the question is: are you willing to lose some compatibility for a significant increase in processing speed?
If you are interested, I would like the help of some beta testers to debug my code. I only just had this idea, and that doesn't mean I won't make mistakes working it out.
Of course I could start a repository of my own, but then it would have to build its user base from scratch. Rewriting JShrink itself could benefit all of your existing users.
What do you think?