The New Sizzle

Posted on by

In order to make your 4th of July more sizzlin’ (you’re welcome), the jQuery team is happy to announce that Sizzle, jQuery’s CSS selector engine, is better, faster, and more reliable than ever! Sizzle has received a substantial rewrite to be included with the release of jQuery 1.8.

First, credit should be given to Diego Perini for pointing me in the right direction as well as Samuel Lebeau for creating a project 3 years ago called Bouncer, a “fast bottom-up element matcher for Javascript”.

jQuery, along with Sizzle, was released in 2006, about 3 years after Simon Willison came out with getElementsBySelector, which pretty much set the stage for every selector engine we have today. As time went on, Sizzle was rewritten a few times for the sake of performance and more and more bugs were covered as the number of people using it increased.

During this time, other quite impressive selector engines were introduced, including but not limited to NWMatcher (by Diego), dojo.query, Slick, base2, qwery, and YUI. Though they all have their own strengths, NWMatcher and Dojo particularly stood out as exemplary engines. While neither is the fastest at every selection, they are both consistently fast for almost every selector. My goal was to achieve this same level of performance for Sizzle, retain all of the edge-case bug fixes that John and the bugs team have collected over the years, and cover even more bugs that were in the queue or were covered by other engines. I can now safely say this goal has been reached.

While I won’t say Sizzle is completely bug free or it is always the fastest in every situation, the reliability and performance gains are very competitive. http://jsfiddle.net/timmywil/s7rN4/ is a primitive test used to quickly observe the differences between some selector engines for several selectors (should be run in a browser with a console open).

What changed

Below is a reduced list of the main code differences between Sizzle in jQuery 1.7.2 and jQuery 1.8.

One compiled selector function

The selector parser compiles a selector into one function containing functions for each part of the selector. This means that for any given selector (excluding positional (POS) selectors such as :first or :eq(3)), the possible set of elements need only be checked once. This is predominantly where the major speed boost and heightened stability comes from.

Additionally, Sizzle maintains a cache of the most recently compiled functions. The cache has a maximum size (which can be adjusted but has a default) so you don’t get out-of-memory errors when using a lot of different selectors.

Note: this does not have an effect on simple selectors (ID-only, TAG-only, CLASS-only) because Sizzle has had shortcuts for these that defer to getElementByID, getElementsByTagName, and getElementsByClassName whenever possible. That wasn’t changed (except for the addition of a shortcut for element-rooted ID selectors), and those are still the fastest selectors. Any other selector will go through querySelectorAll if available or run through the compiler.

querySelectorAll and matchesSelector

With this latest rewrite, the code paths to querySelectorAll and matchesSelector are even better than before and yield excellent performance.

Some people ask why we need Sizzle at all, since modern browsers have querySelectorAll and matchesSelector and accept a wide range of CSS3 selectors. The problem is that every browser (not just IE) has a few bugs in these methods. Selector engines must know beforehand what these bugs are and bypass these methods before they return incorrect results (though not all of them do). Sizzle now has this covered.

In addition, querySelectorAll and matchesSelector do not know how to process jQuery selector extensions such as [attr!=value]. Any time you use a selector extension, Sizzle needs to handle the selection natively.

Improved selector validation

Validating selectors is a tricky business. To be too strict can be annoying, but to be too flexible can produce unexpected results. In the past, Sizzle has been both of these at different times for several use cases. The most recent changes are geared towards adhering to the W3C selectors specification as much as possible, but also allowing some things that the spec does not (such as having complex selectors within a :not() pseudo).

Specifically, we are matching all whitespace characters where necessary, including line feed, tab, carriage return, and form feed(http://www.w3.org/TR/selectors/#whitespace), validating identifiers and operators in attributes selectors (http://www.w3.org/TR/selectors/#attribute-selectors), and providing a character encoding that matches the spec (http://www.w3.org/TR/css3-syntax/#characters).

Combinators (space, ~, >, +)

Combinators can get very complicated, but the new strategy handles these with great poise. In the release of jQuery 1.8 beta (and at the jQuery conference this year), I claimed that Sizzle had improved support for combinators. While the accuracy was improved, I spoke too soon and, fortunately, this was pointed out by someone I only know as Yaffle on github. Apparently, for very large and deep documents, the original revisions were checking so many elements that it was causing a stack overflow for selectors that had several combinators. For each combinator, the number of elements checked went up exponentially in order to maintain possible matches. This was bad. Sizzle now takes care of that issue and gets very satisfying results.

Extensibility

Although most of the old API for Sizzle was not changed in this rewrite (except for the removal of the now unnecessary Sizzle.filter from the private API), there are a couple changes that make Sizzle even more extensible. The most common way to extend Sizzle is to add custom pseudo selectors. Now with the parser compiling a function of functions, you can receive more information when creating your custom selector. For example, within Sizzle, the implementation for the :not pseudo selector is very similar to

// Using the createPseudo function tells the compiler
//   to pass the pseudo argument, context, and whether the current context is xml
//   to the function passed to createPseudo and trusts
//   that a function to be used for filtering will be returned.
// Note: the use of createPseudo is only necessary for creating custom
//   pseudo selectors with arguments.
Sizzle.selectors.pseudos.not =
    Sizzle.selectors.createPseudo(function( selector, context, isXml ) {
        var matcher = Sizzle.compile( selector, context, isXml );
	return function( elem ) {
		return !matcher( elem );
	};
    });

This is the only breaking change in the public API given the new parser, but I think creating custom pseudos with arguments is now much cleaner. For more information, please refer to the Sizzle docs.

Perhaps some of you are thinking that it could be nice to pre-compile your own selectors. Well, you can. Sizzle.compile is exposed so you can cache your selectors before they get used. While compiling is still very fast without caching, you can make sure that step is skipped before the selection is ever run. Call compile with your selector and context

Sizzle.compile(“my>long>complicated:selector(poof)”, document);

and it’s added to the cache. You can even increase/decrease the size of the cache by setting Sizzle.selectors.cacheLength.

Note: The majority of users do not need to use the compiler as Sizzle will maintain a cache of recently compiled selectors. Overriding Sizzle.compile will have no effect on Sizzle as it maintains an internal reference to this method.

Get the code!

The code is now available in the git versions of jQuery and Sizzle. Expect jQuery 1.8 to be released sometime this month. Issues specifically for Sizzle can be filed on GitHub and, as always, any issues related to jQuery as a whole can be filed on our bug tracker. Try it out for yourself and let us know if you run into any problems. Enjoy and Happy Independence Day!

12 thoughts on “The New Sizzle

  1. Stifu on said:

    Am I the only one who has troubles reading these graphs on jsperf.com? The bars are tiny, and the colors numerous and close to each other. It took me a while to figure out the 2 purples were slightly different.

    That said, congratulations on the improvements.

  2. @Stifu: yea, browserscope can be hard to read when there are more than a few tests running. Some of the perf tests in this post are easier to read than others, but running the tests might shed some light on the results.

  3. Alexander Makarov on said:

    According to the test old sizzle performed better on “div”, “.unitTest”, “.test div:first-child” in latest stable Opera.

  4. @Alexander: Ok. First of all, tag-only(div) and class-only(unitTest) were not affected by this rewrite. Secondly, it depends on which test you are looking at. If you are looking at a perf test that does not have querySelectorAll nullified, the difference is negligible in Opera (and 1.8 now checks for a bug in Opera’s QSA that is not checked in 1.7.2). If you are looking at http://jsperf.com/sizzle-1-7-2-vs-1-8/4, bypassing querySelectorAll shows an improvement in all selectors in Opera 12, emphasizing the improved performance of the parser.

  5. Alexander Makarov on said:

    I was looking at http://jsperf.com/sizzle-1-7-2-vs-1-8. You’re right that difference isn’t that big where performance is worse so it can be anything from extra bugfixes to simple measurment errors. Revision 4 performs much better.

    What about these “CSS 3 Selectors tests”? These squares are all black and white for me. No green or red. Circles are OK.

  6. @Alexander: There are a few css3 selectors that Sizzle does not currently support for a few different reasons. These selectors are being considered for addition in 1.9 and that issue is being discussed here: https://github.com/jquery/sizzle/issues/44. If there is an issue with a currently supported css3 selector, we greatly appreciate creating a new GitHub issue with a test case that can reproduce the bug. Thanks!

  7. Alexander Makarov on said:

    Thanks for explaining me all these things and for the new and better selector engine. If I’ll find any bugs I’ll post these straight to github issues.

  8. Raul on said:

    Hello, I don’t know where to inform about this, but your RSS feed does not work anymore!

  9. Mark Jackson on said:

    Thanks for the update, I really enjoy having such a wonderful package to work with.

  10. @mathieu: It is still true that any selector extension will not be supported by querySelectorAll. However, I wouldn’t say you should avoid using them. There are certain use-cases that are more performance-sensitive, such as event delegation on hundreds of table cells. It is only for these more sensitive cases that I would start being concerned about excluding selector extensions from the selector. Most use cases do not require this fine-tuning.