Selector Speeds


Note: Jack went ahead and fixed virtually everything mentioned in this post – great job!


We’ve been holding off on talking about the speed of the jQuery selectors for the new 1.1 release until our release was closer to being ready – however, it seems as if that process has already been expedited. So with that already out of the bag, let’s look at the selector speeds of jQuery.

In short: for jQuery 1.1 we worked really hard to make its selectors fast. In fact, according to all our tests, we were faster than any other selector library. While we were working on the 1.1 release, Dean Edwards’ cssQuery far out-performed any other selector library. It’s really comprehensive, and really fast.

Today, Jack Slocum announced his new DOMQuery selector library. In short: The bar has been raised. His library is very very fast. Quite possibly the fastest available today.

However, in the comparison between his library and ours, some mistakes were made (by both Jack and jQuery) that we’d like to clear up. (For reference, here’s the comparison suite that I used for my tests.)

jQuery completely supports all attribute selectors.
For example, [@foo], [@foo=bar], etc. The notable difference is that jQuery uses the XPath-style syntax in this situation. Since this was not accounted for in Jack’s test, it appeared as if we failed for all of the attribute selector tests.

Our “elem:empty” works just fine.
You can see in Jack’s test that all selectors (but DOMQuery) fail :empty – that’s more due to the fact that he compares the results against DOMQuery, which gets the result wrong. The specification states that something is empty if it doesn’t contain any child elements or text nodes. That doesn’t seem to be accounted for in this case.
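To make that reading of the specification concrete, here is a rough sketch of an emptiness check. It is not the code from jQuery or DOMQuery: `isEmpty` is a hypothetical helper, the nodes are plain objects that only mimic the DOM’s `childNodes`, `nodeType` and `nodeValue` properties, and it assumes that comment nodes do not count against emptiness.

```javascript
// Hypothetical helper: true when `node` has no element children and no
// non-empty text children. Comment nodes (nodeType 8) are ignored.
function isEmpty(node) {
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === 1) return false; // element child
    if (child.nodeType === 3 && child.nodeValue.length > 0) return false; // text child
  }
  return true;
}

// Plain-object stand-ins for DOM nodes, for illustration only.
var emptyDiv = { nodeType: 1, childNodes: [] };
var commentOnly = { nodeType: 1, childNodes: [{ nodeType: 8, nodeValue: "note" }] };
var withText = { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: " " }] };
var withChild = { nodeType: 1, childNodes: [emptyDiv] };
```

Under these assumptions, `emptyDiv` and `commentOnly` count as empty, while `withText` (even though it only holds whitespace) and `withChild` do not.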

[foo!=bar], :first, :last aren’t part of any specification.
…and yet they’re in the test suite. Incidentally, jQuery does implement :first and :last – but not [foo!=bar] (which appears to be only in cssQuery?). In all, it’s very strange to compare yourself against others on something that they’re not designed to do.

What does span:not(span.hl-code) match?
This is a strange gray area that I haven’t seen talked about anywhere, and the specification doesn’t help to clear it up at all. Should the resulting set be all spans that don’t have a class of hl-code – or nothing, since you’ve filtered out all the spans? For example:

// Finds nothing in both
span:not(span)
=> []

// Finds spans that don't have a class of 'foo', in both
span:not(.foo)
=> [ <span>, <span>, ... ]

// jQuery's interpretation of the combination:
$("span:not(span.foo)")
=> []

// DOMQuery's interpretation of the combination:
Ext.select("span:not(span.foo)")
=> [ <span>, <span>, ... ]

We’ll fully concede that we may be very wrong on this point – but I’m curious to hear what others have to say, and what their interpretations of the spec are.
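One way to model the two readings is as plain boolean predicates. This is only a sketch of the behaviour described above, not code from either library; the element objects and the helper names (`isSpan`, `hasFoo`, `separateFilters`, `wholeArgument`) are made up for illustration.

```javascript
// Stand-in "elements": plain objects with a tag name and a list of classes.
var elements = [
  { tag: "span", classes: ["foo"] },
  { tag: "span", classes: [] },
  { tag: "div",  classes: ["foo"] }
];

function isSpan(el) { return el.tag === "span"; }
function hasFoo(el) { return el.classes.indexOf("foo") !== -1; }

// Reading 1 (the jQuery result shown above): each part of the argument is
// negated on its own, so span:not(span.foo) behaves like
// span:not(span):not(.foo) and can never match.
function separateFilters(el) {
  return isSpan(el) && !isSpan(el) && !hasFoo(el);
}

// Reading 2 (the DOMQuery result shown above): the argument is negated as
// a whole, so span:not(span.foo) behaves like span:not(.foo).
function wholeArgument(el) {
  return isSpan(el) && !(isSpan(el) && hasFoo(el));
}

var reading1 = elements.filter(separateFilters);
var reading2 = elements.filter(wholeArgument);
```

With this sample data, `reading1` is empty and `reading2` holds only the span without the `foo` class.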

DOMQuery doesn’t account for duplicates.
Currently, doing Ext.select("div div") returns MORE elements than doing just Ext.select("div") – and doing Ext.select("div div div") returns yet another different set of elements, but still more than just doing Ext.select("div"). In fact, accounting for duplicates is a huge problem in JavaScript selector libraries – and currently, jQuery is the only one that gets it right.

A big point of this is that accounting for duplicates can be really expensive (computationally) – so the fact that DOMQuery doesn’t account for duplicates gives the appearance of a speed boost. For example:

// DOMQuery
Ext.select("div").elements.length
=> 246
Ext.select("div div").elements.length
=> 624
Ext.select("div div div").elements.length
=> 523

// jQuery
jQuery("div").length
=> 246
jQuery("div div").length
=> 243
jQuery("div div div").length
=> 239
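For anyone wondering how a descendant query can return more elements than exist, here is a small sketch. It is not DOMQuery’s actual algorithm, just one naive strategy that shows the effect; the `node` and `descendants` helpers and the four nested stand-in divs are assumptions made for the example.

```javascript
// Build a tiny stand-in tree of four nested divs: d1 > d2 > d3 > d4.
function node(tag, children) {
  return { tag: tag, children: children || [] };
}
var d4 = node("div");
var d3 = node("div", [d4]);
var d2 = node("div", [d3]);
var d1 = node("div", [d2]);

// Collect every descendant of `root` with the given tag, in document order.
function descendants(root, tag, out) {
  out = out || [];
  root.children.forEach(function (child) {
    if (child.tag === tag) out.push(child);
    descendants(child, tag, out);
  });
  return out;
}

// "div": every div in the tree.
var allDivs = [d1].concat(descendants(d1, "div"));

// Naive "div div": run the descendant step once per matched ancestor and
// concatenate. Deeply nested divs are reached from several ancestors.
var naive = [];
allDivs.forEach(function (div) {
  naive = naive.concat(descendants(div, "div"));
});

// The same query with duplicates removed by object identity.
var unique = naive.filter(function (el, i) {
  return naive.indexOf(el) === i;
});
```

Here "div" finds 4 elements, the naive "div div" returns 6, and the de-duplicated result has 3. The `indexOf` pass is quadratic, which is exactly the kind of cost that makes skipping it look like a speed boost.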

DOMQuery doesn’t support multiple filters: elem.foo[foo=bar] or elem.foo.bar
Until this is implemented, a comparison with any other library simply isn’t fair. Building a library that’s fully capable of handling aspects like that (see: cssQuery, jQuery) comes at a great cost. (Whether it be in code size or speed cost.)
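To give a feel for what "multiple filters" involves, the sketch below parses a small subset of compound selectors (a tag name followed by any mix of .class and [attr=value] filters) and requires every filter to match. It is an illustration only, not how jQuery or cssQuery are implemented; `parseCompound`, `matches` and the element shape are hypothetical.

```javascript
// Hypothetical parser for a small subset: a tag name, then any mix of
// .class and [attr=value] filters, e.g. "div.foo.bar" or "div.foo[rel=x]".
function parseCompound(selector) {
  var parsed = { tag: null, classes: [], attrs: {} };
  var tagMatch = /^[a-zA-Z][\w-]*/.exec(selector);
  if (tagMatch) parsed.tag = tagMatch[0].toLowerCase();

  var filter = /\.([\w-]+)|\[([\w-]+)=([^\]]*)\]/g;
  var m;
  while ((m = filter.exec(selector)) !== null) {
    if (m[1]) parsed.classes.push(m[1]);
    else parsed.attrs[m[2]] = m[3];
  }
  return parsed;
}

// An element matches only if every filter matches (filters combine as AND).
function matches(el, selector) {
  var p = parseCompound(selector);
  if (p.tag && el.tag !== p.tag) return false;
  for (var i = 0; i < p.classes.length; i++) {
    if (el.classes.indexOf(p.classes[i]) === -1) return false;
  }
  for (var name in p.attrs) {
    if (el.attrs[name] !== p.attrs[name]) return false;
  }
  return true;
}

// Stand-in element for illustration.
var el = { tag: "div", classes: ["foo", "bar"], attrs: { rel: "x" } };
```

Even this toy version needs a parsing step and a loop per filter type for every candidate element, which hints at where the extra code size and time go.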

DOMQuery’s #id selectors don’t check for context
You’ll notice if you try to do a query like:

Ext.select("div #badid").elements
=> [div#badid]

That you’ll get an element by “badid” — even if that element isn’t actually inside of a div. Since no check for validity is actually made in the DOMQuery code, it’s blazing fast – and very wrong.

I should mention that until 1.1, jQuery was wrong on this point too, so it’s an easy issue to overlook.
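The missing step is a walk up the ancestor chain after the id lookup. The sketch below shows the idea on plain objects; it is not the fix that went into jQuery 1.1, and `hasAncestor`, `selectDescendantById` and the `byId` lookup table are hypothetical names standing in for the real DOM.

```javascript
// Hypothetical helper: does `el` have an ancestor with the given tag name?
function hasAncestor(el, tag) {
  for (var p = el.parentNode; p; p = p.parentNode) {
    if (p.tag === tag) return true;
  }
  return false;
}

// "div #someid": look the element up by id, then verify the context.
function selectDescendantById(index, ancestorTag, id) {
  var el = index[id];
  return el && hasAncestor(el, ancestorTag) ? [el] : [];
}

// Stand-in tree: body > div > p#good, and body > p#badid.
var body = { tag: "body", parentNode: null };
var div = { tag: "div", parentNode: body };
var good = { tag: "p", id: "good", parentNode: div };
var bad = { tag: "p", id: "badid", parentNode: body };
var byId = { good: good, badid: bad };
```

With the check in place, looking up `good` inside a div returns the element, while `badid` (which sits directly under body) returns an empty result. The price is one loop per lookup, proportional to the depth of the element.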

Where’d the root element go?
You’ll find that searches for “html” and “*” in DOMQuery are strangely missing one obvious thing: the HTML element. It seems kind of weird to exclude the root DOM element from all queries, especially since this is perfectly valid: “html > body *”.

…and to make it fair – here’s one for us :-)

Our :nth-child(even/odd) is flawed.
Currently it seems to only select one element (!?). I made a ticket for this and it should be resolved for this Sunday’s 1.1 release.
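For reference, the intended behaviour is easy to state: count each element’s position among its siblings starting at 1, then keep the even or odd positions. Here is a sketch on plain objects; it is not the patch for the ticket, and `childIndex`, `nthChildParity` and the five-item stand-in list are assumptions made for the example (every child here is treated as an element).

```javascript
// Hypothetical helper: 1-based position of `el` among its parent's children.
function childIndex(el) {
  return el.parent.children.indexOf(el) + 1;
}

// Keep the elements whose position is even or odd, as in :nth-child(even/odd).
function nthChildParity(elements, parity) {
  var wanted = parity === "even" ? 0 : 1;
  return elements.filter(function (el) {
    return childIndex(el) % 2 === wanted;
  });
}

// Stand-in parent with five children named a through e.
var list = { children: [] };
["a", "b", "c", "d", "e"].forEach(function (name) {
  list.children.push({ name: name, parent: list });
});

function names(els) {
  return els.map(function (el) { return el.name; }).join(",");
}

var evens = names(nthChildParity(list.children, "even"));
var odds = names(nthChildParity(list.children, "odd"));
```

On this list, `evens` is "b,d" and `odds` is "a,c,e".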

In all, it’s great to see the speed leaps that’ve been made by DOMQuery. Selector speed is one area where competition is truly warranted; every time a new speed increase is made, everyone wins (users, developers – everyone).

In fact, looking over his code, I already have some more ideas on how to increase the speed of jQuery!

So, to Jack: Thanks for helping to keep us on our toes – we’re looking forward to seeing your library improve, and everyone win.

33 thoughts on “Selector Speeds”

  1. Blair Mitchelmore on said:

    I’ve always felt that jQuery’s interpretation of the :not feature was weird though I admit I haven’t read the specs. It seems to me that it would filter out elements which match the entire selector provided not every element which matches any aspect of the selector.

    My two cents.

  2. Nate Cavanaugh on said:

    Spot on, John!

    Thanks a ton for posting this, as it’s always great to see comparisons between jQuery’s selector capabilities, and others.

    I have to say, though, that I agree with domQuery’s use of :not().
    I think that $(“span:not(span.foo)”) should return all of the spans that are not of class foo.
    The reason why, IMHO, is that :not is a filtering method, and it should be intelligent to know if you’re already selecting all spans, you don’t want to ignore all spans.

    Just my opinion.

    Also, IMHO, I prefer to use the regular attribute selector syntax, as opposed to the XPATH syntax, and the reason is because of standards. However, it really is a bit of a non-issue.

    As for domQuery in general, I am always a little disappointed by the plethora of options available, just because it can end up causing more problems as far as simplicity, but at the same time, it also offers more options to developers, and promotes a friendly competition, whereby the best of everyone comes out.

    Overall, though, jQuery is the best Javascript library around, not just the best Selector library. jQuery does so much more, and it truly is a full fledged toolkit, that finally bridges the gap between the Style/Formatting Layer (CSS), and the Behavior layer (Javascript).

    I am really excited for this weekend :)

  3. thx, a lot for your hard work. i have tested the new version 1.1b with a script of mine, that has to select a lot of td´s in a huge table, when hovering over one td (to colorize the “hovered” column, not just the row). the speed up between both versions is amazing.

    i´m looking forward to the “final” version 1.1. and of course jquery is the best….

  4. I always enjoy things like this. Quality and speed evaluations can be exciting when it comes to new releases.

    In regards to the :not statement, I want to side with jQuery. If you’ve already specified that you are filtering spans then you shouldn’t include spans in your query. It doesn’t make sense to say I want to eat all the apples in the basket that are not apples and not red, you’ll go hungry.

  5. On second thought, the logic would be more like I want to eat all the apples in the basket that are apples and red. It’s more of an unneeded redundancy which still follows that you shouldn’t include spans in your query when you’ve already stated that you are filtering spans.

  6. Yehuda on said:

    Kyle:

    jQuery’s implementation is saying: I want to eat all the apples in the basket that are not apples or not red (nonsensical)

    domQuery’s implementation is saying: I want to eat all the apples in the basket that are not red apples (more sensical)

    You are correct that it would make more sense to say: I want to eat all the apples in the basket that are not red.

    The bottom line is that to most people “div.class” is the same thing as “red apple.” As a result they *expect* domQuery’s version.

  7. Nate Cavanaugh on said:

    @Yehuda: Exactly! I believe it should be geared more towards what people expect, ESPECIALLY in cases of spec ambiguity.

    I would also just like to say that domQuery vs. jQuery or domQuery vs. MooTools is pretty misleading.

    Both jQuery and MooTools are lightweight libraries, that are extremely powerful, yet small.

    domQuery vs. Prototype’s or MochiKit’s $$ is more appropriate, because even though Jack says his domQuery script is 6k, it does not take into account the insane amount of code you’ll have to reference it inside of.

    You’ll need at the bare-minimum: These files: yahoo.js (4.73kb), yahoo-dom-event.js(27.3kb), yui-ext-core.js(53.7kb), AND the domQuery file (6kb).

    That’s a MINIMUM of 91.7kb just to get Selector expressions.

    Which is why, IMO, it makes more sense when compared to those “other” frameworks.

    So not only does jQuery have a speed and functionality advantage over domQuery, its footprint is 6 times smaller.

    All credit in the world to Jack for giving the YUI folks some really fast and really expressive CSS Selector expressions, but for him to compare it to jQuery is a tad bit misleading…

  8. I have to go with DOMQuery on the :not() issue. CSS selectors combine to form ‘AND’ conditions. So, span:not(span.foo) is in JS syntax:

    if (tagname == 'span' && !(tagname == 'span' && classname == 'foo'))

    which can be simplified to if (tagname == 'span' && classname != 'foo').

    If you wanted to filter out both all spans and all elements with a class, I would imagine something like this (not sure if this is legal CSS):

    span:not(span, .foo)

    Good job on the rest, and yes, jQuery does look like the most correct library :).

  9. I don’t understand why ‘span’ should even be in the parentheses at all. If you’re already selecting spans, it would be redundant to specify that the class is under a span.

    As far as the apple comparison, I don’t think any of the above examples are right. I would read them more as this:

    $(“span:not(.foo)”) : all apples that are not red
    $(“span:not(span.foo)”) : all apples that are apples and not red (redundant)

  10. Jörn Zaefferer on said:

    Steven, I guess the last example would be:

    span:not(span):not(.foo)

    Obviously that doesn’t make much sense.

    About the attribute selectors (Nate commented on them): jQuery’s square bracket syntax allows more than only attributes. Consider this example:

    input[@name=Peter]

    It says “select all input elements with a name attribute of value ‘Peter’”. Now check this:

    form[input]

    It selects all forms that contain input elements. In other words: The [selector] syntax says “has a…”. And you can even nest these:

    form[input[@type=checkbox]]

    Selecting all forms that contain input elements of type checkbox.

    That wouldn’t be possible with CSS’ plain attribute selectors (without the @).

  11. Romulo on said:

    span:not(span.foo) isn’t legal. According to CSS3 specification:

    E:not(s): an E element that does not match simple selector s

    And the definition of simple selector:

    A simple selector is either a type selector [p, a, tr etc], universal selector [*], attribute selector [[foo]], class selector[.foo], ID selector [#foo], content selector [?], or pseudo-class [:hover]. One pseudo-element [::first-line] may be appended to the last sequence of simple selectors.

    ( Text inside square brackets are examples made by me. The last line refers to a collection of simple selectors, not just one as :not() demands )

    So you can see that span:not(span.foo) is invalid, but span:not(span), span:not(.foo) and span:not(span):not(.foo) are all OK.

  12. Jack Slocum on said:

    John, thank you for the response. I have made a response on my blog.

    http://www.jackslocum.com/blog/2007/01/12/domquery-in-response-to-jquerys-response/

    @Nate C
    You are way off. As I noted on the end of my original post, the only thing DomQuery needs is a getStyle and Template function. Both are used in only 1 spot and easily swapped to put your own in. Those 2 combined equal under 2k, which would bring the total to 8kb. Do your research before making such a definitive statement.

  13. Greg Lewis on said:

    John the Propagandist, you got schooled in Jack’s response. Does it hurt that he is smarter and has prettier UIs? Oh, and Nate, you’re an idiot.

  14. Wow that was a classy reply there Greg. You scored some big points for Jack with that gibberish. BTW, did you call out Dean Edwards as well or is this a John-specific issue that you have? Dean wasn’t too happy either so I’m wondering what your motivation is other than trolling.

  15. Tommy Maintz on said:

    @Greg:

    This is what we don’t want to see. People flaming each other.
    I am a big fan of Jack’s work and his new selector class has set the standards a lot higher. This doesn’t mean jQuery isn’t a really good library.

    I do hope, however, that jQuery learns from Jack’s class that their speed can be boosted a lot by some adjustments to their code structure.

  16. ReyBango on said:

    @Tommy: Thanks for your comments. Greg’s posting was completely childish and obviously an attempt to start something up for no good reason. We’re definitely going to learn from Jack’s work and he’s done some amazing things with EXT. Kudos to you for your considerate reply and to Jack for his great work.

    @Kevin: Thanks man. We’re definitely passionate about this project and will continue to improve every aspect of the library.

  17. In retrospect, I disagree with myself now and would read it as this:

    span:not(span.foo). spans INSIDE a span with the class of ‘foo’

  18. Actually, maybe not. that would be “span > span:not”, right? I think I just confused myself.

  19. Greg Lewis on said:

    Alas, it was a farce, a satire on this category of situations. Relax, untwist panties, and go back about your business. Nothing to see here folks. My apologies.

  20. @ Jason
    The argument to :not() is applied to its subject. A subject is an element that satisfies a selector and/or combinator[1].

    So, span:not(.foo) is all spans, except those that have ‘foo’ as class attr. Once the ‘selection engine’ matches an element that satisfies previous simple selectors (in this case, all elements of type ‘span’), then the engine tries to match the pseudo-classes (in this case, :not()).

    Example:

    a[href].external

    A selection engine would match all elements of type ‘a’, then, to each of them, the engine would check if they have an attr named ‘href’. In this point, the subject is an element of type ‘a’, because it was matched from previous selector. Next, the engine will try to find on those ‘a’ that have an attr ‘href’ if they also are class of ‘external’. The subject here are the elements matching a[href].

    If the selector was “span:not(span.foo)”, and if that was a valid selector, then the inner span would reference the outer span itself.

    Let’s suppose you have this selector:

    :not(.important)

    This is equivalent to:

    *|*:not(.important)

    And if I have a DOM like this:

    [div]
    [div class="title"]Lorem ipsum[/div]
    [div class="text-body"]
    Dolor sit [span class="important"]amet[/span]. Consectuor …
    [/div]
    [/div]

    Then imagine that a selection engine would traverse this DOM this way:

    Element: [div]
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [div class="title"]
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [div class="text-body"]
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [span class="important"]
    Matches *|*? Yes
    Matches .important? Yes
    :not(Yes) = No
    Yes and No: don’t match!

    Now imagine that the selector was the one discussed:

    span:not(span.foo)

    and suppose an element [span class="foo"]

    The selection engine would come to it and check:

    Element: [span class="foo"]
    Match span? Yes
    Match span? Yes (from span.foo)
    Match .foo? Yes (from span.foo)
    :not(Yes and Yes) = No
    Yes and No: don’t match

    Concerning whether span:not(span.foo) makes sense or not, and according to Steven, the inner span is redundant and will match all apples that are not red because (A: Apple, R: Red):

    A & !(A & R) [ de morgan ]
    A & (!A | !R) [ distributive property ]
    ( A & !A ) | ( A & !R ) [ complement law ]
    ( false ) | ( A & !R ) [ false or something -> something ]
    ( A & !R )

    So, all Apples that are not red. Thinking a bit more about it, it is perfectly logical, because when the selection engine evaluates :not(span.foo), it already has a span element. The only trouble is to determine if it is of class ‘foo’ or not.

    http://www.w3.org/TR/2005/WD-css3-selectors-20051215/#subject

  21. (…)

    And if I have a DOM like this:

    [div]
    [div class=”title”]Lorem ipsum[/div]
    [div class=”text-body”]
    Dolor sit [span class=”important”]amet[/span]. Consectuor …
    [/div]
    [/div]

    Then imagine that a selection engine would traverse this DOM this way:

    Element: [div].
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [div class=”title”]
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [div class=”text-body”]
    Matches *|*? Yes
    Matches .important? No
    :not(No) = Yes
    Yes and Yes: match

    Element: [span class=”important”]
    Matches *|*? Yes
    Matches .important? Yes
    :not(Yes) = No
    Yes and No: don’t match!

    Now imagine that the selector was the one discussed:

    span:not(span.foo)

    and suppose an element [span class=”foo”]

    The selection engine would come to it and check:

    Element: [span class=”foo”]
    Match span? Yes
    Match span? Yes (from span.foo)
    Match .foo? Yes (from span.foo)
    :not(Yes and Yes) = No
    Yes and No: don’t match

    (…)

  22. Andrea on said:

    I have been – happily – using jQuery since its early days and the only thing I’ve always had a problem with is the ‘not’. For a long time I thought that this was a bug in jQuery (also because I have the impression – correct me if I’m wrong – that early implementations worked like domQuery), and I only learned after a long time that jQuery was designed to work that way.
    To me, “span.foo” should *always* match all and only the span elements with a class foo, which is domQuery’s interpretation.
    In jQuery, this is true in queries like $(“span.foo”) but not in .not(“span.foo”). There, “span.foo” matches all the span elements PLUS all elements with a class foo.
    It’s this way of reading a CSS selector that makes it counterintuitive to me.

  23. Pingback: Interaction Design Blog » Blog Archive » DomQuery is extremely fast

  24. @Nate Cavanaugh
    comparing js library by footprint trying to say mine is smaller, so, it should be better is ridiculous and is time to say it clearly.
    if you are really concerned at those 2kb difference, compare the same sub functionality of a library and compare the compressed version, as code comments _are_ a feature.
    after doing a serious comparison, probably you will agree that footprint difference of well written code is not good reason enough to choose a library.

    thanks jack and John for the great work

  25. Micon Frink on said:

    @Marion:
    It’s not about nit-picking!!! It’s about discussing the ramifications of such comparisons… The beauty of opensource development is we learn from each other, then we grow our projects better. In the process we discuss things and learn how to attack the same problems in different ways. Jack came up with some obvious speed tweaks that I’m anxious to explore. I hope that there are verbose notes including documentation on coding choices since he speaks about doing a lot of tests to arrive at his chosen implementation. I’d love to see a report on what he tested…

    $(“apple:not(apple.red)”):
    I think that, while it is redundant, jQuery should treat apple.red like all other selectors and not fire two separate filters unless a comma is used. It seems most readers have asked for the feature change, so I’m looking forward to an announcement that we will see it in the next release.

    XPath @ inside of CSS selectors:
    This is probably a bad idea!!! Implementing the limited CSS spec doesn’t limit our coding, it just forces the use of XPath for more complicated queries. This was the main purpose of the XPath spec in the first place! If we really want to get crazy we can use jQuery filter methods to do complicated selection. I bet the functional implementation is faster than complicated selection strings since parsing is not involved and you have more control over implementing the iteration logic. The jQuery team has to decide if standards compliance is important or whether supporting more ways to do things is worth bending the specs. Since I just read a dissenting note from a jQuery developer about DOMQuery extending beyond the spec I can only guess that the pot calling the kettle black is in need of a good scrubbing.

    A note on speed tweaks:
    I think changing doublequoted strings to singlequoted strings may have a small speed increase. In PHP and many other languages doublequotes is parsed to look for in-string variables and escape characters. While JavaScript does not support in-string variables it does also parse the escaped characters in a double quoted string whereas a single quoted string only uses the \’ as the sole value it’s parsing. I don’t know whether we are talking nano-seconds or something more significant but this could be a speed tweak to look into.

    Keep up the good work, guys! Your enthusiasm is great. If I had more time I’d love to dive into the code with you. I’m too busy to commit now but if I find any notable tweaks I’ll post them to the devlist.

  26. Sebastian Redl on said:

    > A big point of this is that accounting for duplicates can be really expensive (computationally)

    It seems to me that one big reason for this is the naive implementation of merge() in jQuery. Correct me if I’m wrong, but a specialized merge() for DOM nodes (as is required by the selector engine) can be implemented in linear time instead of quadratic, as the current one is.

    The reason is simple: object identity. You’re not looking for objects that have the same value. You’re looking for the same object being referenced more than once.
    Since the object is the same, you can use a simple marker to do the merge. This would look somewhat like this:

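    function mergeNodes(left, right)
    {
        var r = [];
        // jQuery.each passes (index, element) to the callback.
        var fn = function(i, e) {
            // Identity check: keep each node only the first time it is seen.
            if(!e.marked) {
                e.marked = true;
                r.push(e);
            }
        };
        jQuery.each(left, fn);
        jQuery.each(right, fn);
        // Clean up the temporary markers.
        jQuery.each(r, function(i, e) { e.marked = false; });
        return r;
    }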

    Complexity is O(n+m) instead of O(n*m), with n and m being the lengths of the two arrays, obviously.

    Also, this solution also filters out duplicates in the source. You could therefore write this:

    function removeDupes(ar)
    {
        var r = [];
        // jQuery.each passes (index, element) to the callback.
        jQuery.each(ar, function(i, e) {
            if(!e.marked) {
                e.marked = true;
                r.push(e);
            }
        });
        // Clean up the temporary markers.
        jQuery.each(r, function(i, e) { e.marked = false; });
        return r;
    }

    and then, while the expression engine runs, just naively append all results to an array, and only clean it up at the very end. This might remove repeated runs over the same elements and thus improve performance even more.

    I’ll do some testing.

  27. Pingback: DomQuery - A lightweight CSS Selector / Basic XPath implementation