  • Kevin Jump 2310 posts 14695 karma points MVP 7x c-trib
    Apr 26, 2017 @ 13:58
    Kevin Jump
    0

    OK, so this is driving me slightly bonkers.

    I have a site where I need to group some items into groups based on their first letter.

    e.g. [a-e], [f-j], [k-o], etc.

    To be honest there isn't a massive amount of things, so I could do this with XPath / LINQ, but ... let's do it properly with Examine.

    I've added an OnNodeGathering event to add a letter key to the index.

    if (e.Fields.ContainsKey("title")) { 
       e.Fields["letter"] = e.Fields["title"].Trim()
               .Substring(0, 1).ToLower(CultureInfo.InvariantCulture); 
    }
    

    I can see from the index that this is working fine.

    and my search is basically...

    { SearchIndexType: , LuceneQuery: +(+nodeTypeAlias:mydoctype +letter:{a TO e}) }
    

    This works for everything except when the letter is a.

    I've even simplified the search for testing to remove the range, and in the back office +letter:a doesn't return any entries, while +letter:m, for example, does.

    What makes a so special? It's driving me slightly bonkers.

    Changing the item that starts with a so that it starts with b makes it show up in the a-e section; it just doesn't show up when it starts with a.

  • Marc Goodson 2141 posts 14344 karma points MVP 8x c-trib
    Apr 28, 2017 @ 06:25
    Marc Goodson
    1

    Hi Kevin

    Don't the curly brackets in the Lucene syntax cause an exclusive range search, e.g. {a TO e} means everything between a and e (not accident and emergency) but NOT actually anything beginning with a or e?

    aha:

    Range Queries allow one to match documents whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically.

    moddate:[20020101 TO 20030101] This will find documents whose moddate fields have values between 20020101 and 20030101, inclusive. Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:

    title:{Aida TO Carmen} This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen.

    Inclusive range queries are denoted by square brackets. Exclusive range queries are denoted by curly brackets.

    regards

    Marc
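
    As a rough illustration of the inclusive/exclusive distinction quoted above, here is a minimal sketch in plain C# with no Lucene dependency. `RangeDemo` and `InRange` are hypothetical names, and ordinal string comparison is only an approximation of Lucene's lexicographic term ordering.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class RangeDemo
    {
        // Returns the terms that fall between lower and upper, compared ordinally.
        // inclusive = true mimics [lower TO upper]; false mimics {lower TO upper}.
        public static List<string> InRange(
            IEnumerable<string> terms, string lower, string upper, bool inclusive)
        {
            return terms.Where(t =>
            {
                int lo = string.CompareOrdinal(t, lower);
                int hi = string.CompareOrdinal(t, upper);
                return inclusive ? (lo >= 0 && hi <= 0) : (lo > 0 && hi < 0);
            }).ToList();
        }
    }
    ```

    With single-letter terms a to f, the exclusive form keeps only b, c and d, while the inclusive form also keeps a and e.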

  • Kevin Jump 2310 posts 14695 karma points MVP 7x c-trib
    Apr 28, 2017 @ 07:15
    Kevin Jump
    0

    I think I tried both but I will double check, but +letter:a wasn't working either which was driving me a bit nuts...

  • Kevin Jump 2310 posts 14695 karma points MVP 7x c-trib
    Apr 28, 2017 @ 08:13
    Kevin Jump
    0

    yeah, but still no a :(

    +letter:[a TO e] is right compared to +letter:{a TO e} (it returns things beginning with e too, which the curly-bracket version doesn't)

    but it still doesn't work for a :(

    If I do a Lucene search from the dashboard (tried the same in code too):

    • +letter:a returns nothing,

    Change the title of the page to begin with a b (so firing the index, which changes the letter value)

    • +letter:b returns that page

    very perplexing :(

  • Marc Goodson 2141 posts 14344 karma points MVP 8x c-trib
    Apr 28, 2017 @ 10:04
    Marc Goodson
    100

    Is it because a is a stop word?

    How is it for I?

  • Kevin Jump 2310 posts 14695 karma points MVP 7x c-trib
    Apr 28, 2017 @ 11:09
    Kevin Jump
    0

    That does indeed seem to be the problem. Both a and i are blank.

    I can't immediately see how to turn off the stop words per query, and that's because I think they are removed by the analyzer when the query goes in.

    You can alter stop words at the analyzer level: https://our.umbraco.org/forum/developers/extending-umbraco/25600-Examine-case-insensitive-keyword-search#comment-95343

    But to be honest I don't want to remove a and i from the stop words site-wide.

    I thought putting the field in with NOT_ANALYZED might work..

    e.Document.Add(new Field("_letter", letter, Field.Store.YES, Field.Index.NOT_ANALYZED));
    

    but it doesn't :(

    So it's probably a case of putting a new index in just for the a-z, and that one not having stop words.

    K

    Update: I tell a lie, +letter:i works, but I still think it's stop words.
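
    A sketch of why a query for a could come back empty while i and m still match, assuming the searcher runs query text through Lucene's StandardAnalyzer with its default English stop-word set (that assumption is not confirmed in this thread, and the list should be checked against your Lucene.Net version). `StopWordDemo` and `SurvivingTerms` are hypothetical names; the code only mimics lowercasing plus stop-word removal and does not call Lucene.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class StopWordDemo
    {
        // Assumed default English stop words (contains "a" but not "i").
        public static readonly HashSet<string> EnglishStopWords = new HashSet<string>
        {
            "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
            "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
            "the", "their", "then", "there", "these", "they", "this", "to", "was",
            "will", "with"
        };

        // Lowercases each whitespace-separated token and drops the stop words,
        // returning whatever would be left to search on.
        public static List<string> SurvivingTerms(string queryText)
        {
            return queryText
                .Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries)
                .Select(t => t.ToLowerInvariant())
                .Where(t => !EnglishStopWords.Contains(t))
                .ToList();
        }
    }
    ```

    Under that assumption the term a is stripped before the search runs, so nothing is left to match, whereas i and m pass through untouched. That would be consistent with +letter:i working, and with NOT_ANALYZED at index time not helping if the query side is still analyzed.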

  • Kevin Jump 2310 posts 14695 karma points MVP 7x c-trib
    Apr 28, 2017 @ 11:21
    Kevin Jump
    1

    So - ASCII codes for the win.

    e.Fields["letter"] = ((int)e.Fields["title"].Trim().ToUpper(CultureInfo.InvariantCulture)[0]).ToString();
    

    then

    +letter:[65 TO 69] 
    

    works fine

    :)
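
    The same idea as a small self-contained sketch, for building the key and the matching range for any letter group. `LetterCodes`, `LetterKey` and `RangeQuery` are hypothetical helper names, not part of Examine, and the empty-title guard is an addition that is not in the snippet above.

    ```csharp
    using System;
    using System.Globalization;

    public static class LetterCodes
    {
        // Returns the character code of the first letter, upper-cased, as a string
        // (e.g. "65" for a title starting with a or A); null if the title is blank.
        public static string LetterKey(string title)
        {
            if (string.IsNullOrWhiteSpace(title)) return null;
            char first = title.Trim().ToUpper(CultureInfo.InvariantCulture)[0];
            return ((int)first).ToString(CultureInfo.InvariantCulture);
        }

        // Builds an inclusive range clause on the letter field for a letter group.
        public static string RangeQuery(char from, char to)
        {
            int lo = char.ToUpperInvariant(from);
            int hi = char.ToUpperInvariant(to);
            return string.Format(CultureInfo.InvariantCulture, "+letter:[{0} TO {1}]", lo, hi);
        }
    }
    ```

    One thing worth keeping in mind: if the range is compared as text rather than as numbers, this relies on the upper-case codes for A to Z (65 to 90) all being two digits. Lower-case codes (97 to 122) mix two and three digits, so as strings "100" would sort before "97".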

  • Marc Goodson 2141 posts 14344 karma points MVP 8x c-trib
    Apr 28, 2017 @ 12:30
    Marc Goodson
    0

    Nice! ASCII to the rescue...
