
  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 23, 2017 @ 07:07
    Mario Lopez
    0

    Erratic results with Lucene Spatial

    I have inherited two sites from two different clients and both seem to have the same issue. They implement a spatial search based on this article:

    https://www.leapinggorilla.com/Blog/Read/1010/spatial-search-in-lucenenet---worked-example

    The issue is that results are erratic. For instance, a result appears for a 10 km distance but not for 100 km, which is obviously wrong.

    Some investigation suggests that Lucene 3.x spatial is deprecated and doesn't work very well, but Umbraco doesn't accept Lucene >= 3.0.0.

    Does anyone know if this is a known issue? Any workarounds appreciated.

    thanks.

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 23, 2017 @ 08:21
    Ismail Mayat
    0

    Mario,

    I have a worked example based on this article for my Examine course. One thing: how are you indexing the items? Are they Umbraco nodes or in a database?

    Regards

    Ismail

  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 23, 2017 @ 23:35
    Mario Lopez
    0

    Hi Ismail,

    I have put all the code that indexes the content together. They are Umbraco nodes:

    public class AmIndexEventHandler : ApplicationEventHandler
    {
    
    protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
         BuildGeoSpatialSearchTiers(applicationContext);
    
         BaseIndexProvider indexer = ExamineManager.Instance.IndexProviderCollection["ExternalIndexer"];
         indexer.GatheringNodeData += Provider_GatheringExternalNodeData;
    }
    

    Tiers are built here:

        private void BuildGeoSpatialSearchTiers(ApplicationContext applicationContext)
        {
            List<CartesianTierPlotter> _ctps = new List<CartesianTierPlotter>();
            IProjector _projector = new SinusoidalProjector();
    
            CartesianTierPlotter ctp = new CartesianTierPlotter(0, _projector, CartesianTierPlotter.DefaltFieldPrefix);
    
            int highestTier = ctp.BestFit(100);
            int endTier = ctp.BestFit(1);
    
            for (int i = highestTier; i <= endTier; i++)
            {
                _ctps.Add(new CartesianTierPlotter(i, _projector, CartesianTierPlotter.DefaltFieldPrefix));
            }
    
            if (applicationContext.IsConfigured && applicationContext.DatabaseContext.IsDatabaseConfigured)
            {
               //Not sure why this is done here and not in the actual ApplicationStarted, but I don't think it will make a difference.
    
                var indexer = (LuceneIndexer)ExamineManager.Instance.IndexProviderCollection[SearchService.SearcherExternalIndex];
                indexer.DocumentWriting += new EventHandler<DocumentWritingEventArgs>(Indexer_DocumentWriting);
            }
        }
    

    And the indexing is happening here:

    private void Indexer_DocumentWriting(object sender, DocumentWritingEventArgs e)
        {
    
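            //NOTE: as posted, this is a new, empty list, so the tier loop below never runs.
            //It needs to be the same list that BuildGeoSpatialSearchTiers populates (e.g. a class-level field).
            List<CartesianTierPlotter> _ctps = new List<CartesianTierPlotter>();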
    
            if (e.Fields["nodeTypeAlias"] == Stockist.ModelTypeAlias)
            {
                var stockistContent = new UmbracoHelper(ContextHelpers.EnsureUmbracoContext()).TypedContent(e.Fields["id"]);
                if (stockistContent == null)
                    return;
                Stockist stockist = new Stockist(stockistContent); //Models builder model
    
                    if (stockist != null && stockist.AddressLookup != null)
                    {
                        var latitudeDecimal = stockist.AddressLookup.Latitude;
                        var longitudeDecimal = stockist.AddressLookup.Longitude;
                        double lat = Convert.ToDouble(latitudeDecimal);
                        double lng = Convert.ToDouble(longitudeDecimal);
    
                        //Add the longitude and latitude to the indexer
                        e.Document.Add(new Field("_lat", NumericUtils.DoubleToPrefixCoded(lat), Field.Store.YES, Field.Index.NOT_ANALYZED));
                        e.Document.Add(new Field("_long", NumericUtils.DoubleToPrefixCoded(lng), Field.Store.YES, Field.Index.NOT_ANALYZED));
    
                        //Loop through each of our tiers
                        for (int i = 0; i < _ctps.Count; i++)
                        {
                            CartesianTierPlotter ctp = _ctps[i];
    
                            var boxId = ctp.GetTierBoxId(lat, lng);
    
                            //Add the tier data to the indexer
                            e.Document.Add(new Field(ctp.GetTierFieldName(), NumericUtils.DoubleToPrefixCoded(boxId), Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
                        }
                 }
            }
        }
    

    It seems that's all the code involved in the indexing.

    Thanks.

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 24, 2017 @ 08:17
    Ismail Mayat
    0

    Mario,

    Are the locations that do not work newish locations? Were they added after the main index build?

    I found with my setup that if I added new nodes they would not appear; however, after an index rebuild they would appear.

    Also, in my implementation, for the values:

        int highestTier = ctp.BestFit(100);
        int endTier = ctp.BestFit(1);
    

    I have the values from the original post, so in my case:

        public const double KmsToMiles = 0.621371192;
        public const double MaxKms = 5000 * KmsToMiles;
        public const double MinKms = 1 * KmsToMiles;
    

    and

    _startTier = plotter.BestFit(Constants.MaxKms);
    _endTier = plotter.BestFit(Constants.MinKms);
    

    Try the index rebuild. If that doesn't work, try updating the start and end tier values, then restart and rebuild the index to see if that works.

    Regards

    Ismail
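
    The constants above mix kilometre names with mile values, which is easy to get wrong. A minimal sketch of keeping the units explicit, assuming (as those constants suggest) that the tier plotter's `BestFit` expects miles. The class and method names here are hypothetical, not part of Lucene.Net or Examine:

```csharp
using System;

// Hypothetical helper: keeps the km/miles conversion in one place so the
// unit passed to the spatial API is always explicit at the call site.
public static class DistanceUnits
{
    // Same factor as in the post above: 1 km is about 0.621371192 miles.
    public const double KmsToMiles = 0.621371192;

    // Convert a distance in kilometres to miles.
    public static double KmToMiles(double km)
    {
        return km * KmsToMiles;
    }

    // Convert a distance in miles back to kilometres.
    public static double MilesToKm(double miles)
    {
        return miles / KmsToMiles;
    }
}
```

    With this, a 100 km radius entered by a user would be passed on as `DistanceUnits.KmToMiles(100)`, and any distance that comes back in miles can be converted with `MilesToKm` before display.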

  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 25, 2017 @ 02:32
    Mario Lopez
    0

    Thanks Ismail,

    I modified the code so it uses the miles conversion. I guess there was a unit error there, as the values were supposed to be kms, not miles.

    Regardless of that change, I created a new node and it wasn't properly indexed: it appeared in a 10 km search but not in a 100 km one. There must definitely be something wrong with Lucene Spatial (?).

    Unfortunately, reindexing each time a new node is added is not an option, as this client has 17k nodes to index. I tried that locally and it took 6 hours.

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 25, 2017 @ 08:01
    Ismail Mayat
    0

    Mario,

    What happened after you rebuilt the index locally? Did it appear correctly? What about just republishing that one node?

    Regards

    Ismail

  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 29, 2017 @ 06:47
    Mario Lopez
    100

    So after building and rebuilding... it turned out that the issue was in the search and not in the indexing.

    So the search was doing this:

    TopDocs resultDocs = searcher.Search(masterQuery, null, 50);
    

    And the results are not returned ordered by distance; that is done afterwards with just:

    //distanceResults just a list of IPublishedContent
    distanceResults.OrderBy(x => x.Distance).ToList();
    

    So some of the results were left out.

    To fix the issue I just changed the doc limit to 500, but that can cause the same issue in the future. So the question now is whether there is a way of returning the top 50 results ordered by distance.
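
    A minimal sketch of why a smaller limit dropped nearby results, assuming the unsorted search returns its top hits by relevance score. A wider radius matches more documents, so a close one with a low score can fall outside the cut before the distance sort ever sees it. The `Hit` type and both method names are hypothetical and use plain LINQ rather than Lucene:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopNDemo
{
    // Hypothetical stand-in for a search hit: a relevance score plus a computed distance.
    public sealed class Hit
    {
        public string Name { get; set; }
        public double Score { get; set; }
        public double DistanceKm { get; set; }
    }

    // Mimics the original flow: keep the top `limit` hits by score, then sort those by distance.
    public static List<string> CutThenSort(IEnumerable<Hit> hits, int limit)
    {
        return hits.OrderByDescending(h => h.Score)
                   .Take(limit)
                   .OrderBy(h => h.DistanceKm)
                   .Select(h => h.Name)
                   .ToList();
    }

    // The desired flow: sort every matching hit by distance first, then keep the nearest `limit`.
    public static List<string> SortThenCut(IEnumerable<Hit> hits, int limit)
    {
        return hits.OrderBy(h => h.DistanceKm)
                   .Take(limit)
                   .Select(h => h.Name)
                   .ToList();
    }
}
```

    With three hypothetical hits A (score 0.9, 80 km), B (score 0.8, 60 km) and C (score 0.1, 5 km) and a limit of 2, `CutThenSort` returns B, A and never sees C, while `SortThenCut` returns C, B.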

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 29, 2017 @ 10:56
    Ismail Mayat
    0

    Mario,

    Distance is calculated, so you cannot sort on it until it's calculated.

    I am just looking at my code and it looks like I'm pulling back 100. I will have a play tomorrow on this, as I will have the same issue as you.

    Regards

    Ismail
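
    To illustrate the point that distance is a computed value rather than an indexed one: it has to be derived per hit from the query point and the stored `_lat` / `_long` values. A minimal sketch using the haversine formula, which is an assumption chosen for illustration and not necessarily the formula Lucene.Net Spatial uses internally. The class and method names are hypothetical:

```csharp
using System;

public static class GeoDistance
{
    // Mean Earth radius in kilometres (an approximation; the Earth is not a perfect sphere).
    private const double EarthRadiusKm = 6371.0;

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Great-circle distance in km between two lat/long points given in degrees.
    public static double HaversineKm(double lat1, double lng1, double lat2, double lng2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLng = ToRadians(lng2 - lng1);

        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLng / 2) * Math.Sin(dLng / 2);

        // Clamp to 1.0 to guard against floating-point overshoot before Asin.
        return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }
}
```

    Because this value depends on the search origin, it cannot be written into the index ahead of time; any sort on it has to happen at query time, after the value is computed for each candidate.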

  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 30, 2017 @ 00:04
    Mario Lopez
    0

    I found this article:

    So they create a sorting filter. This is adapted to my implementation:

     var distanceFilter = new LatLongDistanceFilter(boundingArea,  radius, Lat, Long,  "_lat", "_long");
    
     DistanceFieldComparatorSource dsort = new DistanceFieldComparatorSource(distanceFilter);
    
     //it seems that the 'geo_distance' field is created by the distance filter so there is no need to index it.
     Sort sort = new Sort(new SortField("geo_distance", dsort));
    
     //and the sort filter is passed to the query. These results are already sorted by distance.
     TopDocs resultDocs = searcher.Search(masterQuery, null, 100, sort);
    

    The way they do it in the example is a bit different:

    //Instead of creating a filter they create a distance query builder ('dq' is used below)
    var dq = new DistanceQueryBuilder(Lat, Long, Convert.ToDouble(radius), "_lat", "_long", CartesianTierPlotter.DefaltFieldPrefix, true);
    
    DistanceFieldComparatorSource dsort = new DistanceFieldComparatorSource(dq.getDistanceFilter());
    Sort sort = new Sort(new SortField("geo_distance", dsort));
    
    Query query = new MatchAllDocsQuery();
    
     // then they access the created filter from the query with getFilter()
    TopDocs hits = searcher.search(query, dq.getFilter(), 20, sort);
    

    Thanks for your help with this Ismail, you pointed me in the right direction :).

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 30, 2017 @ 07:56
    Ismail Mayat
    0

    Mario,

    So now that you are adding the sort when doing the search, did you get rid of the LINQ sort?

    Regards

    Ismail

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    May 30, 2017 @ 08:23
    Ismail Mayat
    0

    Mario,

    Excellent find. I have just applied this update to my Examine course spatial example; I now have a Lucene sort rather than a nasty LINQ sort on distance.

    Regards

    Ismail

  • Mario Lopez 168 posts 952 karma points MVP 3x c-trib
    May 30, 2017 @ 08:26
    Mario Lopez
    0

    Cool, yep, I removed the LINQ sort too.
