
  • Dan 1211 posts 3548 karma points admin c-trib
    7 days ago
    Dan
    0

    Auto index Examine fields from custom data source, using Lucene spatial search

    Hi,

    I'm implementing a 'nearest neighbour' facility using Lucene spatial search. I have a custom database table containing properties (physical properties such as hotels, not Umbraco properties) with name, description, latitude and longitude fields. These need to be injected into the Examine index along with the Cartesian tier fields from the Lucene spatial utility. Only data from the custom table needs to be indexed; I'm not mixing in any Umbraco content. The code that does the indexing is as follows:

    using System.Collections.Generic;
    using Examine;
    using Examine.LuceneEngine;
    using Lucene.Net.Spatial.Tier.Projectors;
    using Lucene.Net.Util;
    using Umbraco.Core;
    
    public class PropertyIndexDataService : ISimpleDataService
    {
        public const double KmsToMiles = 0.621371192;
    
        //BestFit() expects miles, so the search range (5000 km down to 1 km) is converted here
        public const double MaxMiles = 5000 * KmsToMiles;
        public const double MinMiles = 1 * KmsToMiles;
    
        private readonly List<CartesianTierPlotter> _ctps = new List<CartesianTierPlotter>();
        private readonly IProjector _projector = new SinusoidalProjector();
    
        public PropertyIndexDataService() { }
    
        public IEnumerable<SimpleDataSet> GetAllData(string indexType)
        {
            CartesianTierPlotter ctp = new CartesianTierPlotter(0, _projector, CartesianTierPlotter.DefaltFieldPrefix);
    
            //The starting tier (the largest grid square) calculated by providing the furthest distance in miles that we want to search
            int startTier = ctp.BestFit(MaxMiles);
    
            //The last tier (the smallest grid square) calculated by providing the closest distance in miles that we want to search
            int endTier = ctp.BestFit(MinMiles);
    
            //Reset the plotters so that a second rebuild on the same instance does not add each tier field twice
            _ctps.Clear();
            for (int i = startTier; i <= endTier; i++)
            {
                _ctps.Add(new CartesianTierPlotter(i, _projector, CartesianTierPlotter.DefaltFieldPrefix));
            }
    
            List<SimpleDataSet> data = new List<SimpleDataSet>();
    
            var db = ApplicationContext.Current.DatabaseContext.Database;
    
            List<Property> properties = db.Fetch<Property>("SELECT * FROM Property");
    
            foreach (Property property in properties)
            {
                var rowData = new Dictionary<string, string>();
                rowData.Add("name", property.name);
                rowData.Add("description", property.description);
    
                for (int i = 0; i < _ctps.Count; i++)
                {
                    CartesianTierPlotter plotter = _ctps[i];
    
                    //Calculate this tier's grid box from the property's location
                    var boxId = plotter.GetTierBoxId(property.latitude, property.longitude);
    
                    //Add the tier data to the row so that it reaches the indexer
                    rowData.Add(plotter.GetTierFieldName(), NumericUtils.DoubleToPrefixCoded(boxId));
                }
    
                data.Add(new SimpleDataSet()
                {
                    NodeDefinition = new IndexedNode()
                    {
                        NodeId = property.id,
                        Type = "CustomData"
                    },
                    RowData = rowData
                });
            }
            return data;
        }
    }
    

    It seems like the correct values are being generated but nothing shows up in the Examine index unless I manually add all fields into IndexUserFields in the ExamineIndex.config. I was under the impression that if no IndexUserFields were specified, it would index everything.

    So the following yields no data in the index:

      <IndexSet SetName="PropertyIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PropertyIndexSet" />
    

    ... but the following results in a fully populated index:

    <IndexSet SetName="PropertyIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/PropertyIndexSet">
      <IndexUserFields>
        <add Name="name"/>
        <add Name="description"/>
        <add Name="_tier_3"/>
        <add Name="_tier_4"/>
        <add Name="_tier_5"/>
        <add Name="_tier_6"/>
        <add Name="_tier_7"/>
        <add Name="_tier_8"/>
        <add Name="_tier_9"/>
        <add Name="_tier_10"/>
        <add Name="_tier_11"/>
        <add Name="_tier_12"/>
        <add Name="_tier_13"/>
        <add Name="_tier_14"/>
        <add Name="_tier_15"/>
      </IndexUserFields>
    </IndexSet>
    

    I don't fully understand how the Cartesian Plotters work, so I'm not particularly comfortable adding the tier fields manually in the config. I'd rather it just indexed everything without any configuration.

    All of the examples I can find which cover this topic (e.g. this blog article) suggest using the low level Lucene document writing events to populate the index, but they all involve indexing Umbraco content which is a different kind of implementation to indexing custom content - I can't reconcile the two approaches.

    Could anyone suggest how I can adapt my logic to write all data to the index without requiring manual configuration?

    Many thanks.

  • Ismail Mayat 3991 posts 8154 karma points admin mvp c-trib
    5 days ago
    Ismail Mayat
    100

    Dan,

    When building a custom indexer, you have to ensure the fields are added to the config under IndexUserFields, otherwise they will not be in the index. I have this working in a sample that I run through on the Examine course. It is for a custom db table which has longitude and latitude, and my config looks like:

    <IndexUserFields>
      <add Name="name" EnableSorting="true"/>
      <add Name="county" EnableSorting="true"/>
      <add Name="country"/>
      <add Name="grid_reference"/>
      <add Name="latitude"/>
      <add Name="longitude"/>
      <add Name="postcode_sector"/>
    </IndexUserFields>
    

    I do not add the tier fields in the config and it works. The difference between mine and yours is that you are adding the tiers in the indexer, whereas I add mine using the document writing event. See this gist: https://gist.github.com/ismailmayat/3902c660527c8b3d20b38ae724ab9892
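
    For readers who cannot open the gist, the document writing approach might look roughly like the sketch below. It is not a copy of the gist. It assumes the indexer is registered as "PropertyIndexer" (a made-up name), that the data service puts "latitude" and "longitude" into RowData as invariant-culture strings, and that both are listed under IndexUserFields as in the config above. The class name PropertyIndexEvents is also hypothetical.

    ```csharp
    using System.Collections.Generic;
    using System.Globalization;
    using Examine;
    using Examine.LuceneEngine;
    using Examine.LuceneEngine.Providers;
    using Lucene.Net.Documents;
    using Lucene.Net.Spatial.Tier.Projectors;
    using Lucene.Net.Util;
    using Umbraco.Core;

    // Hypothetical startup handler: adds the Cartesian tier fields straight to
    // the Lucene document, so they bypass the IndexUserFields filter.
    public class PropertyIndexEvents : ApplicationEventHandler
    {
        private const double KmsToMiles = 0.621371192;
        private readonly List<CartesianTierPlotter> _plotters = new List<CartesianTierPlotter>();

        protected override void ApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
        {
            IProjector projector = new SinusoidalProjector();
            var ctp = new CartesianTierPlotter(0, projector, CartesianTierPlotter.DefaltFieldPrefix);

            // Same range as the original post: 5000 km down to 1 km, converted to miles
            int startTier = ctp.BestFit(5000 * KmsToMiles);
            int endTier = ctp.BestFit(1 * KmsToMiles);
            for (int i = startTier; i <= endTier; i++)
            {
                _plotters.Add(new CartesianTierPlotter(i, projector, CartesianTierPlotter.DefaltFieldPrefix));
            }

            // "PropertyIndexer" is an assumed provider name; use the one from ExamineSettings.config
            var indexer = (LuceneIndexer)ExamineManager.Instance.IndexProviderCollection["PropertyIndexer"];
            indexer.DocumentWriting += OnDocumentWriting;
        }

        private void OnDocumentWriting(object sender, DocumentWritingEventArgs e)
        {
            string latText, lngText;
            double lat, lng;

            // Skip rows without a usable location rather than failing the whole rebuild
            if (!e.Fields.TryGetValue("latitude", out latText) ||
                !e.Fields.TryGetValue("longitude", out lngText) ||
                !double.TryParse(latText, NumberStyles.Float, CultureInfo.InvariantCulture, out lat) ||
                !double.TryParse(lngText, NumberStyles.Float, CultureInfo.InvariantCulture, out lng))
            {
                return;
            }

            foreach (CartesianTierPlotter plotter in _plotters)
            {
                double boxId = plotter.GetTierBoxId(lat, lng);

                // One field per tier, stored un-analysed so the prefix-coded value is kept intact
                e.Document.Add(new Field(
                    plotter.GetTierFieldName(),
                    NumericUtils.DoubleToPrefixCoded(boxId),
                    Field.Store.YES,
                    Field.Index.NOT_ANALYZED_NO_NORMS));
            }
        }
    }
    ```

    With something like this in place, the data service only needs to return the plain columns (name, description, latitude, longitude), and the config lists just those fields.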

  • Dan 1211 posts 3548 karma points admin c-trib
    5 days ago
    Dan
    1

    That has really connected the dots in my understanding and seems to be working nicely, thanks Ismail!

  • Ismail Mayat 3991 posts 8154 karma points admin mvp c-trib
    5 days ago
    Ismail Mayat
    0

    Just one thing: how often does the data change? When you add new data or update existing data you will need to rebuild the index, unless you handle it at the point of change and add or update the individual row.
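
    Handling it at the point of change could look something like the sketch below. It assumes the same made-up "PropertyIndexer" provider name and the "CustomData" index type from the original post. The PropertyIndexUpdater class is hypothetical, and you would call it from wherever your own code saves or deletes a row. Treat it as a starting point rather than a tested implementation.

    ```csharp
    using System.Collections.Generic;
    using System.Globalization;
    using System.Xml.Linq;
    using Examine;
    using Examine.LuceneEngine;

    // Hypothetical helper: pushes a single changed row into the index
    // instead of rebuilding the whole thing.
    public static class PropertyIndexUpdater
    {
        private const string IndexerName = "PropertyIndexer"; // assumed provider name
        private const string IndexType = "CustomData";        // matches the data service

        public static void Reindex(int id, string name, string description, double latitude, double longitude)
        {
            var rowData = new Dictionary<string, string>
            {
                { "name", name },
                { "description", description },
                { "latitude", latitude.ToString(CultureInfo.InvariantCulture) },
                { "longitude", longitude.ToString(CultureInfo.InvariantCulture) }
            };

            // Convert the row into the XML shape Examine expects for a single node
            XElement node = rowData.ToExamineXml(id, IndexType);

            // ReIndexNode replaces any existing document with the same id
            ExamineManager.Instance.IndexProviderCollection[IndexerName].ReIndexNode(node, IndexType);
        }

        public static void Remove(int id)
        {
            ExamineManager.Instance.IndexProviderCollection[IndexerName]
                .DeleteFromIndex(id.ToString(CultureInfo.InvariantCulture));
        }
    }
    ```

    If the tier fields are added in a document writing handler, they should be regenerated on each ReIndexNode call as well, since that event fires for every document written.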
