  • Morten Kjerulff 7 posts 27 karma points
    Mar 09, 2010 @ 15:15
    Morten Kjerulff
    0

    Umbraco Examine Search Setup

    Hi.

    I am trying to set up search using Examine in Umbraco 4.0.3 as instructed on http://farmcode.org/?tag=/umbraco+examine

    However, it seems that the index is not updated when I publish content. The index file "segments" is only modified when I specify the index path in the web.config.

    There is another index file in the data/_systemUmbracoIndexDontDelete directory, which is modified when I publish new content.

    Am I using some old documentation, which is invalid?

    Cheers
    Morten

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Mar 09, 2010 @ 22:51
    Aaron Powell
    0

    Which release of Examine are you using?

    Umbraco 4.0.x had its own Lucene.NET indexer in it, which is what generates the /data/_system... folder.

    Can you post your Examine config?

  • Morten Kjerulff 7 posts 27 karma points
    Mar 10, 2010 @ 09:33
    Morten Kjerulff
    0

    Thanks for the quick reply :-)

    I am using version 4 (UmbracoExaminev4).

    The configuration is as follows:

    <configuration> 
        <configSections>
         .
         .

        <section name="UmbLuceneIndex" type="TheFarm.Umbraco.Lucene.Common.Configuration.IndexSets, TheFarm.Umbraco.Lucene.Common" />

       </configSections>

        <!-- DefaultIndexSet,EnableDefaultActionHandler:  REQUIRED -->
        <UmbLuceneIndex DefaultIndexSet="forside" EnableDefaultActionHandler="true">

            <!-- REQUIRED: MaxResults,IndexPath,SetName -->
            <!-- NOT Required: IndexParentId. If not specified then all documents are indexed, otherwise only documents that are children of the id are indexed -->
            <IndexSet SetName="forside" IndexPath="~/data/UmbracoExamine/" MaxResults="100">
                <IndexUmbracoFields>
                    <add Name="id" />            <!-- REQUIRED -->
                    <add Name="nodeName" />      <!-- REQUIRED -->
                    <add Name="updateDate" />
                    <add Name="writerName" />
                    <add Name="path" />
                    <add Name="nodeTypeAlias" /> <!-- REQUIRED -->
                </IndexUmbracoFields>
                <!-- The User defined fields to be indexed and searched. The UmbracoIndexer has methods to override the fields to be searched. -->
                <IndexUserFields>
                    <add Name="pageTitle"/>
                    <add Name="bodyText"/>
                </IndexUserFields>
                <!-- IncludeNodeTypes not required. If not specified, the indexer will index ALL document types-->
                <IncludeNodeTypes>
                    <add Name="Forside"/>
                </IncludeNodeTypes>
               
                <!-- ExcludeNodeTypes not required. If specified, these node types will not be indexed. -->
                <ExcludeNodeTypes/>
            </IndexSet>

        </UmbLuceneIndex>

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Mar 10, 2010 @ 09:45
    Aaron Powell
    0

    You're using a very old version of Examine, we don't support that one any more.

    Grab the latest from codeplex. It's got examples in the source of how you can use it - http://umbracoexamine.codeplex.com/

  • Morten Kjerulff 7 posts 27 karma points
    Mar 10, 2010 @ 10:05
    Morten Kjerulff
    0

    Thanks again.

    It seems I am not able to find a newer version than the one from July 2, 2009?

  • Morten Kjerulff 7 posts 27 karma points
    Mar 10, 2010 @ 10:15
    Morten Kjerulff
    0

    Ok, I think I found it under "Source Code". Will try it out...

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Mar 10, 2010 @ 10:16
    Aaron Powell
    0

    Shan hasn't put any new releases up on codeplex to download for quite some time. You're best off downloading the source and compiling it yourself.

    It also gives you the demos to play with that way ;).

    Out-of-band releases are low on the priority list recently as we're including it as part of 4.1 so it keeps being dropped with that.

  • Nalle Jacobsson 32 posts 55 karma points
    Mar 11, 2010 @ 17:02
    Nalle Jacobsson
    0

    I downloaded the latest source, and I have got it up and running... almost. Two problems:

    1. The index doesn't update on publish (enableDefaultEventHandler is set to true); it only works if I manually call ExamineManager.Instance.RebuildIndex();

    2. As far as I can tell, it completely ignores the custom properties that I specify in IndexUserFields. I only get hits on the standard Umbraco fields, like nodeName.

    I would be extremely happy for some guidance :)

    Thanks

    /N

  • Morten Kjerulff 7 posts 27 karma points
    Mar 11, 2010 @ 20:43
    Morten Kjerulff
    0

    Hi Nalle.

    My indexes are updated when I publish content.
    My config file looks like the one provided with the source in the test project. However, I noticed at one point that the indexes were not updated after I had been inactive for some time. I then logged out and back in as administrator, and voila, the indexes were updated again.

    However, any search I make returns 0 results, even if I search on the standard Umbraco fields.
    My search is done with the command:

    var results = ExamineManager.Instance.Search("query", 100, true);

    Does anyone know if there is a way to see what is actually indexed?

    Cheers
    Morten

  • Nalle Jacobsson 32 posts 55 karma points
    Mar 11, 2010 @ 21:46
    Nalle Jacobsson
    0

    Hi Morten!

    I have some answers :)

    The problem with custom properties not being indexed turned out to be a bug (Umbraco versions below 4.1). I managed to find it and created a patch for it on CodePlex. I also found another bug and created a patch for that too. Head over and download :)

    Logging out and back in as administrator solved my problem with the indexes not updating. I have no idea why that does the trick, but thanks :)

    And yes, there is a way to see what's indexed: you can do it with Luke (a Java program)

    http://www.getopt.org/luke/

    Just run the Web Start version and point it to your index folder.

    Good luck :)

    /N

  • Morten Kjerulff 7 posts 27 karma points
    Mar 11, 2010 @ 21:50
    Morten Kjerulff
    0

    Hi Nalle!

    Thanks a lot :-)
    Will get the patch right away...

    Cheers
    Morten

  • Shannon Deminick 1526 posts 5272 karma points MVP 2x
    Mar 11, 2010 @ 23:01
    Shannon Deminick
    1

    Yes, Luke is a great tool.

    Make sure that you use the very latest version on CodePlex from the source code tab. I'll post up a new release shortly with documentation.

    Once you update to the new version and follow the config setup found in the demo/test project, you'll want to do a couple of things to ensure that the index is re-created:

    Find your Examine data path (by default this is App_Data/ExamineIndexes) and delete that folder completely. Examine will recreate all necessary folders for you. Once your config is set up again, just publish a node and it will rebuild the entire thing.

    There IS a bug in 4.0.3 and below where event listeners (IApplications) aren't always instantiated. This is fixed in 4.1, and I'm sure an update will be released for pre-4.1 versions to fix this. If you're affected by the bug, one way to solve it is to remove the Examine DLL from the bin folder, visit a page on your site (this restarts the app pool), then copy the DLL back in and visit a page again. The bug is due to Umbraco trying to find IApplications that are already loaded into its app pool when it should be looking for all IApplications in all of the DLLs. When you move the DLL out and then back in, Umbraco will load that IApplication into its app pool and wire up all of the Examine events. This is a weird bug and doesn't happen all of the time. It will also affect any other packages or custom code that use IApplication.

    Hope this helps.

     

  • Nalle Jacobsson 32 posts 55 karma points
    Mar 12, 2010 @ 00:06
    Nalle Jacobsson
    0

    Thanks for the info Shannon!

    I didn't know about the bug with event listeners in 4.0.3, must have missed that! It explains the problems I have been having with events in a couple of other packages. Is there a patch for this?

    Indexing works great now. I'm working on a site with 15,000 documents and growing, due for release in 2 weeks. Can Examine handle that (with the proper hardware, of course)?

    Cheers,

    /N

  • Shannon Deminick 1526 posts 5272 karma points MVP 2x
    Mar 15, 2010 @ 11:39
    Shannon Deminick
    0

    Examine should be able to handle TONS of information. Underneath it's Lucene, which is very powerful. Have a look at some Lucene benchmarks; there are tons around (lots are based on the Java version, but that shouldn't make any difference).

    The only thing to note with performance is the optimization, which in the latest version runs every 100 commits and on app pool startup. I would have to believe that if your index is really big, optimization might become a bit slower (I'm not 100% positive though). Older versions of Examine don't deal with optimization very well (they try to optimize on every commit), so please make sure you use the latest check-in. I'll hopefully have a release and documentation up in a week or so (just have to get the rest of my Umbraco 4.1 things done too). From memory, I think I made the optimize threshold (100) configurable, but this will all be in the docs in due time.

    I'm glad it's working well for you, and please keep the feedback coming in.

     

  • Davy Meybos 11 posts 96 karma points
    Mar 31, 2010 @ 08:02
    Davy Meybos
    0

    Hi Nalle,

    Can you tell me where to find the umbraco-lucene custom properties patch?

     

    Thanks!

  • Neil Tootell 73 posts 118 karma points
    May 13, 2010 @ 13:11
    Neil Tootell
    0

    Hi there, there doesn't seem to be a specific forum for this, so here goes.

    I'm trying to restrict my Examine searches to a specific nodeTypeAlias and not having much luck. Here is my code...

    ****************

    string[] searchFields = { "nodeName", "TitleText", "Summary", "BodyText", "MetaKeywords", "MetaDescription" };
    string restrictedNodeTypeAlias = "SMPublication";

    // sc (the search criteria) and queryText (the search terms) are set up earlier; not shown here
    var examineQuery = sc
        .NodeTypeAlias(restrictedNodeTypeAlias.MultipleCharacterWildcard())
        .And()
        .GroupedOr(searchFields, queryText)
        .Compile();

    var searchResults = ExamineManager.Instance.SearchProviderCollection["SiteIndex"].Search(examineQuery);

    ****************

    I'm using Examine release 52101 on Umbraco 4.0.3. I've looked at the testing examples and as far as I can see I've used the correct syntax. Can anyone spot if I've done something stupid here?

    Many thanks

    Neil

  • Aaron Powell 1708 posts 3046 karma points c-trib
    May 14, 2010 @ 00:49
    Aaron Powell
    0

    Are you getting too many results back, or mismatched results, or what?

    Can you post the Lucene search query that is generated? (If you do searchCriteria.ToString() you'll see it; it also appears in the debugger.)

  • Nalle Jacobsson 32 posts 55 karma points
    May 22, 2010 @ 13:22
    Nalle Jacobsson
    0

    Upgraded to the latest check-in and got everything working, nice! I can't find the max record count parameter anymore though. It's still in the docs, but not in the code?

    /N

  • Aaron Powell 1708 posts 3046 karma points c-trib
    May 23, 2010 @ 04:52
    Aaron Powell
    0

    The max results setting is no longer valid. Due to the internal design of Examine it hasn't done anything for the last few builds, so we've dropped it.

  • Nalle Jacobsson 32 posts 55 karma points
    May 25, 2010 @ 16:22
    Nalle Jacobsson
    0

    Alright, all good then :)

    Happy to report that Umbraco and Examine are now powering a major Swedish site: http://stureplan.se

    After a lot of hard work we finally switched from the old platform yesterday, so far so good!

    /N

  • Aaron Powell 1708 posts 3046 karma points c-trib
    May 26, 2010 @ 00:56
    Aaron Powell
    0

    Awesome to know :D

  • Marc Aarts 14 posts 35 karma points
    May 26, 2010 @ 20:17
    Marc Aarts
    0

    I've been following Examine, trying to understand how to use it in Umbraco 4.0.x, and playing with the tests in the source code package.

    Nalle, when I search for something like 'djungelfeber' on your site, I would expect at least one result. Is the index not updated regularly?

  • Nalle Jacobsson 32 posts 55 karma points
    Jun 01, 2010 @ 00:12
    Nalle Jacobsson
    0

    Thanks for pointing this out, Marc. It seems indexing is not working properly at the moment. On save/publish we now get this in the log:

     

    Error indexing node: Lucene.Net.Store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@D:\web\applications\stureplan.se\wwwroot1\data\_systemUmbracoIndexDontDelete\write.lock
       at Lucene.Net.Store.Lock.Obtain(Int64 lockWaitTimeout)
       at Lucene.Net.Index.IndexWriter.Init(Directory d, Analyzer a, Boolean create, Boolean closeDir, IndexDeletionPolicy deletionPolicy, Boolean autoCommit, Int32 maxFieldLength, IndexingChain indexingChain, IndexCommit commit)
       at Lucene.Net.Index.IndexWriter..ctor(String path, Analyzer a, Boolean create)
       at umbraco.cms.businesslogic.index.Indexer.ContentIndex(Boolean ForceRecreation)
       at umbraco.cms.businesslogic.index.Indexer.IndexNode(Guid ObjectType, Int32 Id, String Text, String UserName, DateTime CreateDate, Hashtable Fields, Boolean Optimize)
       at umbraco.cms.businesslogic.web.Document.Index(Boolean Optimze)


    Any ideas?

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Jun 01, 2010 @ 15:31
    Aaron Powell
    0

    That's nothing to do with Examine; it comes from the indexer which is built into the Umbraco 4.0 source code.

    The easiest way I've found to solve that problem is to add a handler to the before-indexing event (I think it's on content; I don't have Umbraco open so I can't be 100% sure) and then cancel that event.

    We've had to use it in the past when dealing with lots of dynamic node creation.

  • Martin Griffiths 826 posts 1269 karma points c-trib
    Sep 06, 2010 @ 11:56
    Martin Griffiths
    0

    Was there ever a patch of sorts available anywhere to get Examine working in Umbraco 4.0.x with custom indexes?

    As soon as we switch from classic pipeline mode to integrated mode on our Win2008 box, the events fail to complete. The folders are built and the queue files appear, but then it doesn't go any further.

    This is becoming quite a desperate situation for us in our current build.

    Martin

     

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Sep 06, 2010 @ 13:48
    Aaron Powell
    1

    Try changing the config to runAsync="false" on the indexer. It'll then log to the Umbraco log table and you'll be able to see what the problem is.

    But we're running Examine on a few 4.0 sites with no problem.
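
    For anyone trying the runAsync suggestion, the attribute would sit on the indexer's provider entry in the Examine config, roughly as sketched below. This is only an illustration: the element names are assumed to match the demo/test project config, and the provider name and type string are placeholders, not values from this thread. Keep whatever your own config already has and only add the attribute.

    ```xml
    <!-- Sketch only. "YourIndexerName" and the type value are placeholders;
         copy the real ones from your existing config or the demo project. -->
    <ExamineIndexProviders>
      <providers>
        <add name="YourIndexerName"
             type="(indexer type string from the demo project config)"
             runAsync="false" />
      </providers>
    </ExamineIndexProviders>
    ```

    With indexing running synchronously, any errors should show up in the Umbraco log table at publish time, as described above.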

  • Martin Griffiths 826 posts 1269 karma points c-trib
    Sep 06, 2010 @ 14:51
    Martin Griffiths
    0

    Hi slace

    Can you confirm what server and what app pool you run your sites under?

    The problem only seems to happen in integrated mode, classic is fine.

    We would prefer to use integrated mode because it's recommended by Umbraco and there's all sorts of functionality which only works in this mode.

    Martin

  • Aaron Powell 1708 posts 3046 karma points c-trib
    Sep 06, 2010 @ 15:13
    Aaron Powell
    0

    The majority of our servers are still IIS6, and the ones I can think of that use Examine in 4.0 are on IIS6.

    I've never seen sites run into problems in Classic mode on IIS7 though. I believe that the sites we have on IIS7 are a mixture of classic and integrated app pools.

    The problem that Nalle mentioned isn't related to Examine though; that was from the old Lucene.Net implementation.
