
  • AB 29 posts 80 karma points
    Jun 11, 2014 @ 10:10
    AB
    1

    Examine pdf search

    Hi All,

    I am putting together a query to return PDFs by name. I have followed many tutorials on the subject, but the query never returns any PDF results. The index is present and looks OK.

    Here is my code:

    var searcherPdf = ExamineManager.Instance.SearchProviderCollection["PDFSearcher"];
    var searchCriteriaPdf = searcherPdf.CreateSearchCriteria();
    var queryPdf = searchCriteriaPdf.Field("name", Request.QueryString["q"]).Or().Field("nodeName", Request.QueryString["q"]).Compile();
    var searchResultsPdf = searcherPdf.Search(queryPdf);

    Are there any obvious mistakes? Can anyone point me to a good example of a config that is known to work?

    Thanks

    Adam

  • Dave Woestenborghs 3504 posts 12133 karma points MVP 8x admin c-trib
    Jun 11, 2014 @ 12:32
    Dave Woestenborghs
    1

    Are the fields name and nodeName present in your PDF index?

    Can you post your Examine config files?

     

    Dave

  • AB 29 posts 80 karma points
    Jun 12, 2014 @ 10:01
    AB
    1

    Hi Dave,

     

    Here are the config files:

    <ExamineLuceneIndexSets>
      <!-- The internal index set used by Umbraco back-office - DO NOT REMOVE -->
      <IndexSet SetName="InternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/Internal/"/>
    
      <!-- The internal index set used by Umbraco back-office for indexing members - DO NOT REMOVE -->
      <IndexSet SetName="InternalMemberIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/InternalMember/">
        <IndexAttributeFields>
          <add Name="id" />
          <add Name="nodeName"/>
          <add Name="updateDate" />
          <add Name="writerName" />
          <add Name="loginName" />
          <add Name="email" />
          <add Name="nodeTypeAlias" />
        </IndexAttributeFields>
      </IndexSet>
        
      <IndexSet SetName="PDFIndexSet" IndexPath="~/App_Data/ExamineIndexes/PDFIndexSet/" />
      <!-- Default Indexset for external searches, this indexes all fields on all types of nodes-->
      <IndexSet SetName="ExternalIndexSet" IndexPath="~/App_Data/TEMP/ExamineIndexes/External/" />
      
    </ExamineLuceneIndexSets>

     <ExamineIndexProviders>
        <providers>
          <add name="InternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
               supportUnpublished="true"
               supportProtected="true"
               analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>
    
          <add name="InternalMemberIndexer" type="UmbracoExamine.UmbracoMemberIndexer, UmbracoExamine"
               supportUnpublished="true"
               supportProtected="true"
               analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>
    
            <!-- default external indexer, which excludes protected and unpublished pages-->
            <add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"/>
    
          <add name="PDFIndexer" type="UmbracoExamine.PDF.PDFIndexer, UmbracoExamine.PDF"
             umbracoFileProperty="umbracoFile" />
          <!--<add name="PDFIndexer" type="UmbracoExamine.PDF.PDFIndexer, UmbracoExamine.PDF"/>-->
    
          <!--<add name="ContentSearchIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine" />-->
          
        </providers>
      </ExamineIndexProviders>
    
      <ExamineSearchProviders defaultProvider="ExternalSearcher">
        <providers>
          <add name="InternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
               analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>
            
          <add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" />
          
          <add name="InternalMemberSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
               analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" enableLeadingWildcard="true"/>
    
          <add name="PDFSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" />
          
        </providers>
      </ExamineSearchProviders>

    Thanks in advance,

    Adam

     

     

  • Dave Woestenborghs 3504 posts 12133 karma points MVP 8x admin c-trib
    Jun 12, 2014 @ 10:31
    Dave Woestenborghs
    1

    Config seems okay.

    According to this post, the Examine PDF index contains only two fields that can be returned: http://our.umbraco.org/forum/developers/api-questions/35922-Problems-with-PDFIndexer-PDFSearcher

    • FileTextContent
    • __NodeId (yes, that's two underscores)

    So you need to update your search query.

    Dave
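
    As a rough sketch of what the updated query might look like, the snippet below searches the FileTextContent field instead of name/nodeName. It assumes the "PDFSearcher" provider from the config posted above, that it runs somewhere Request is available (a view or controller), and that SearchResult.Id carries the media node id stored as __NodeId. It is a fragment rather than a standalone program and has not been run against this setup.

    ```csharp
    using Examine;

    // Assumption: "PDFSearcher" is the searcher name from the config above.
    var searcherPdf = ExamineManager.Instance.SearchProviderCollection["PDFSearcher"];
    var term = Request.QueryString["q"];

    // Skip the search entirely when no search term was supplied.
    if (!string.IsNullOrWhiteSpace(term))
    {
        // Query the extracted PDF text, since "name" and "nodeName"
        // are reportedly not present in the PDF index.
        var queryPdf = searcherPdf.CreateSearchCriteria()
            .Field("FileTextContent", term)
            .Compile();

        foreach (var result in searcherPdf.Search(queryPdf))
        {
            // Assumed to be the media node id (indexed as __NodeId);
            // use it to look up the media item for its name and URL.
            var mediaId = result.Id;
        }
    }
    ```

    Since the index holds only the file text and the node id, the PDF's name would have to be looked up from the media item by id rather than read from the search result.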
  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    Jun 12, 2014 @ 10:52
    Ismail Mayat
    1

    Adam,

    Did you look at the index with Luke or Examine Inspector to check that it actually has data in it?

    Regards

    Ismail

  • Matt Taylor 873 posts 2086 karma points
    Sep 08, 2015 @ 08:22
    Matt Taylor
    1

    Hello all,

    Is PDF searching in the core of Umbraco 7?

    I've not seen a great deal about how to use it and seen many OLD posts about people doing it with 3rd party packages so I'm just looking for confirmation before I start trying to make it work, probably by copying the code above.

    Kind regards,

    Matt

  • Ismail Mayat 4511 posts 10090 karma points MVP 2x admin c-trib
    Sep 08, 2015 @ 08:51
    Ismail Mayat
    2

    Examine does have a PDF indexer and searcher, although you have to add them to the config to set them up; the configs in the posts above show how to do it. The other option, which is not out of the box, is the cogmedia indexer I wrote many moons ago, which was aimed more at indexing PDFs and other formats such as Word and Excel, to name a few.

    Regards

    Ismail

  • Matt Taylor 873 posts 2086 karma points
    Sep 08, 2015 @ 09:20
    Matt Taylor
    1

    Thanks Ismail,

    Yes I've used the cogworks package for PDF indexing in the past. It was good. :-)

    This post led me to believe it was finally working in the core, but I wasn't sure. I tried using it in the core many moons ago but got nowhere, hence the use of your package.

    I'll give it another bash now, on your say so. ;-)

    Kind regards,

    Matt
