Adis Jugo

The Southern Side – SharePoint thoughts and bytes

Including duplicates in SharePoint search results

If you have two very similar files, it can happen that the SharePoint search API recognizes them as duplicate files.

This happens because of the iFilter architecture on which SharePoint Search relies: the plain text is first extracted from each file, and then this text is indexed.

Now, if two texts are very similar, SharePoint is going to consider them duplicates. So if you have a PowerPoint presentation and a Word document with very similar content, it can happen that only one of them is found. The other is considered a duplicate…

Well, in the "Search results" web part on the search center site, you can clear the "Remove duplicate results" checkbox.


In code, when using the SharePoint Search API, you have to set the "TrimDuplicates" property of your query object to false:

using Microsoft.Office.Server.Search.Query;

// create a new FullTextSqlQuery instance
FullTextSqlQuery myQuery = new FullTextSqlQuery(m_SharePointSite);

//...

// request the relevant results and keep duplicates in the result set
myQuery.ResultTypes = ResultType.RelevantResults;
myQuery.TrimDuplicates = false;

// execute the query; the returned tables can then be loaded into a DataTable
ResultTableCollection queryResults = myQuery.Execute();

Please pay attention to the line that sets myQuery.TrimDuplicates to false…
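For completeness, here is a minimal sketch of how the whole call could look from opening the site to getting a DataTable back. It is not taken from a real project: the class name, the method name, the selected columns and the FREETEXT query are assumptions made for illustration only, and it needs to run on a server where the MOSS search assemblies are available.

```csharp
using System.Data;
using Microsoft.Office.Server.Search.Query;
using Microsoft.SharePoint;

// Hypothetical helper class, for illustration only.
public static class DuplicateAwareSearch
{
    // siteUrl and keyword are supplied by the caller; both are assumptions of this sketch.
    public static DataTable Search(string siteUrl, string keyword)
    {
        using (SPSite site = new SPSite(siteUrl))
        using (FullTextSqlQuery query = new FullTextSqlQuery(site))
        {
            // Example query text (assumed): full-text match on the default properties.
            // Single quotes are doubled as a naive way to keep the SQL string valid.
            query.QueryText =
                "SELECT Title, Path FROM SCOPE() " +
                "WHERE FREETEXT(DEFAULTPROPERTIES, '" + keyword.Replace("'", "''") + "')";

            query.ResultTypes = ResultType.RelevantResults;

            // The important line: do not collapse near-identical documents.
            query.TrimDuplicates = false;

            ResultTableCollection results = query.Execute();

            // ResultTable implements IDataReader, so it can be loaded directly.
            ResultTable relevant = results[ResultType.RelevantResults];
            DataTable table = new DataTable();
            table.Load(relevant, LoadOption.OverwriteChanges);
            return table;
        }
    }
}
```

Running the same search once with TrimDuplicates set to true and once with false, and comparing the row counts of the two tables, is a simple way to see whether duplicate trimming affects a given query at all.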

And that would be all…

Wed, February 11 2009 » Development


2 Responses

  1. KattyBlackyard July 15 2009 @ 17:05

    The article is usefull for me. I’ll be coming back to your blog.

  2. sharepoint November 20 2009 @ 16:31

    this property doesnt seem to be working for me. my fulltextsqlquery really has only one condition – “checkoutuser = ‘lastname, firstname’”. when i run the equivalent in sharepoint search – “checkoutuser:’lastname, firstname’” it returns at least 1 doc with a ‘view duplicates’ option. i assume this is because it has 1 duplicate. but maybe i’m wrong. setting trimduplicates = true in my fulltextquery is not changing the result. i’m not seeing this duplicate document. my understanding of sharepoint is a little strained because i’m not constantly high on lsd, so maybe i’m not understanding right, but shouldnt this trimduplicates option affect whether or not i see such a ‘duplicate’ document? what actually is the effect of this property?
