15 December 2010

Relational Data 1 - Indexing

Although a bit late (I have been quite busy), here is the first post about relational data.
As I described earlier, you might want to perform relational operations on data that is structured as a hierarchy in Sitecore. The example was news that needs to be sorted and listed on the front page or similar, and that is the one I am going to address first.

Most of the time when I stumble upon issues like this, I solve them with indexing, that is, by building a relational data set from the hierarchy. When working for clients, I have usually solved this using Ankiro. However, Ankiro costs money, so I thought I would try it out with Lucene. Below you can see an example of how to do this.

Please note that by default Lucene doesn’t index on publish. Instead, the index is updated by a scheduled task that by default runs every five minutes. However, it is possible to customize Lucene to index on the publish event.

Well, here is the technical description:
First of all you need to create the index containing the data you want to use as relational data. This is done pretty easily in web.config. Further reading is available on SDN (http://sdn5.sitecore.net/FAQ/API/Indexing%20a%20Database%20with%20Lucene,-d-,NET.aspx).
Basically you define the index in the <indexes> element:



  <index id="newsIndex" singleInstance="true" type="Sitecore.Data.Indexing.Index, Sitecore.Kernel">
    <param desc="name">$(id)</param>
    <templates hint="list:AddTemplate">
      <!-- Enter the id of your news template -->
      <template>{96479B71-2C0B-4729-8301-080145F28FA6}</template>
    </templates>
    <fields hint="raw:AddField">
      <!-- add the fields you want in your index -->
      <field target="created">__created</field>
      <field target="updated">__updated</field>
      <field target="author">__updated by</field>
      <field target="published">__published</field>
      <field target="name">@name</field>
      <field target="id">@id</field>
      <field target="Title">Title</field>
      <field target="Abstract">Abstract</field>
      <field target="Release date">Release date</field>
    </fields>
  </index>

This defines the index. Now add it to the web database:



  <database id="web" singleInstance="true" type="Sitecore.Data.Database, Sitecore.Kernel">
  ...
    <indexes hint="list:AddIndex">
      <index path="indexes/index[@id='newsIndex']"/>
    </indexes>


You can now rebuild indexes for the web database through the Control Panel in Sitecore.

Now that you have the index, you can do your relational operations. For instance, you might want to sort by date. Here is some sample code that retrieves a given number of news items from the index, sorted by the “Release date” field. (I don’t like to use the system fields __updated or __created, as these might change when you upgrade or similar.)





using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Sitecore.Data;
using Sitecore.Data.Indexing;

// Place this method inside a class of your choice
public Document[] GetLatestNews(int numberOfNews)
{
  // Get the Lucene IndexSearcher for the news index on the web database
  Index newsIndex = Sitecore.Configuration.Factory.GetIndex("newsIndex");
  Database web = Sitecore.Configuration.Factory.GetDatabase("web");
  IndexSearcher searcher = newsIndex.GetSearcher(web);

  try
  {
    // Sort by the release date; true reverses the order so the newest comes first
    Sort sort = new Sort("Release date", true);

    // Match all documents that have a value in the field Release date
    Query query = new WildcardQuery(new Term("Release date", "*"));

    // Get all the hits sorted
    Hits hits = searcher.Search(query, sort);

    // How many documents should be returned
    int noOfNews = hits.Length() > numberOfNews ? numberOfNews : hits.Length();
    Document[] docs = new Document[noOfNews];

    // Copy the first noOfNews hits into the result array
    for (int i = 0; i < noOfNews; i++)
    {
      docs[i] = hits.Doc(i);
    }
    return docs;
  }
  finally
  {
    // Close the searcher once the documents have been read
    searcher.Close();
  }
}


This returns the latest news, which you can list in your spot or whatever you want to do.
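To list the result you need to turn the stored field values back into something presentable. Here is a minimal sketch of that step. It assumes the index holds the raw Sitecore value of the date field, which normally looks like 20101215T103000 (this is also why sorting on the field as text gives chronological order). The class name NewsFormatting and both method names are made up for this example and are not part of Sitecore or Lucene.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for this example; not part of Sitecore or Lucene.
public static class NewsFormatting
{
    // Assumption: the raw value starts with a date in the form yyyyMMddTHHmmss,
    // e.g. "20101215T103000". Anything else yields DateTime.MinValue.
    public static DateTime ParseReleaseDate(string raw)
    {
        if (string.IsNullOrEmpty(raw) || raw.Length < 15)
        {
            return DateTime.MinValue;
        }

        DateTime parsed;
        bool ok = DateTime.TryParseExact(
            raw.Substring(0, 15),
            "yyyyMMdd'T'HHmmss",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out parsed);

        return ok ? parsed : DateTime.MinValue;
    }

    // Builds one line of a news list from the values stored in the index.
    public static string FormatNewsLine(string title, string rawReleaseDate)
    {
        DateTime released = ParseReleaseDate(rawReleaseDate);
        return released.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture) + " " + title;
    }
}
```

For each Document returned by GetLatestNews you would pass in doc.Get("Title") and doc.Get("Release date"), using the target names from the index definition in web.config.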

The advantages of using indexes are:
You get the advantages of a hierarchy in the backend.
The end user doesn’t feel or see any technology other than Sitecore.
It is simple and easy to implement and maintain.

The disadvantages of using indexes are:
It is replicated data. This takes up more space, and I often find inconsistencies between the indexes and the Sitecore data.
You get the disadvantages of a hierarchy in the backend.


