Fusion 3.1.0: How to Use The REST Query Index Pipeline Stage


Fusion 3.1, everybody! The following may or may not work on past, or future, versions of Fusion.

Don’t have it? Go get it! Don’t make the cat do all the work.

Question: How do I use the Fusion REST Query index pipeline stage to add additional metadata to an inbound document?

Answer: This assumes that a Solr collection holding the metadata exists and that Fusion knows about it: either use the default Solr cluster that runs within Fusion, or make sure that the external Solr cluster you are using is registered with Fusion.

The basic steps:

  • create a collection to store the metadata and populate it with the metadata of your choice
  • create a collection which will hold new enhanced content with additional metadata from the first collection
  • configure the index pipeline of the second collection to include the REST Query stage which will make a query to the first collection and add some content to the current inbound content of the second collection

Some detail:

You are going to create 2 collections: one to store the metadata and another to store the enhanced document.

Let’s name the collection that will store the metadata metadata.

Let’s name the collection that will use the metadata to enhance the content enhanced-content.

Index the following CSV into collection metadata (call the file metadata.csv and index it using the Local File System connector and a parser pipeline with just the CSV Parser):

name_s,rank_s,serial_number_s
Abe,123,SN1111111
Bob,456,SN2222222
Chloe,789,SN3333333

Save this CSV as enhanced.csv:

title,author,publisher
1984,Chloe,Random House

Go to collection enhanced-content. Create a Local File System datasource (call it anything you like; how about enhanced-ds?) and configure it to index the enhanced.csv file. Don’t run it yet. (If you want to check that the file makes it into the collection, run it once and then use Clear Datasource.)

Open the index pipeline for enhanced-content (which should be enhanced-content-default).

Add the REST Query pipeline stage. Configure it as follows:

Endpoint URI: solr://metadata/select

Call method: get

Query parameters:

Property Name: q

Property Value: name_s:${author}

Property Name: fl

Property Value: name_s,rank_s
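
To make the substitution concrete, here is a rough Python sketch of what the two query parameters turn into for the 1984 document. It is only an illustration: Fusion does its own template substitution inside the stage, and Python's string.Template is used here just because it happens to share the ${...} syntax. The build_query_params helper is a made-up name.

```python
from string import Template
from urllib.parse import urlencode


def build_query_params(doc: dict) -> dict:
    """Fill ${author} from the inbound document (hypothetical helper)."""
    # Same template as the "q" Property Value in the stage configuration.
    q = Template("name_s:${author}").substitute(doc)
    # "fl" limits the fields Solr returns, as in the stage configuration.
    return {"q": q, "fl": "name_s,rank_s"}


# The single row of enhanced.csv, as a dictionary.
doc = {"title": "1984", "author": "Chloe", "publisher": "Random House"}

params = build_query_params(doc)
query_string = urlencode(params)
print(query_string)  # q=name_s%3AChloe&fl=name_s%2Crank_s
```

In other words, the stage asks the metadata collection for the document whose name_s is Chloe and requests only name_s and rank_s back.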

Mapping of Returned Values (as XPath Expressions) to Document Fields

XPath Expression: //result/doc/str[@name='name_s']/text()

Target Field: name_s

XPath Expression: //result/doc/str[@name='rank_s']/text()

Target Field: rank_s
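
If you want to see what those two XPaths pick out, the following Python sketch runs equivalent expressions against a hand-written sample shaped like a Solr XML response. The sample XML and the apply_mappings helper are assumptions made for illustration, not captured Fusion output. Python's ElementTree only supports a subset of XPath, so the sketch selects the str element and reads its .text instead of using text().

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a Solr XML response (an assumption,
# not real output from the metadata collection).
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="name_s">Chloe</str>
      <str name="rank_s">789</str>
    </doc>
  </result>
</response>"""

# Target field -> ElementTree-compatible form of the XPath in the stage.
MAPPINGS = {
    "name_s": ".//result/doc/str[@name='name_s']",
    "rank_s": ".//result/doc/str[@name='rank_s']",
}


def apply_mappings(xml_text: str, mappings: dict) -> dict:
    """Return {target_field: value} for every mapping that matches."""
    root = ET.fromstring(xml_text)
    fields = {}
    for target, path in mappings.items():
        node = root.find(path)
        if node is not None:
            fields[target] = node.text
    return fields


fields = apply_mappings(SAMPLE_RESPONSE, MAPPINGS)
print(fields)  # {'name_s': 'Chloe', 'rank_s': '789'}
```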

Your configuration should look more or less like the settings listed above.

Save the REST Query stage and make sure it is before the Solr Indexer stage.

Save the enhanced-ds datasource.

What should happen when you start the enhanced-ds datasource:

  • the datasource reads the enhanced.csv file
  • it picks up the first (and only) document in the file
  • the document enters the REST Query stage
  • the stage executes a query against the Solr collection called metadata
  • Solr returns the results from the metadata collection
  • the stage parses the returned XML with the 2 XPaths listed and copies the values into the name_s and rank_s fields of the document
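
The steps above amount to a lookup-and-merge. Here is a small Python sketch that simulates the result locally with the two CSV files from this post, with no Fusion or Solr involved. The load_metadata and enrich helpers are hypothetical names, and the in-memory dictionary stands in for the query against the metadata collection.

```python
import csv
import io

METADATA_CSV = """name_s,rank_s,serial_number_s
Abe,123,SN1111111
Bob,456,SN2222222
Chloe,789,SN3333333
"""

ENHANCED_CSV = """title,author,publisher
1984,Chloe,Random House
"""


def load_metadata(text: str) -> dict:
    """Index the metadata rows by name_s (stand-in for the Solr collection)."""
    return {row["name_s"]: row for row in csv.DictReader(io.StringIO(text))}


def enrich(doc: dict, metadata: dict, fields=("name_s", "rank_s")) -> dict:
    """Copy the selected metadata fields onto a copy of the inbound document."""
    enriched = dict(doc)
    match = metadata.get(doc["author"])  # plays the role of q=name_s:${author}
    if match is not None:
        for field in fields:
            enriched[field] = match[field]
    return enriched


metadata = load_metadata(METADATA_CSV)
enriched_docs = [
    enrich(row, metadata) for row in csv.DictReader(io.StringIO(ENHANCED_CSV))
]
print(enriched_docs[0])
```

The printed document should carry the original title, author and publisher plus name_s and rank_s, which is what you are looking for in the Query Workbench.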

Go run the enhanced-ds datasource! Go to the Query Workbench and check that the fields have been added to the document!

Let me know if you run into any problems!

Disclosures

Carlos Valcarcel is a full time employee of LucidWorks, but lives in New York as he prefers hurricanes to earthquakes. Having worked at IBM, Microsoft, and Fast Search and Transfer the only thing he is sure of is that the font editor he wrote on his Atari 800 was the coolest program he has ever written. While questions can be a drag he admits that answers will be harder to give without them.

The cat isn’t real, but then neither are you. Enjoy your search responsibly.
