Category Archives: tutorial

Fusion 4: Regex Field Replacement Index Pipeline Stage


This will be a short one. At least the cat hopes so.

The question of how to change a date from something like 2020-04-01 into something less foolish (like 2020-04) came up recently and I couldn’t help but feel the pull of a simplistic solution (simple as well, but simplistic was the draw). The same trick applies to numerous scenarios where a string and its component parts might be better off rearranged (kind of like a kaleidoscope without all the colors).

As Fusion 4 is available I will be using that for this example.
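For the impatient, the underlying idea can be sketched outside of Fusion in a few lines of Python. This is only an illustration of the regex capture-and-rearrange trick, not the stage configuration itself; the pattern, the function name, and the choice to leave non-matching values alone are all my assumptions, and the stage's own pattern and replacement syntax may differ.

```python
import re

# Hypothetical pattern: capture the year and the month, match (but drop) the day.
DATE_PATTERN = re.compile(r"^(\d{4})-(\d{2})-\d{2}$")


def trim_to_month(value: str) -> str:
    """Rewrite YYYY-MM-DD as YYYY-MM; values that do not match are returned unchanged."""
    # \1 and \2 refer back to the captured year and month groups.
    return DATE_PATTERN.sub(r"\1-\2", value)


print(trim_to_month("2020-04-01"))  # 2020-04
```

The same capture groups could just as easily be reordered in the replacement, which is where the kaleidoscope comparison comes from.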

Fusion 3.1: Multi-term synonyms!


Yes, my example is going to be trivial.

No, the cat is not happy with that.

Yes, I am doing it anyway.

With the advent of Solr 6.5 we have (drum roll, please) multi-term synonym support! Yes! Do that happy dance, but remember not to scuff up the floor too much.

Let’s run through a trivial example to show it off.

Fusion 3.1.0: How to Use The REST Query Index Pipeline Stage


Fusion 3.1, everybody! The following may or may not work on past, or future, versions of Fusion.

Don’t have it? Go get it! Don’t make the cat do all the work.

Question: How do I use the Fusion REST Query index pipeline stage to add additional metadata to an inbound document?

Answer: This assumes the existence of a Solr collection with metadata and that Fusion knows of its existence (that means either use the default Solr cluster that runs within Fusion or make sure that the external Solr cluster you are using is registered with Fusion).

The basic steps:

  • create a collection to store the metadata and populate it with the metadata of your choice
  • create a collection which will hold new enhanced content with additional metadata from the first collection
  • configure the index pipeline of the second collection to include the REST Query stage which will make a query to the first collection and add some content to the current inbound content of the second collection
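To make the lookup-and-merge idea concrete, here is a rough Python sketch of what the stage does conceptually. It is not the stage's configuration or Fusion's API: the Solr base URL, the collection name, and the field names are all hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust the base URL to wherever your Solr lives.
SOLR_BASE = "http://localhost:8983/solr"


def build_lookup_url(collection: str, key_field: str, key_value: str) -> str:
    """Build a Solr select URL that looks up one metadata document by key."""
    params = urlencode({"q": f'{key_field}:"{key_value}"', "rows": 1, "wt": "json"})
    return f"{SOLR_BASE}/{collection}/select?{params}"


def enrich(inbound: dict, metadata_doc: dict, fields: list) -> dict:
    """Copy the chosen metadata fields onto a copy of the inbound document."""
    enriched = dict(inbound)
    for name in fields:
        if name in metadata_doc:
            enriched[name] = metadata_doc[name]
    return enriched


# Hypothetical documents: "sku", "color", and "weight" are made-up field names.
inbound = {"id": "doc1", "sku": "A-100"}
metadata = {"sku": "A-100", "color": "red", "weight": "2kg"}

print(build_lookup_url("metadata", "sku", inbound["sku"]))
print(enrich(inbound, metadata, ["color", "weight"]))
```

In the real pipeline the query and the field mapping are configured in the REST Query stage rather than written by hand.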

Some detail:

LWS: Crawling the Web (the simple version)


So I will safely assume that you have read the post on crawling your local file system because you downloaded LWS and couldn’t contain yourself. You are now so beside yourself with excitement that you can’t wait to try crawling the web. The cat is beside me, and excited is not what I feel. Let’s start at …

LWS: Using SolrXML To Crassly Manipulate Solr


As I mentioned in the previous post: SolrXML isn’t just for sending documents into Solr. It is also for sending messages to Solr about things you would like it to do: add one or more documents, update one or more documents or delete one or more documents (there are a few other messages you can send, but why quibble over details).

Let’s look at these in turn.
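As a rough preview, the two basic message shapes can be built with nothing more than Python's standard library. This is a minimal sketch: the helper names and the sample field names are mine, and sending the resulting XML to Solr is left out entirely. An update is typically just another add for a document whose unique key already exists.

```python
import xml.etree.ElementTree as ET


def add_message(doc: dict) -> str:
    """Build an <add> message wrapping a single <doc> with one <field> per entry."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def delete_message(doc_id: str) -> str:
    """Build a <delete> message that removes one document by its id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = doc_id
    return ET.tostring(delete, encoding="unicode")


# Hypothetical document: "id" and "title" are sample field names.
print(add_message({"id": "doc1", "title": "Hello"}))
print(delete_message("doc1"))
```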

If you want to follow along then do the following …

LWS: How to Index SolrXML


[I will be using Linux 13.03 with LucidWorks Search 2.6.2.]
Following the continuing saga of doing the easy stuff we will look at another of the standard data sources available in LucidWorks Search: SolrXML.

At its most basic SolrXML is made up of a root element that tells Solr what to do with the incoming document(s). For this deeply moving episode the XML document will look something like: