wordforms works, but only unidirectionally

Pieter

06 Aug, 2011 05:00 AM

My understanding is that wordforms works like this. If I include this line in wordforms.txt:

yellow > gold

"yellow" should appear when searching for "gold", and "gold" should appear when searching for "yellow". At the moment, "yellow" is matched when you search for "gold", but not the other way around.

Is there a setting somewhere to modify the default?

  1. Posted by Pat Allan (Support Staff) on 06 Aug, 2011 08:49 AM

    Hi Pieter

    I'm not sure if it works both ways - it's not entirely clear in the Sphinx documentation, and I've not used wordforms myself. One thing to double-check: you need to reindex and restart Sphinx whenever you change the wordforms file. Running heroku rake fs:index fs:restart should do that job (as would fs:rebuild, though that will bring Sphinx down while indexing happens).

    Can you confirm whether it works as expected after restarting and reindexing?

  2. Posted by Pieter on 06 Aug, 2011 08:54 AM

    The server was restarted and it made no difference. The only (unofficial) discussions I've found suggest that it is not direction-specific, but it seems to be in our case.

  3. Posted by Pat Allan (Support Staff) on 07 Aug, 2011 03:17 AM

    Just following up on this - I've done some testing, and you're right: it should work in both directions (and does in my test). Are 'gold' and 'yellow' the failing example from your file? Or is it something else?

    The reason I ask is that wordforms are applied after tokenising via the charset table (which defaults to the UTF-8 set - notably, it converts upper-case to lower-case, and probably strips out punctuation as well). So you'll want to make sure your wordforms file only has words in their tokenised form (lowercase, no punctuation).
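The check Pat describes (lowercase, no punctuation) could be automated with a small script. This is only a sketch: the function name `find_suspect_lines` is made up for illustration, it assumes every entry has the `source > destination` form shown in the question, and its idea of "tokenised form" is a rough ASCII approximation rather than Sphinx's actual charset table, so non-ASCII letters would be flagged as well.

```python
import re

# Anything other than lowercase ASCII letters, digits, underscores and
# whitespace is treated as suspect. This is an assumed approximation,
# not Sphinx's real charset table.
SUSPECT = re.compile(r"[^a-z0-9_\s]")


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for entries that may not be tokenised."""
    suspects = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        # Check each side of the mapping separately so the '>' separator
        # itself is not reported as punctuation.
        source, _, destination = text.partition(">")
        if SUSPECT.search(source) or SUSPECT.search(destination):
            suspects.append((number, text))
    return suspects


if __name__ == "__main__":
    sample = ["yellow > gold", "Yellow > gold", "jeweller's > jeweler"]
    for number, text in find_suspect_lines(sample):
        print(f"line {number}: {text}")
```

Running it over the lines of wordforms.txt would list the entries worth a second look. It only reports them and does not rewrite anything, since whether punctuation is dropped or treated as a word separator depends on the charset table in use.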

  4. Posted by Pieter on 07 Aug, 2011 04:03 AM

    There were a few apostrophes in there. Got rid of them and still no change.
    Converted everything to lowercase. No change.
    For now I've duplicated each term, specifying it in each direction, and that works. Not a big deal.
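The workaround Pieter describes, listing every pair in both directions, could be generated rather than typed by hand. The sketch below assumes each entry is a single `source > destination` line; `add_reverse_mappings` is a hypothetical helper name, and nothing here is part of Sphinx or Flying Sphinx.

```python
def add_reverse_mappings(lines):
    """Return the mappings with the reverse of each pair added, without duplicates."""
    seen = set()
    result = []
    for raw in lines:
        source, sep, destination = (part.strip() for part in raw.partition(">"))
        if not sep or not source or not destination:
            continue  # skip blank or malformed lines
        # Emit the pair as written, then its reverse, keeping first-seen order.
        for pair in ((source, destination), (destination, source)):
            if pair not in seen:
                seen.add(pair)
                result.append(f"{pair[0]} > {pair[1]}")
    return result


if __name__ == "__main__":
    for line in add_reverse_mappings(["yellow > gold"]):
        print(line)
```

Because pairs that are already present are skipped, running it on a file that has been duplicated by hand leaves that file's mappings unchanged.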

  5. Pat Allan closed this discussion on 07 Aug, 2011 12:06 PM.
