tag:support.flying-sphinx.com,2011-01-05:/discussions/questions/1522-are-our-wordform-and-stopword-files-being-usedFlying Sphinx: Discussion 2021-03-13T07:28:10Ztag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-20T22:08:18Z2020-10-20T22:08:18ZAre our wordform and stopword files being used?<div><p>We talked before about how it looked like our wordform or stopword files were not being found in production. We just deployed significant changes to these that we'd like to be referenced. Can you confirm if these are or are not found in production properly?</p>
<p>Thanks!</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-21T12:22:39Z2020-10-21T12:22:39ZAre our wordform and stopword files being used?<div><p>Hi Sean,</p>
<p>Sorry for not responding sooner. Just looking into this now, and it seems neither the wordform nor the stopword files are coming through with the configuration. Are you able to share the code you’re using to generate the configuration, as per what’s covered here: <a href="https://github.com/flying-sphinx/flying-sphinx-js#configuration">https://github.com/flying-sphinx/flying-sphinx-js#configuration</a></p>
<p>Cheers,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-21T21:10:23Z2020-10-21T21:10:25ZAre our wordform and stopword files being used?<div><p>No worries.</p>
<p>I tried updating the configuration to this. Can you check again and let us know if this is working?</p>
<pre>
<code>let configuration = flyingSphinx.configuration();
configuration.process('rebuild', function(configurer) {
  configurer.addEngine('sphinx');
  configurer.addVersion('2.2.11');

  let fullSphinxConfig = fs.readFileSync(__dirname + '/../src/index-search/sphinx.conf');
  configurer.addConfiguration(fullSphinxConfig);

  let wordforms = fs.readFileSync(__dirname + '/../src/index-search/wordforms.txt');
  configurer.addSettingFile('wordforms', 'wordforms.txt', wordforms);

  let stopwords = fs.readFileSync(__dirname + '/../src/index-search/stopwords.txt');
  configurer.addSettingFile('stopwords', 'stopwords.txt', stopwords);
});</code>
</pre>
<p>I do see this warning in our index command output:</p>
<blockquote>
<p>WARNING: stopwords: failed to get file size for '/mnt/local/flying-sphinx/552dde8b36b080ec0/stopwords/stopwords.txt'</p>
</blockquote></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-22T00:39:56Z2020-10-22T00:39:56ZAre our wordform and stopword files being used?<div><p>Hi Sean,</p>
<p>The uploaded configuration is still lacking those two additional files, but what you’re doing seems correct. 🤔</p>
<p>Two things to note, just to confirm we’re on the right track:</p>
<p>The first argument for the process call is the underlying command that gets invoked - so, in the code you’ve shared, it’s ‘rebuild’. However, I’m not seeing a rebuild command come through (rather, separate stop/start/index/configure commands). Are you invoking the code below as a replacement for the built-in flying-sphinx commands?<br>
And you may want to switch from ‘rebuild’ to ‘configure’ - as that way, it is <em>just</em> updating the configuration, rather than stopping the daemon, reconfiguring, indexing, and then starting the daemon again.</p>
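<p>As a rough sketch of the switch suggested above, the snippet from earlier in the thread could be wrapped in a function that defaults to ‘configure’. The function name <code>uploadConfiguration</code> and its <code>client</code> and <code>sourceDir</code> parameters are hypothetical: <code>client</code> stands in for whatever the flying-sphinx package exports, and the configurer calls are simply the ones already used in the code shared above.</p>

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical wrapper around the snippet shared earlier in this thread.
// `client` is assumed to be the object exported by the flying-sphinx package;
// it is passed in so this sketch does not assume how the package is required.
function uploadConfiguration(client, sourceDir, command = 'configure') {
  const configuration = client.configuration();

  // 'configure' only uploads the configuration; 'rebuild' would also stop the
  // daemon, reindex, and start it again.
  configuration.process(command, function (configurer) {
    configurer.addEngine('sphinx');
    configurer.addVersion('2.2.11');

    configurer.addConfiguration(
      fs.readFileSync(path.join(sourceDir, 'sphinx.conf'))
    );
    configurer.addSettingFile(
      'wordforms', 'wordforms.txt',
      fs.readFileSync(path.join(sourceDir, 'wordforms.txt'))
    );
    configurer.addSettingFile(
      'stopwords', 'stopwords.txt',
      fs.readFileSync(path.join(sourceDir, 'stopwords.txt'))
    );
  });
}
```

<p>Passing <code>'rebuild'</code> as the third argument would reproduce the original behaviour, so the same function can serve both cases.</p>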
<p>I’m going to review the underlying flying-sphinx-js code to confirm it’s behaving as we’re hoping as well!</p>
<p>Cheers,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-22T15:20:39Z2020-10-22T15:20:39ZAre our wordform and stopword files being used?<div><p>Ah, you are right. This code was not executed.</p>
<p>I updated it to use "configure", ran it, then rebuilt the index.</p>
<p>Judging from some test searches in production, I think it still didn't find our wordforms.</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-22T16:03:24Z2020-10-22T16:03:26ZAre our wordform and stopword files being used?<div><p>I did some more searching. It seems like we'll sometimes get results that imply the wordforms are working, but not always. We even added a unique term that maps to a term that for sure shows up in some of our documents, reindexed, searched for the unique term, and found nothing.</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-22T22:29:44Z2020-10-22T22:29:44ZAre our wordform and stopword files being used?<div><p>I can confirm that the new configuration archive being sent isn’t making it through to the server. The tar.gz file being generated (which includes the Sphinx configuration file alongside the stopwords/wordforms files) is somehow invalid - the Ruby code on my servers can’t read it, and nor can my Mac. So, sounds like there’s a bug in flying-sphinx-js I need to fix (or the underlying libraries it’s depending on? 🤔). I will let you know when I’ve a new version of the library ready!</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-23T07:30:31Z2020-10-23T07:30:32ZAre our wordform and stopword files being used?<div><p>ok, thanks!</p>
<p>Let me know if there's anything we can do to help debug the issue.</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-23T11:52:21Z2020-10-23T11:52:21ZAre our wordform and stopword files being used?<div><p>I’ve just published v1.1.0 of the flying-sphinx package - I’ve switched out one arching library for another, and tested a script very similar to yours for uploading configuration. So, if you can update to this new release and give it a spin, that’d be great!</p>
<p>Cheers,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-23T11:53:07Z2020-10-23T11:53:07ZAre our wordform and stopword files being used?<div><p>… archiving, not arching. 🤷🏻‍♂️</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-23T15:29:10Z2020-10-23T15:29:10ZAre our wordform and stopword files being used?<div><p>Thanks!</p>
<p>I updated our package and re-ran our config/index. The latest index output doesn't complain about stopwords file size anymore, but nothing says whether or not it found stopwords or wordforms.</p>
<p>Some test searches make it seem like the wordforms are not being used. Can you confirm?</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-24T01:07:27Z2020-10-24T01:07:27ZAre our wordform and stopword files being used?<div><p>Hi Sean,</p>
<p>I can confirm that both files are coming through and being included in the configuration file correctly.</p>
<p>As to whether they’re working: if there are particular queries that you’re finding aren’t returning the right records, let me know, but I just did some very simple tests:</p>
<p>Using the first wordform, “clinical abstractor > clinical_abstractor” - I ran queries on both of those as search terms, and they return the exact same results. Sphinx’s keyword information in the query response also suggests it’s functioning correctly, as it returns the keyword “clinical_abstractor” even when I search for that as separate words (clinical abstractor).<br>
For the stopwords, I searched for ‘hour’ (which is the fifth line in the stopwords file), and no results were returned, which is what I’d expect.</p>
<p>So I feel like the files are having the appropriate impact - but yeah, if there’s something that doesn’t look right to you, do let me know.</p>
<p>Cheers,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-24T17:28:39Z2020-10-24T17:28:39ZAre our wordform and stopword files being used?<div><p>We do see some results that make it look like things are working.</p>
<p>However, we put in a unique term to try to make sure. Perhaps we did this improperly, though. You can see a mapping where we added <code>poopsmith</code>:</p>
<pre>
<code>plumber, master plumber, service plumber, poopsmith => plumber</code>
</pre>
<p>But searching for that returns nothing.</p>
<p>If this is an issue with how we're using sphinx and not how you are hosting it, then we're happy to figure it out on our own. If you have any additional guidance, we appreciate it.</p>
<p>Thanks for the help!</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-25T01:45:39Z2020-10-25T01:45:39ZAre our wordform and stopword files being used?<div><p>Hi Sean,</p>
<p>I’m just reading through the docs, and I’m not sure if comma-separated terms are something that Sphinx expects in word forms files?<br>
<a href="http://sphinxsearch.com/docs/current.html#conf-wordforms">http://sphinxsearch.com/docs/current.html#conf-wordforms</a><br>
So, you might need to split this example into a few lines instead.</p>
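<p>To illustrate the split suggested above, here is a minimal sketch that expands a comma-separated mapping into one mapping per line. The helper names <code>expandWordformLine</code> and <code>expandWordforms</code> are hypothetical, and the sketch assumes every comma in a line separates alternative source terms; check the output against the Sphinx wordforms documentation before relying on it.</p>

```javascript
// Hypothetical helper: turns "a, b, c => target" into separate
// "a > target", "b > target", "c > target" mappings.
function expandWordformLine(line) {
  const arrow = line.indexOf('>');
  if (arrow === -1) {
    return [line]; // not a mapping (blank line, comment, ...): leave untouched
  }

  const target = line.slice(arrow + 1).trim();
  const sources = line
    .slice(0, arrow)
    .replace(/=\s*$/, '') // drop the "=" when the separator was written "=>"
    .split(',')
    .map((term) => term.trim())
    // A word mapped to itself adds nothing, so it is skipped here.
    .filter((term) => term !== '' && term !== target);

  return sources.map((term) => `${term} > ${target}`);
}

// Applies the expansion to a whole wordforms file held in a string.
function expandWordforms(text) {
  return text.split('\n').flatMap(expandWordformLine).join('\n');
}

console.log(
  expandWordforms('plumber, master plumber, service plumber, poopsmith => plumber')
);
```

<p>Each source term ends up on its own line, which is the shape the examples in the Sphinx docs use.</p>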
<p>I don’t think this is related to Flying Sphinx, but certainly happy to help provide suggestions for debugging the issue anyway, if I can think of anything!</p>
<p>Cheers,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-26T18:13:28Z2020-10-26T18:13:31ZAre our wordform and stopword files being used?<div><p>The docs weren't clear. I tried changing the format to match and it worked!</p>
<p>Thanks for your help!</p></div>Sean Massatag:support.flying-sphinx.com,2011-01-05:Comment/487517602020-10-26T23:40:23Z2020-10-26T23:40:23ZAre our wordform and stopword files being used?<div><p>Great to hear it’s working! :)</p></div>Pat Allan