Pat Allan on 03 Mar, 2018 12:22 AM
I’ve just been looking into this. Regarding the first issue: the integer out of range error is often due to timestamps rather than primary keys - Sphinx stores timestamps as 32-bit integers (based on UNIX timestamps), so anything that is beyond the year 2106 is going to cause problems. The only timestamp in your issue index is remote_created_at - do you allow human-entered values for that column? And are there any values that are 80+ years in the future? I'm guessing there might be some typos that you'll need to manually fix.
Regarding the second issue… it looks like you’re currently over the plan limit for the amount of data you’re dealing with, and are running out of disk space. As a temporary change, removing the :infixes option from the name field or increasing the :min_infixes_len setting should help reduce the amount of disk space required.
Also, within your tracker index, you may be able to improve the indexing speed, by removing the following attribute:
has '(select count(*) from issues where tracker_id=trackers.id)', as: :issue_count, type: :integer
And replacing it with these two lines:
has “COUNT(issues.id)”, as: :issue_count, type: :integer
Certainly, test that change locally first :)
Let me know how you go with all of this - certainly keen to get it all working with you!
Pat Allan on 09 Mar, 2018 01:57 PM
It seems the primary problem is disk space. So, I’ve temporarily shifted you over to a new server, processed the indices, and they’re *still* using all of the server’s disk space… (which is around 20GB - far more than the 5GB allowed by the plan you’re on).
We’re going to need to find a better long-term solution!
There’s two parts to this. First off: it’s looking like you’re currently using even more data than the next plan level (15GB for $300/month on a dedicated machine). These dedicated machines are not generally available (hence the plan’s not listed publicly), but I can fire one up for you if you definitely want to go down that path. If you feel the data’s only going to grow further, then I can consider what the options could be for a larger server and how much that might cost.
To get your disk usage under control at least for the short term, I would recommend you remove the infix setting from your Tracker index. This should reduce the size of that index significantly (and it seems it’s the largest), though it will mean partial/wildcard matches will not work.
Something that may also help, though likely not dramatically, is upgrading Sphinx. You’re using 2.1.4, and I recommend upgrading to 2.2.11 - this is controlled in your `config/thinking_sphinx.yml` file, but changing will require running the rebuild task, and that means Sphinx will be down for the full time it takes the indices to be reprocessed (which, going by the most recent run, is at least 4 hours). Also, if the indexing fails (like it has been recently!) then the Sphinx daemon may not be able to boot. So maybe this comes after getting the indices processing correctly on 2.1.4 first before considering such a change.
It does also seem like there might still be some invalid dates, and I’ll look into that over the coming days.
If any of this doesn’t make sense, or if you have further questions, do let me know.
Pat Allan on 11 Mar, 2018 12:18 AM
I've spent some time over the past day trying to figure out that integer-out-of-range issue, and not having much luck. The date values are indeed fine, and I don't see anything obvious about how the other attribute values could cause such a problem (all the integers are clearly less than what can be held within a 32-bit integer).
However, one thing that may help improve the processing speed for the Issue index: I realise you've got the languages association in Issue set up to go through tracker, but in the index it's better to be more explicit. The change:
has languages(:id), :as => :language_ids
has tracker.languages.id, :as => :language_ids
The reason for this is to ensure TS generates just one join to the trackers table (it's currently generating two). This may be fixed in later versions of TS (you're using quite an old version!), but this workaround will be more reliable.
Also, I've removed the symbol for :id - that was only needed for TS v2 and older (but the behaviour should remain the same, so it's not a big deal, and certainly not related to the problems at hand).