tag:support.flying-sphinx.com,2011-01-05:/discussions/problems/2138-delayeddelta-doesnt-seem-to-be-updating-indexesFlying Sphinx: Discussion 2016-07-27T20:12:45Ztag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-26T15:15:24Z2016-07-26T15:15:24ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Erhan</p>
<p>Can you let me know what the app in question is? I’ll have
a look at how the delta index processing is going on the Flying
Sphinx side of things.</p>
<p>Also: how often are you processing all indices (via <code>heroku
run bundle exec flying-sphinx index</code> or <code>heroku run rake
fs:index</code>)? And it’s important to note that the delayed
delta jobs won’t set the delta flag back to true - this only
happens when all the indices are processed. That said, records
should certainly appear in results!</p>
<p>Thanks,</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-26T15:31:20Z2016-07-26T15:31:21ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Pat,</p>
<p>Thanks so much for your help.</p>
<p>My app is "fh7020".</p>
<p>I see the thinkingsphinx delayed_job record being created after
the record update, and I see it getting deleted as well. But,
updated record doesn't get returned in search until I do a rake
fs:index manually. And the "delta" field is set to true and stays
that way till I do a full rake fs:index as well.</p>
<p>Isn't delayeddelta supposed to re-index updated records
automatically through the delayed job? Am I still supposed to
schedule to index every so often? Maybe I am missing how delta
works?</p>
<p>Thank you so much for your help again. Looking forward to your
response.</p>
<p>-Erhan</p></div>Erhan Berbertag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T01:26:15Z2016-07-27T01:26:15ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Erhan,</p>
<p>Good to know the jobs are being created. From what I can see,
deltas are being processed in your app (you can check this from
your dashboard as well via <code>heroku addons:open
flying_sphinx</code>).</p>
<p>Deltas do re-index updated records automatically, but the
advantage of delta indices is that they only cover a small number
of records, and thus are super fast to process. The more changes
you make to records, the larger the delta indices grow, and thus
the slower they become to process. Processing <em>all indices</em>
will bring all the delta changes into the core indices, and return
the delta index to being empty - thus, keeping them small and
fast.</p>
<p>So, even with using delayed delta, you’ll still want to
run the command to process all indices regularly - whether that be
daily or hourly, depending on the amount of data you have. From
what I can see, your overall amount of data is not super large, so
running the task daily should be fine.</p>
<p>Now of course, this is not quite related to the issue
you’ve been seeing where updated records aren’t
matching search queries. If this issue is still happening, can you
run me through the index definition, the search query you’re
running, and the versions of the thinking-sphinx, ts-delayed-delta
and flying-sphinx gems you’re using?</p>
<p>Cheers,</p>
<p>Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T04:28:30Z2016-07-27T04:28:31ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Pat,</p>
<p>Thank you for your responsiveness as usual.</p>
<p>My first question: Did you really mean to say "do" or "don't" in
"Deltas do<br>
re-index updated records automatically" ?</p>
<p>To answer your question about search results not matching. My
issue is that<br>
the delta=true results are not being returned at all. This is
still<br>
happening.</p>
<p>My index definition:<br>
ThinkingSphinx::Index.define :user, :with => :active_record,
:delta =><br>
ThinkingSphinx::Deltas::DelayedDelta do<br>
indexes [first_name, last_name], :as => :name, :sortable =>
true indexes last_name, :sortable => true indexes email,
:sortable => true</p>
<p>set_property :enable_star => 1 set_property :min_infix_len
=> 1 end</p>
<p>In my user model I have a sphinx scope as:<br>
sphinx_scope(:by_name) { |name| {:conditions => {:name =>
"#{name}*"}} }</p>
<p>When I do a search like below it only returns the records where
delta != true<br>
User.by_name("wick")</p>
<p>My related gems are:<br>
gem 'mysql2', '~> 0.3.13'<br>
gem 'thinking-sphinx', '3.1.0'<br>
gem 'flying-sphinx', '1.2.0'<br>
gem 'ts-delayed-delta' (in gemfile.lock (2.0.2) is the version
of<br>
ts-delayed-delta)</p>
<p>Thank you so much again for your help. Looking forward to your
response.</p>
<p>-Erhan</p></div>Erhan Berbertag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T05:13:44Z2016-07-27T05:13:44ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Erhan,</p>
<p>Thanks for all of those details. ‘do’ is what I
meant - the delta processing DOES process the delta index
automatically if you’re modifying your model’s records
in ways that fire standard ActiveRecord callbacks - and given the
delta flag is automatically being set to true, and you’ve
mentioned the jobs are being added to the DJ queue, it sounds like
that side of things is working as expected.</p>
<p>Looking back over the past 24 hours or so, it seems the only two
delta requests that have made it through to Flying Sphinx are for
the Challenge model. I just manually ran a delta request for the
User model, and now there is a search result there matching
‘wick’. So, it seems the issue is possibly
model-specific and perhaps somewhere in the connection between
Delayed Job and the Flying Sphinx API. Are you seeing any
exceptions when the delta job is processed by Delayed Job?</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T07:40:10Z2016-07-27T07:40:11ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Pat,</p>
<p>Thank you for the great response.</p>
<p>From what I see in the logs the DJ entry seems to be created
normally after a<br>
user record update. Then the queue seems to be getting processed as
well, but<br>
it seems to me like somehow the updated record's index is removed
from sphinx<br>
core or maybe just omitted from search results somehow?</p>
<p>And isn't the delta flag supposed to get set back to false after
index is<br>
automatically rebuilt through DJ? Like I said before it gets stuck
at "true"<br>
until the next re-indexing.</p>
<p>You are right, the "Challenge" model seems to be working fine.
When I looked<br>
at the logs after a challenge update I see the following lines in
the logs<br>
during DJ processing that I am not seeing after a "User" update.
What could<br>
be causing this? I am doing a regular update through
activerecord.</p>
<p>2016-07-27T07:33:35.398784+00:00 app[worker.1]:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Process challenge_delta (id=1893364) COMPLETED
after 1.0858<br>
2016-07-27T07:33:35.398851+00:00 app[worker.1]:
2016-07-27T07:33:35+0000:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Process challenge_delta (id=1893364) COMPLETED
after 1.0858</p>
<p>And below is the logs right after a "User" update:</p>
<p>2016-07-27T07:26:38.039845+00:00 app[web.1]: SQL (0.7ms) UPDATE
"users" SET<br>
"perishable_token" = $1, "updated_at" = $2 WHERE "users"."id" = $3
[["perishable_token", "JSUTqclb4CsU7O9k1cbp"], ["updated_at",
"2016-07-27 07:26:38.037492"], ["id", 4]]<br>
2016-07-27T07:26:38.042310+00:00 app[web.1]: (1.7ms) COMMIT<br>
2016-07-27T07:26:38.044156+00:00 app[web.1]: (0.8ms) SELECT
COUNT(*) FROM<br>
"delayed_jobs" WHERE "delayed_jobs"."handler" = $1 AND
"delayed_jobs"."locked_at" IS NULL [["handler", "---
!ruby/object:ThinkingSphinx::Deltas::DelayedDelta::DeltaJob\nindex:
user_delta\n"]]<br>
2016-07-27T07:26:38.045707+00:00 app[web.1]: (0.6ms) BEGIN<br>
2016-07-27T07:26:38.047718+00:00 app[web.1]: SQL (0.7ms) INSERT
INTO<br>
"delayed_jobs" ("handler", "run_at", "created_at", "updated_at")
VALUES ($1, $2, $3, $4) RETURNING "id" [["handler", "---
!ruby/object:ThinkingSphinx::Deltas::DelayedDelta::FlagAsDeletedJob\nindex:
user_core\ndocument_id: 38\n"], ["run_at", "2016-07-27
07:26:38.045885"],<br>
["created_at", "2016-07-27 07:26:38.046141"], ["updated_at",
"2016-07-27 07:26:38.046141"]]<br>
2016-07-27T07:26:38.049633+00:00 app[web.1]: (1.5ms) COMMIT<br>
2016-07-27T07:26:38.051274+00:00 app[web.1]: Rendered
users/_user_row.haml<br>
(0.8ms) 2016-07-27T07:26:38.051423+00:00 app[web.1]: Rendered
users/update.js.erb<br>
(1.1ms) 2016-07-27T07:26:38.049187+00:00 app[worker.1]:<br>
Delayed::Backend::ActiveRecord::Job Load (1.7ms) UPDATE
"delayed_jobs" SET<br>
locked_at = '2016-07-27 07:26:38.046718', locked_by =<br>
'host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3' WHERE id IN
(SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at
<= '2016-07-27 07:26:38.046046' AND (locked_at IS NULL OR
locked_at < '2016-07-27<br>
03:26:38.046089') OR locked_by =
'host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9<br>
pid:3') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC
LIMIT 1 FOR<br>
UPDATE) RETURNING *<br>
2016-07-27T07:26:38.059470+00:00 app[web.1]: Completed 200 OK in
34ms (Views:<br>
1.5ms | ActiveRecord: 12.1ms)<br>
2016-07-27T07:26:38.057959+00:00 heroku[router]: at=info
method=POST<br>
path="/users/4" host=app.70-20.com<br>
request_id=8ff92224-bf32-43e3-9774-935054f6a3cf fwd="5.25.3.251"
dyno=web.1<br>
connect=1ms service=184ms status=200 bytes=1900<br>
2016-07-27T07:26:43.056718+00:00 app[worker.1]:<br>
Delayed::Backend::ActiveRecord::Job Load (5.3ms) UPDATE
"delayed_jobs" SET<br>
locked_at = '2016-07-27 07:26:43.050669', locked_by =<br>
'host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3' WHERE id IN
(SELECT "delayed_jobs"."id" FROM "delayed_jobs" WHERE ((run_at
<= '2016-07-27 07:26:43.050070' AND (locked_at IS NULL OR
locked_at < '2016-07-27<br>
03:26:43.050105') OR locked_by =
'host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9<br>
pid:3') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC
LIMIT 1 FOR<br>
UPDATE) RETURNING *<br>
2016-07-27T07:26:43.058427+00:00 app[worker.1]:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Mark 38 in user_core as deleted (id=1893345)
RUNNING<br>
2016-07-27T07:26:43.058514+00:00 app[worker.1]:
2016-07-27T07:26:43+0000:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Mark 38 in user_core as deleted (id=1893345)
RUNNING<br>
2016-07-27T07:26:43.066794+00:00 app[worker.1]: (1.1ms) BEGIN<br>
2016-07-27T07:26:43.071514+00:00 app[worker.1]: SQL (3.1ms) DELETE
FROM<br>
"delayed_jobs" WHERE "delayed_jobs"."id" = $1 [["id", 1893345]]
2016-07-27T07:26:43.075452+00:00 app[worker.1]: (3.0ms) COMMIT<br>
2016-07-27T07:26:43.076120+00:00 app[worker.1]:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Mark 38 in user_core as deleted (id=1893345)
COMPLETED after 0.0175<br>
2016-07-27T07:26:43.076204+00:00 app[worker.1]:
2016-07-27T07:26:43+0000:<br>
[Worker(host:8f8492bd-cf16-4c34-95ec-76a916ecc7f9 pid:3)] Job
Thinking Sphinx: Mark 38 in user_core as deleted (id=1893345)
COMPLETED after 0.0175</p>
<p>Eager to hear your thoughts. Thank you so much again.</p>
<p>-Erhan</p></div>Erhan Berbertag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T07:57:42Z2016-07-27T07:57:42ZDelayedDelta doesn't seem to be updating indexes<div><p>I’ve got to duck out for something, but just quickly: it
looks like the delta job for user_delta isn’t being run - or
even perhaps persisted? Are there any old, failing jobs?
ts-delayed-delta won’t double-up on delta requests if it
doesn’t have to, but it should be checking only active
jobs… still, worth confirming.</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T10:24:51Z2016-07-27T10:24:52ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Pat,</p>
<p>I think you got it. After I emptied the delayed_job queue, the
search started<br>
working just fine. Wow!</p>
<p>Thank you so much, Pat. Sorry to bug you with this, it has been
a great<br>
exercise for me though.</p>
<p>I have one final question, do you still suggest me to do a full
index every<br>
day or so? What would be the reason for that if so?</p>
<p>-Erhan</p></div>Erhan Berbertag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T14:06:47Z2016-07-27T14:06:47ZDelayedDelta doesn't seem to be updating indexes<div><p>Great to hear it’s working again :)</p>
<p>I guess there’s room for improvement in the
ts-delayed-delta gem… the check for existing jobs should be a
bit fussier.</p>
<p>I understand the need to run a full index regularly is a bit
counter-intuitive… I’ll try to explain with a broad
example. Let’s say you have one thousand users. One user
object changes, and so its delta flag is set to true, and the delta
index processes super quickly - it only has to deal with that
single record.</p>
<p>Then some time passes, and four more users have had changes,
with their delta flags set to true. Again, the delta index
processes each time, and it’s still pretty quick, as there
are only five records involved.</p>
<p>But over the course of a week, it could be that 200 users have
had updates to their data, which means 200 user records with delta
flags set to true. The delta indexing after each update is now a
fair bit slower - there’s a bunch more data being processed.
This means that search results may not be returning accurately
quite as quickly as you were hoping.</p>
<p>If you never run another full index, you could end up with
<em>all</em> users having a delta flag set to true, and then delta
processing will be much slower again, meaning accurate search
results take longer to be returned.</p>
<p>If you run a full index regularly, then the core index has the
latest changes available, all users have their delta flags set back
to false, and then the next update to a user means there’s
just a single delta record, and delta processing is super fast
again.</p>
<p>Does that help clarify things?</p>
<p>Also, just to explain why the delta flag remains true after a
delta processing… user A is edited, has their delta flag set,
and the changes to their data are processed in the delta index. If
the delta flag was then reset to false, and then User B is edited,
B’s latest changes will now be present in the delta index,
but A’s wouldn’t be, because the delta index only
processes records where delta is set to true. Thus, A’s data
would not be present at all in search results until a full
index.</p>
<p>Any further questions or issues, do let me know!</p>
<p>Cheers</p>
<p>— Pat</p></div>Pat Allantag:support.flying-sphinx.com,2011-01-05:Comment/404067872016-07-27T20:12:44Z2016-07-27T20:12:45ZDelayedDelta doesn't seem to be updating indexes<div><p>Hi Pat,</p>
<p>Thank you for the great explanations of both the need of the
full-indexing and<br>
the reason why the delta flag remains at true. It all makes sense
and really<br>
helped me understand these two important subjects.</p>
<p>I am all good now and really love using sphinx. Thank you for
your<br>
responsiveness and great help.</p>
<p>Stay well.</p>
<p>-Erhan</p></div>Erhan Berber