Lost Connections

forrest

03 Feb, 2017 01:49 PM

I'm in the middle of the second outage I've seen in the past 8 hours. The errors coming in are:
ERROR MESSAGE:
ThinkingSphinx::SphinxError: Lost connection to MySQL server at 'reading authorization packet', system error: 0 - SELECT * FROM `matter_core`, `matter_delta` WHERE `user_id` = 686517 AND `sphinx_deleted` = 0 ORDER BY `updated_at` DESC LIMIT 0, 20; SHOW META

I'm able to run `rake fs:index`, but that doesn't resolve the issue. Your status page is reporting that our server, rusina, has no issues.

  1. Support Staff 1 Posted by Pat Allan on 03 Feb, 2017 01:55 PM

    Hi Forrest,

    There aren't any known issues with rusina, nor with FS generally, so this is certainly surprising. I'll have a hunt around on the server right now to see if there are any clues there.

    Are any searches working at the moment? Is the problem intermittent, or is everything failing consistently?

    Regards,

    Pat

  2. 2 Posted by forrest on 03 Feb, 2017 02:00 PM

    I haven't been able to get a successful query through since opening the ticket, but it appears to be intermittent. The first error looks to have been ~1.5 hrs ago, but the error count doesn't _look_ high enough for it to be failing on every request.

  3. 3 Posted by forrest on 03 Feb, 2017 02:03 PM

  4. 4 Posted by forrest on 03 Feb, 2017 02:05 PM

    New error: ThinkingSphinx::ConnectionError: Error connecting to Sphinx via the MySQL protocol. Error connecting to Sphinx via the MySQL protocol. Can't connect to MySQL server on 'ec2-54-242-154-181.compute-1.amazonaws.com' (111) - SELECT * FROM `matter_core`, `matter_delta` WHERE `sphinx_deleted` = 0 ORDER BY `updated_at` DESC LIMIT 0, 20; SHOW META

  5. Support Staff 5 Posted by Pat Allan on 03 Feb, 2017 02:11 PM

    The Sphinx proxy seems to have suddenly developed an issue on the primary Rusina server (restarting the proxy didn't help, though that restart was the cause of the second error).

    I've just switched over to the failover Rusina server. Can you confirm whether that's working as expected now?

  6. 6 Posted by forrest on 03 Feb, 2017 02:12 PM

    Yes, it's back - thanks for the prompt response!

  7. Support Staff 7 Posted by Pat Allan on 03 Feb, 2017 02:59 PM

    Just further to this: the Sphinx proxy is now working again on what was the primary server (which is now the new failover). I'm considering this issue resolved, but if you see it return, do let me know!

  8. 8 Posted by forrest on 03 Feb, 2017 04:25 PM

    Thanks, Pat. Will you be able to add better monitoring for these types of issues going forward, or should I assume that a ticket will need to be opened?

    I disregarded the first round of errors I saw (at 4:30am my time) because Sphinx will _usually_ recover on its own.

  9. Support Staff 9 Posted by Pat Allan on 04 Feb, 2017 01:35 AM

    Hi Forrest,

    I’ve just rolled out some better error monitoring for the Sphinx proxy, so I should definitely have visibility on this issue if/when it crops up again.

    Cheers,

    Pat

  10. forrest closed this discussion on 11 May, 2017 11:09 PM.
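
For the intermittent connection errors described earlier in the thread, one client-side option is to retry a search a few times with backoff before surfacing the failure. The following is only a minimal sketch under stated assumptions, not something taken from the discussion: `with_search_retries` and `TransientSearchError` are hypothetical names, and in a real app you would likely pass `ThinkingSphinx::ConnectionError` and `ThinkingSphinx::SphinxError` (the classes seen in the errors above) instead of the stand-in. A retry like this would not have fixed the proxy outage itself, which needed the failover.

```ruby
# Hypothetical stand-in so the sketch runs without the thinking-sphinx gem.
# In an application, pass the real error classes to `errors:` instead.
class TransientSearchError < StandardError; end

# Runs the block, retrying when one of the given error classes is raised.
# Sleeps base_delay, then 2x, 4x, ... between attempts, and re-raises the
# last error once `attempts` tries have been used up.
# `sleeper` is injectable so the delay can be skipped in tests.
def with_search_retries(errors:, attempts: 3, base_delay: 0.1,
                        sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield
  rescue *errors
    raise if tries >= attempts
    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage: a fake search that fails twice, then succeeds on the third try.
calls = 0
result = with_search_retries(errors: [TransientSearchError]) do
  calls += 1
  raise TransientSearchError, "Lost connection to MySQL server" if calls < 3
  :search_results
end
puts "succeeded after #{calls} attempts: #{result.inspect}"
```

Keeping `attempts` small matters here: a long retry loop in a web request would hide a real outage and hold up the response, so errors that persist past the last attempt should still reach whatever error tracking is in place.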
