30th Sept. 2022

Database corruption issue [MySQL]

Yesterday, Screener.in stopped working in the evening. I logged in via remote shell, ran `top` and saw mysql using all the memory.

Next I did `tail -f /var/log/mysql/error.log` and saw this frightful error:

10:20:39 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0xffed4075d9f0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = ffff804c1748 thread_stack 0x100000

Nothing else after that.
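Since `tail -f` only shows the end of the log, a small filter can list every crash entry with its line number. This is a minimal sketch: the function name `list_crashes` is hypothetical, and it assumes crash entries contain the text `mysqld got signal`, as in the excerpt above.

```shell
#!/bin/sh
# list_crashes LOGFILE: print each crash line, prefixed with its line number.
# Assumption: crash entries contain "mysqld got signal", as in the excerpt above.
list_crashes() {
    grep -n 'mysqld got signal' "$1"
}

# Example (log path taken from this post; may need sudo to read):
# list_crashes /var/log/mysql/error.log
```

The line numbers make it easier to jump to each crash in an editor and read the entries just before it.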

It was time to panic! This was the first time I had faced a MySQL corruption issue.

Luckily, the issue was resolved with `mysqlcheck --all-databases` and `mysqlcheck -o db_name corrupted_table_name`. This StackOverflow answer helped.
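With many tables, the `mysqlcheck` report is mostly `OK` lines, and the problem tables are easy to miss. The sketch below hides the healthy rows. The function name `show_problems` is hypothetical, and it assumes the usual report layout where each healthy table is printed on one line ending in `OK`.

```shell
#!/bin/sh
# show_problems: read a mysqlcheck report on stdin and drop lines ending in "OK".
# Assumption: healthy tables are reported as "db.table ... OK" on a single line.
# Note: grep exits with status 1 when nothing is left to print (all tables OK).
show_problems() {
    grep -v 'OK[[:space:]]*$'
}

# Example:
# sudo mysqlcheck --all-databases | show_problems
```

Whatever remains is a table name followed by its warning or error lines, which gives the `db_name` and table name to pass to `mysqlcheck -o`.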

This is the checklist we created for the next time:

# copy error log
scp -C screener_prod:/var/log/mysql/error.log ./

# put site on maintenance
# also disable crons
read deploy.sh

# clear mailq for error emails
sudo postsuper -d ALL

# reboot system
sudo reboot

# create a back-up
from AWS account

# get exact time of incident
# check error log
vi /var/log/syslog

# run mysqlcheck
sudo mysqlcheck --all-databases

# run the above command again (and again) if it shows "MySQL server has gone away"
sudo mysqlcheck --all-databases

# hope the errors are only in indexes
sudo mysqlcheck -o db_name tbl_name

# the errors during mysqlcheck are again logged in error log
scp -C screener_prod:/var/log/mysql/error.log ./
# analyse the above
