Ubuntu - Neo4j transaction log corruption

Doug
July 10, 2022

Neo4j 4.2.1 Community edition on Ubuntu Server 20.04

A database I administer is failing to start with this error:

"Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Error reading transaction logs, recovery not possible. To force the database to start anyway, you can specify 'unsupported.dbms.tx_log.fail_on_corrupted_log_files=false'. This will try to recover as much as possible and then truncate the corrupt part of the transaction log. Doing this means your database integrity might be compromised, please consider restoring from a consistent backup instead."

If I roll back to the server instance from yesterday the database runs fine, but it goes through a recovery step as follows:

2022-07-10 12:21:23.825+0000 INFO  [o.n.k.d.Database] [neo4j/2443e357] Recovery required from position LogPosition{logVersion=0, byteOffset=191545629}
2022-07-10 12:21:27.676+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   10% completed
2022-07-10 12:21:28.578+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   20% completed
2022-07-10 12:21:29.715+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   30% completed
2022-07-10 12:21:31.078+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   40% completed
2022-07-10 12:21:32.140+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   50% completed
2022-07-10 12:21:32.709+0000 INFO  [o.n.k.i.a.i.IndexingService] [neo4j/2443e357] IndexingService.init: indexes not specifically mentioned above are ONLINE
2022-07-10 12:21:37.360+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   60% completed
2022-07-10 12:21:39.550+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   70% completed
2022-07-10 12:21:40.971+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   80% completed
2022-07-10 12:21:42.104+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   90% completed
2022-07-10 12:21:43.128+0000 INFO  [o.n.k.r.Recovery] [neo4j/2443e357]   100% completed
2022-07-10 12:21:43.151+0000 INFO  [o.n.k.d.Database] [neo4j/2443e357] Recovery completed. 195143 transactions, first:98943, last:294085 recovered, time spent: 18s 577ms
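
The figures in that last log line can be cross-checked: if every transaction ID from first to last was replayed, the count should equal last - first + 1. Below is a minimal Python sketch of that check, assuming the line format shown in the excerpt above. The regular expression and the function name are illustrative and not part of Neo4j.

```python
import re

# Pattern for the "Recovery completed" line as it appears in the log excerpt.
# The regex and function name are illustrative assumptions, not a Neo4j API.
RECOVERY_RE = re.compile(
    r"Recovery completed\. (?P<count>\d+) transactions, "
    r"first:(?P<first>\d+), last:(?P<last>\d+) recovered"
)


def check_recovery_line(line):
    """Return (count, first, last, consistent) parsed from a recovery summary line."""
    match = RECOVERY_RE.search(line)
    if match is None:
        raise ValueError("not a 'Recovery completed' line")
    count = int(match["count"])
    first = int(match["first"])
    last = int(match["last"])
    # A gap-free replay of IDs first..last covers last - first + 1 transactions.
    return count, first, last, count == last - first + 1


line = (
    "2022-07-10 12:21:43.151+0000 INFO  [o.n.k.d.Database] [neo4j/2443e357] "
    "Recovery completed. 195143 transactions, first:98943, last:294085 recovered, "
    "time spent: 18s 577ms"
)
print(check_recovery_line(line))  # (195143, 98943, 294085, True)
```

For the line above the count matches the ID range (294085 - 98943 + 1 = 195143), so the reported range has no gap. That only confirms the summary is self-consistent; it says nothing about whether the log contents are intact.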

It clearly isn’t 100% OK, though, because when I try to take a backup with sudo neo4j-admin dump --database=neo4j --to=~/ I get the following error:

Active logical log detected, this might be a source of inconsistencies.
Please recover database before running the dump.
To perform recovery please start database and perform clean shutdown.

Starting and shutting it down makes no difference.

All the backups within our retention period have this problem.

We execute a script daily which performs a lot of deletes and inserts on the database. When I run this on the working instance and restart the database, the database fails to start and I get the error I first listed again.

So it seems that the corruption in the transaction logs has been lingering for some time and that running this batch of deletes and inserts "pushes it over the edge", making it fail. Incidentally, this script has been running daily for 2 years now without any issues, so I’m sure it’s not the script itself causing problems.

I tried setting dbms.tx_log.rotation.retention_policy=keep_none before running the script and that made no difference, although the failed start error becomes:

Caused by: java.lang.RuntimeException: org.neo4j.exceptions.UnderlyingStorageException: No check point found in any log file from version 1 to 2
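
The "version 1 to 2" in that message refers to transaction log file versions. Neo4j names these files neostore.transaction.db.<version>, so listing what is actually on disk shows which versions exist and how large they are. The following is a read-only Python sketch; the function name is hypothetical, and the directory in the usage comment is only a guess at a typical Debian package layout that may not match this server.

```python
import re
from pathlib import Path

# Transaction log files are named neostore.transaction.db.<version>.
LOG_NAME_RE = re.compile(r"^neostore\.transaction\.db\.(\d+)$")


def list_tx_log_versions(log_dir):
    """Return sorted (version, size_in_bytes) pairs for transaction log files in log_dir.

    Returns an empty list if log_dir does not exist. Nothing is modified.
    """
    directory = Path(log_dir)
    if not directory.is_dir():
        return []
    found = []
    for path in directory.iterdir():
        match = LOG_NAME_RE.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path.stat().st_size))
    return sorted(found)


# Usage (the path is an assumption; substitute the real transaction log directory):
# for version, size in list_tx_log_versions("/var/lib/neo4j/data/transactions/neo4j"):
#     print(f"version {version}: {size} bytes")
```

Comparing the versions found with the range named in the error may help confirm whether older log files were pruned by the keep_none policy.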

I also tried deleting the transaction log files as a desperate measure. That just broke things as expected.

I’m running community edition and my backups are EC2 server instances, so I don’t believe that I need the transaction logging feature.

How can I fix or remove the transaction logs please? Thank you.

Tags: neo4j

Answers

- RoyAwill
- July 10, 2022 at 4:55 pm
- 0 votes
Old transaction logs cannot be safely archived or removed, so you could instead use dbms.directories.transaction.logs.root to change the root location where Neo4j stores transaction logs.

Or, if the problem turns out to be memory-related, you can use dbms.tx_log.rotation.size to set the file size at which the logical log auto-rotates.
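
Before changing any of these, it can help to see which transaction log settings are currently active in neo4j.conf. Here is a small read-only Python sketch, assuming the file uses plain key=value lines with # for comments. The function name and the sample contents are hypothetical; the setting names are the ones mentioned in this thread.

```python
# Setting names taken from this thread; the helper itself is illustrative only.
TX_LOG_KEYS = (
    "dbms.tx_log.rotation.retention_policy",
    "dbms.tx_log.rotation.size",
    "dbms.directories.transaction.logs.root",
    "unsupported.dbms.tx_log.fail_on_corrupted_log_files",
)


def find_tx_log_settings(conf_text):
    """Return (line_number, key, value) for each uncommented transaction log setting."""
    results = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key in TX_LOG_KEYS:
            results.append((number, key, value.strip()))
    return results


# Made-up sample contents, not the poster's actual configuration.
sample = """\
# dbms.tx_log.rotation.size=250M
dbms.tx_log.rotation.retention_policy=keep_none
dbms.memory.heap.max_size=2G
"""
print(find_tx_log_settings(sample))
# [(2, 'dbms.tx_log.rotation.retention_policy', 'keep_none')]
```

To inspect a real file, pass its text in, for example find_tx_log_settings(open(path).read()), where path is wherever neo4j.conf lives on the server.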


- anonymous_user
- January 15, 2023 at 8:15 pm
- 0 votes
I had the same issue on my production database; only neo4j-admin copy helped me.
