I need to get the initial state as well as the latest state from a MySQL database. This is over two tables:
customer
id | name | surname | dob | email | telephone |
---|---|---|---|---|---|
10 | Steve | Bobbly | 01-01-1970 | [email protected] | 0123456789 |
15 | James | Bond | 01-01-1950 | [email protected] | 0101010999 |
audit_log
id | entity_id | property | old_value | new_value |
---|---|---|---|---|
1 | 10 | name | John | Steve |
2 | 10 | email | [email protected] | [email protected] |
3 | 10 | telephone | | 0123456789 |
What I expect is output like this:
id | name | surname | dob | email | telephone |
---|---|---|---|---|---|
10 | Steve | Bobbly | 01-01-1970 | [email protected] | 0123456789 |
10_1 | John | Bobbly | 01-01-1970 | [email protected] | |
15 | James | Bond | 01-01-1950 | [email protected] | 0101010999 |
I initially had a PHP script that runs through all the `customer` rows, matches them to the `audit_log` rows, and generates output from there, but it is extremely slow and resource intensive.
Would something like this be possible directly in MySQL, and how would I do it?
EDIT
I’ve added additional rows to the `customer` and output tables. The output table needs to contain all rows in `customer`, as well as a copy of the initial row, built from `audit_log`.
4 Answers
Assuming there’s at most one update for each `entity`–`property` pair (that is, per `property` for each `entity_id` value), you can use this SQL SELECT statement, in which the changed properties recorded in the `audit_log` table are unpivoted into `old` vs. `new` values, and the other (unchanged) ones are taken from the `customer` table.
Demo
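
One possible shape for such a query, as a minimal sketch rather than the answerer's own statement. It pivots the audit rows with conditional aggregation, and it assumes at most one `audit_log` row per `entity_id`/`property` pair, that `audit_log.property` holds a `customer` column name, and that the fifth `customer` column is called `email`. The derived-table alias `a` and its `*_changed` / `old_*` columns are names made up for illustration.

```sql
-- Sketch only; see the assumptions stated above.
SELECT CAST(c.id AS CHAR) AS id,
       c.name, c.surname, c.dob, c.email, c.telephone
FROM customer AS c
UNION ALL
SELECT CONCAT(c.id, '_1'),
       -- take the old value where the property changed, else the current one
       IF(a.name_changed,      a.old_name,      c.name),
       IF(a.surname_changed,   a.old_surname,   c.surname),
       IF(a.dob_changed,       a.old_dob,       c.dob),
       IF(a.email_changed,     a.old_email,     c.email),
       IF(a.telephone_changed, a.old_telephone, c.telephone)
FROM customer AS c
JOIN (
    -- pivot: one row per customer, with a flag and an old value per property
    SELECT entity_id,
           MAX(property = 'name')      AS name_changed,
           MAX(CASE WHEN property = 'name'      THEN old_value END) AS old_name,
           MAX(property = 'surname')   AS surname_changed,
           MAX(CASE WHEN property = 'surname'   THEN old_value END) AS old_surname,
           MAX(property = 'dob')       AS dob_changed,
           MAX(CASE WHEN property = 'dob'       THEN old_value END) AS old_dob,
           MAX(property = 'email')     AS email_changed,
           MAX(CASE WHEN property = 'email'     THEN old_value END) AS old_email,
           MAX(property = 'telephone') AS telephone_changed,
           MAX(CASE WHEN property = 'telephone' THEN old_value END) AS old_telephone
    FROM audit_log
    GROUP BY entity_id
) AS a ON a.entity_id = c.id;
```

The separate `*_changed` flags are there so that an old value that was empty or NULL (such as the telephone in the sample data) is kept as-is instead of being replaced by the current value. Customers with no audit rows, such as id 15, get no `_1` row because of the inner join.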
Nice question! 🙂 Here is one way if you don’t mind some repetition:
Demo (shamelessly borrowing from the one set up by Barbaros Özhan)
Note: The above will include the `_1` entries whenever there are audit entries. But just the presence of audit entries doesn’t guarantee anything has changed – e.g. if the surname was changed from "Smith" to "Jones" and then back to "Smith". If this is important to you I may be able to modify it, at the expense of more complexity.
Try the following:
See a demo.
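
A minimal sketch of the technique this answer describes, assuming MySQL 8.0 or later, where `ROW_NUMBER()` is available directly and does not have to be simulated. It is not necessarily the answerer's query: the CTE names `first_change` and `initial_values` are made up for illustration, the alias `T` follows the answer's wording, and the code assumes that `audit_log.property` holds `customer` column names, including a column named `email`.

```sql
-- Sketch only, assuming MySQL 8.0+ (CTEs and window functions).
WITH first_change AS (
    -- number each property's audit rows per customer, oldest first
    SELECT entity_id, property, old_value,
           ROW_NUMBER() OVER (PARTITION BY entity_id, property
                              ORDER BY id) AS rownum
    FROM audit_log
),
initial_values AS (
    -- keep only the first change and pivot it to one row per customer
    SELECT entity_id,
           MAX(CASE WHEN property = 'name'      THEN old_value END) AS name,
           MAX(CASE WHEN property = 'surname'   THEN old_value END) AS surname,
           MAX(CASE WHEN property = 'dob'       THEN old_value END) AS dob,
           MAX(CASE WHEN property = 'email'     THEN old_value END) AS email,
           MAX(CASE WHEN property = 'telephone' THEN old_value END) AS telephone
    FROM first_change
    WHERE rownum = 1
    GROUP BY entity_id
)
SELECT CAST(T.id AS CHAR) AS id,
       T.name, T.surname, T.dob, T.email, T.telephone
FROM customer AS T
UNION ALL
SELECT CONCAT(T.id, '_1'),
       -- fall back to the current value when the property never changed
       COALESCE(I.name,      T.name),
       COALESCE(I.surname,   T.surname),
       COALESCE(I.dob,       T.dob),
       COALESCE(I.email,     T.email),
       COALESCE(I.telephone, T.telephone)
FROM customer AS T
JOIN initial_values AS I ON I.entity_id = T.id;
```

One caveat with `COALESCE`: if an initially missing value is stored in `old_value` as NULL rather than as an empty string, the initial row falls back to the current value for that property.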
This query simulates `ROW_NUMBER() OVER (PARTITION BY entity_id, property ORDER BY id)` for the `audit_log` table to get the initial value for each customer/property (where rownum = 1).
The `COALESCE` is used to get the value of a property from the `customer` table if this property has not changed. For example, if the name has changed but the email has not, then the email takes the latest value (in this case the latest value is also the initial one, which is T.email in this query).
Sometimes a tough programming task is best handled by stepping back and rethinking the framework.
I would re-think the schema design. Instead of this complex query, I would have 3 tables, making the query ‘trivial’:
- `Original`: the values when the person is first put into the database.
- `Audit`: the blow-by-blow, a historical record of all the changes. (Optionally like you have now. Or possibly a copy of the `Current` row when the change occurred.)
- `Current`: the latest values.

Then the query is essentially a `UNION` of `Original` and `Audit`.
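
With such a schema, the output the question asks for (latest row plus initial row) might be produced roughly as follows. This is a sketch under assumptions: `customer_current` and `customer_original` are hypothetical table names sharing the question's columns (including an assumed `email` column), and `customer_original` is written once, when the customer is first inserted. An `Audit` table holding full row copies could be added as a further `UNION ALL` branch in the same way.

```sql
-- Sketch only; customer_current and customer_original are hypothetical tables.
SELECT CAST(c.id AS CHAR) AS id,
       c.name, c.surname, c.dob, c.email, c.telephone
FROM customer_current AS c
UNION ALL
SELECT CONCAT(o.id, '_1'),
       o.name, o.surname, o.dob, o.email, o.telephone
FROM customer_original AS o
JOIN customer_current AS c ON c.id = o.id
-- emit the initial row only if it differs from the latest one;
-- <=> is MySQL's NULL-safe equality operator
WHERE NOT (o.name      <=> c.name
       AND o.surname   <=> c.surname
       AND o.dob       <=> c.dob
       AND o.email     <=> c.email
       AND o.telephone <=> c.telephone);
```

Comparing the two rows directly also means a value that was changed and later changed back does not produce a `_1` row.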