We have Employee ID in over 50 different tables with various data types (varchar, int, text).
When these tables were populated via conversion, they should have been populated with 6 numbers. Unfortunately, some of the IDs had dropped leading zeros or the ID was appended with trailing spaces.
I should mention that all of the tables in question start with z%.
How can I find the TABLES that identify the Employee Ids with a length greater than or less than 6?
This is what my mind is thinking, but it’s obviously not right.
Select LENGTH(EMPLOYEE_ID), TABLE_NAME from (Select TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'mySchema' and TABLE_NAME like 'z%') a
where LENGTH(EMPLOYEE_ID) <> 6
GROUP BY TABLE_NAME, LENGTH(EMPLOYEE_ID)
What I don’t want to do is list out every table like:
Select LENGTH(EMPLOYEE_ID), TABLE_NAME from TABLE1
where LENGTH(EMPLOYEE_ID) <> 6
UNION
Select LENGTH(EMPLOYEE_ID), TABLE_NAME from TABLE2
where LENGTH(EMPLOYEE_ID) <> 6
UNION
Select LENGTH(EMPLOYEE_ID), TABLE_NAME from TABLE3
where LENGTH(EMPLOYEE_ID) <> 6
2
Answers
This might be tricky to do using purely SQL. I would advise using a scripting language like Python and taking advantage of pandas. You can do something like below:
So, what you want is a query that basically end up to something like this:
or maybe this (probably a quicker option than the query above):
.. without the hassle of writing the same query for each table. Your idea of using
information_schema
to build the query is in the right path but the way you’re doing it is incorrect. You’ll need to create the query, assign it into a variable then usePREPARE STATEMENT
to execute it.Here’s a demo fiddle