skip to Main Content

I have 9 fields to my user interface , each one is representing a column in my database table.
So the problem is that i need to cover all the possible combinations of the 9 fields and send specific queries to the database.
Example :
If the user puts 2 fields , i return all the items from the DB that matches to the combination of these 2 items.
If the user puts 3 fields , i return all the items from the DB that matches to the combination of these 3 items.
Its obvious that the combinations are 100+ if you calculate them.
Using JAVA and MySQL with JDBC driver, the Prepared Statement is provided which secures SQL Injection.
But this way , i should make 100+ functions just to cover all the combinations and the code will be repititive .

HOW I SOLVE IT :
I created a Query Generator which takes the input the user entered and creates an SQL QUERY as a String.
Then i pass it to the DB using the Statement Class . This way i cover all the possible combinations but i have to craft SQL Injection protection from zero by my own and generally it is a custom solution to a problem that i m sure its already solved.

If there is an official way to do this please let me know!

2

Answers


  1. SELECT table.*
    FROM table
    JOIN ( SELECT @column1 AS column1
                , @column2 AS column2
                     ......
                , @column9 AS column9 
           ) AS criteria ON ( table.column1 = criteria.column1 OR criteria.column1 IS NULL)
                        AND ( table.column2 = criteria.column2 OR criteria.column2 IS NULL)
                                          ..........
                        AND ( table.column9 = criteria.column1 OR criteria.column9 IS NULL)
    

    If the user have entered the value for some column then you provide entered criteria into the query else you provide NULL (command.Parameters.AddWithValue("@columnX", (object)value ?? DBNull.Value)).

    Login or Signup to reply.
  2. You have basically two possibilities to approach this problem.

    The first one uses one statement with the OR conditions handling the missing criteria.

    The second approach uses dynamically generated statement containing only the predicates with given values.

    The drawback of the first solution is that you get one statement, i.e. one execution plan that handles all inputs.

    The second solution allow to select the proper execution plan based on the input. The drawback is that you land with a different number of bind variables, which makes it technically difficult to handle.

    Here comes in the idea popularized for years by Tom Kyte

    You generate the predicate for each selection, but dependent on if the value is given or not:

    column = :bind_variable -- if the value is passed
    
    (1=1 or :bind_variable is NULL) – if no value is NOT passed
    

    The shortcut logic eliminates the predicates with no values but you get the complete number of bind variables in the prepared statement.

    Example with three column all passed

    where
    col1 = :1 and
    col2 = :2 and
    col3 = :3
    

    Example with one input

    where
    col1 = :1 and
    (1=1 or :2 is NULL) and
    (1=1 or :3 is NULL)
    

    Example with no input

    where
    (1=1 or :1 is NULL) and
    (1=1 or :2 is NULL) and
    (1=1 or :3 is NULL)
    

    Here the Oracle execution plan for the second example showing that only the col1 predicate is evaluated, all other are discarded. Similar is valid for PostgreSQL, I did not test MySQL.

    ------------------------------------------------------------------------------------------------
    | Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    ------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                    |          |     1 |    18 |     4   (0)| 00:00:01 |
    |   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| TEST     |     1 |    18 |     4   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN                  | COL1_IDX |     1 |       |     3   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("COL1"=TO_NUMBER(:1))
    

    For the third query (no criteria) you end with a full table scan

    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |  1000K|    17M|   896   (1)| 00:00:01 |
    |   1 |  TABLE ACCESS FULL| TEST |  1000K|    17M|   896   (1)| 00:00:01 |
    --------------------------------------------------------------------------
    

    So if the cardinality of the result is highly dependent on the selection this approach is more suitable that the one size fits all OR predicate solution.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search