skip to Main Content

I have table with classification result of binary classification.

something like

id, predict_value, real_value
1,  true, true
2,  true, false
...

I want to get with sql classification_report, something like

   class   precision    recall  f1-score   support

   False       0.97      1.00      0.98     57241
    True       0.68      0.22      0.33      2323
accuracy       NULL      NULL      0.97     59564

Can you help me with sql query ?

2

Answers


  1. Chosen as BEST ANSWER

    I have next ugly sql as soltion

    with _tmp as (select 'accuracy' as class,
                         null       as pricission,
                         null       as recall,
                         count(*)   as support
                  from my_data
                  UNION
                  select 'True'                                                          as class,
                         (count(*) filter ( where predict_value = True and real_value = True ))::float
                             /
                         (count(*) filter ( where predict_value = True))::numeric(12, 6) as pricission,
                         (count(*) filter ( where predict_value = True and real_value = True ))::float
                             /
                         (
                                         count(*) filter ( where predict_value = FALSE and real_value = True)
                                 +
                                         count(*) filter ( where predict_value = True and real_value = True )
                             )::float                                                    as recall,
                         null                                                            as f1_score,
                         count(*) filter ( where predict_value = True)                   as support
                  from my_data
                  Union
                  select 'False'                                                          as class,
                         (count(*) filter ( where predict_value = FALSE and real_value = FALSE ))::float
                             /
                         (count(*) filter ( where predict_value = FALSE))::numeric(12, 6) as pricission,
                         (count(*) filter ( where predict_value = FALSE and real_value = FALSE ))::float
                             /
                         (
                                         count(*) filter ( where predict_value = True and real_value = FALSE)
                                 +
                                         count(*) filter ( where predict_value = FALSE and real_value = FALSE )
                             )::float                                                     as recall,
                         null                                                             as f1_score,
                         count(*) filter ( where fixed_predict = FALSE)                   as support
                  from my_data)
    select "class",
           pricission::numeric(12, 3),
           recall::numeric(12, 3),
           (2 * pricission * recall / (pricission + recall))::numeric(12, 3) as f1_score,
           support
    from _tmp
    order by support
    

  2. The following might prove useful.

    WITH classes AS (SELECT my_data.predict_value AS class
                       FROM my_data
                     UNION
                     SELECT my_data.real_value AS class
                       FROM my_data),
         t AS (SELECT c.class,
                      COUNT(*) FILTER (WHERE c.class = d.predict_value)                                              AS SUPPORT,
                      (COUNT(*) FILTER (WHERE c.class = d.predict_value AND d.predict_value = d.real_value))::float  AS tp,
                      (COUNT(*) FILTER (WHERE c.class = d.predict_value AND d.predict_value <> d.real_value))::float AS fp,
                      (COUNT(*) FILTER (WHERE c.class <> d.predict_value AND c.class <> d.real_value))::float        AS tn,
                      (COUNT(*) FILTER (WHERE c.class <> d.predict_value AND c.class = d.real_value))::float         AS fn
                 FROM classes c
                   CROSS JOIN my_data d
                 GROUP BY c.class),
         pr AS (SELECT t.class::text                                               AS class,
                       CASE t.support WHEN 0 THEN NULL ELSE t.tp / t.support END   AS precision,
                       CASE t.support WHEN 0 THEN NULL ELSE tp / (t.tp + t.fn) END AS recall,
                       t.support
                  FROM t
                UNION ALL
                SELECT 'Accuracy'                          AS class,
                       SUM(t.tp) / SUM(t.support)          AS precision,
                       SUM(t.tp) / (SUM(t.tp) + SUM(t.fn)) AS recall,
                       SUM(t.support)                      AS support
                  FROM t)
    SELECT pr.class,
           pr.precision,
           pr.recall,
           2 * pr.precision * pr.recall / (pr.precision + pr.recall) AS "f1-score",
           pr.support
      FROM pr
      ORDER BY pr.class;
    

    This query works for any positive number of classes, not just binary classification.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search