You are given a table, BST, containing two columns: N and P, where N represents the value of a node in Binary Tree, and P is the parent of N.
Question – Write a query to find the node type of Binary Tree ordered by the value of the node. Output one of the following for each node:
Root: If node is root node.
Leaf: If node is leaf node.
Inner: If node is neither root nor leaf node.
Solution 1 – I am using dot(.) notation on alias B.N=P
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST WHERE B.N=P)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST B
ORDER BY N;
Solution 2 – Not using and dot operator?
SELECT N,
CASE
WHEN P IS NULL THEN 'Root'
WHEN (SELECT COUNT(*) FROM BST WHERE N=P)>0 THEN 'Inner'
ELSE 'Leaf'
END AS PLACE
FROM BST
ORDER BY N;
My question is –
- Why do the Solution 1 is generating correct answer, is it due to . (dot) ? If it is due to dot operator why we didn’t use dot operation on P (B.N = P)?
- Even I modify solution 2 and write (BST.N = P), it is still giving me wrong answer. Why it is so?
I am confused in usage of .(dot)
3
Answers
You use
BST
twice in your query. The.
tells the DBMS which instance you are using. When omitted, the DBMS has to chose it implicitly.The table that is implicitly chosen happens not be the same throughout your query.
With more explicit aliases, your query is:
When you remove the aliases, the implicitly chosen instance of BST is:
SELECT COUNT(*) FROM BST InsideAlias
:InsideAlias
OutsideAlias
(InsideAlias
is out of scope for the rest of the query anyway).Which means:
is equivalent to
Therefore, you are getting the wrong results because it requires a node to be its own parent for the
COUNT(*)
to be greater than 0.Instead,
OutsideAlias.N=InsideAlias.P
translates to: is my node the parent of some other node? Another way to do the test is withEXISTS (SELECT * FROM BST WHERE OutsideAlias.N = P)
, although that was not your question.This is about correlated subquery. A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query. And how do we distinguish the subquery table and the table from the outer query? There are two cases.
Therefore, to answer your first question,please check the comments besides the query:
Before answering your second question, how do we make two tables of the same name distinguishable? We need to give one of them a different name ,which calls for using an alias. But if you use
BST.N = P
(Here you didn’t state in your second question as where you would put the condition. From the context i presume you mean the subquery table condition) in the subquery, then this BST actually means the innertable,thus making the expressBST.N = P
same asN=P
(both prefixed using the innertable). To fix the issue, give the outer table an alias and use the aliase as prefix for the outertable columns which are used in the subquery.It has to do with namespaces — where columns and expressions "live".
A SQL query may include multiple namespaces where columns and expressions are named and can be accessible.
In your queries there are two namespaces:
In the first query, columns in the main query can be referenced by prepending the namespace
B
, as inB.<column>
while columns in the inner namespace can be referenced using the namespaceBST
as inBST.<column>
.If a column name (or expression name) does not explicitly includes a namespace, then the closer accessible one wins.
In your second query you don’t specify a namespace and, therefore, the columns
N
andP
in the expressionN = P
reference the same inner table, and so the subquery is not correlated to the main one. In the first queryB.N
references a column on the main query, and therefore the expressionB.N = P
compares columns from different tables, and then the query is correlated.