Saturday, February 25, 2012

Really complex query that has me stymied

I'm attempting to write a query that pulls data from two tables in a really
complex way, and I can't figure a way around it. Because of the architecture
in use, it has to be done in a single query, without using a stored
procedure. This is a simplified version of what I'm doing:
Let's say I have two tables, each with four columns, named A, B, C and D:
Table 1
A,B,C,D
1,2,3,5
3,5,7,9
3,9,2,6
3,6,7,1
9,8,3,0
2,5,8,1
Table 2
A,B,C,D
5,3,2,3
4,8,7,1
3,5,7,2
1,4,5,6
8,9,3,2
2,6,2,2
Each of the tables has thousands of records. I need a query that does this:
For each row in table 1, look up the values in table 2. If a row can be
found in table 2 whose columns A, B and C match columns A, B and C in table
1, and whose column D equals 2, then return the row from table 1. Do not
return a row unless all three columns A, B and C match, and Table2.D equals
2.
Using this rule on the table values above, it should be returning only the
row "3,5,7,9" from Table 1 because there is only one row in Table 2 that has
columns A,B,C,D equal to "3,5,7,2"
Here's what I'm trying now:
SELECT * FROM Table1
WHERE
Table1.A IN (SELECT DISTINCT(A) FROM Table2 WHERE Table2.D = 2)
AND
Table1.B IN (SELECT DISTINCT(B) FROM Table2 WHERE Table2.D = 2)
AND
Table1.C IN (SELECT DISTINCT(C) FROM Table2 WHERE Table2.D = 2)
This returns 3,5,7,9, 3,9,2,6 and 3,6,7,1 from Table 1, when it should only
be returning 3,5,7,9. It does this because column A matches (3), Column B
matches multiple records in Table 2 with a Table2.D=2, and so does Column C.
Even though I am specifying AND for the three columns, it is actually acting
as an OR, because each SELECT DISTINCT is separate, and does not take into
consideration the other SELECTs. The first SELECT DISTINCT pulls a result
set, and Column A is compared against it. The second SELECT DISTINCT pulls a
different result set, and Column B is compared against it. And so on. This
results in an additive comparison. What I need, in effect, is for the SELECT
DISTINCT(B) to be more like "Take the SELECT DISTINCT(A) result set and pare
it down further using the SELECT DISTINCT(B) values". Then The SELECT
DISTINCT(C) needs to be more like "Take the SELECT DISTINCT(B) result set
and pare it down even further using the SELECT DISTINCT(C) values".
I hope this makes sense, and that someone can see what I'm missing. I've
spent hours trying to figure out a way of doing this, without success.Scott,
Try this:
SELECT A, B, C, D
FROM [Table 1] AS T1
WHERE EXISTS (
SELECT * FROM [Table 2] AS T2
WHERE T2.A = T1.A
AND T2.B = T1.B
AND T2.C = T1.C
AND T2.D = 2
)
or
SELECT T1.A, T1.B, T1.C, T1.D
FROM [Table 1] AS T1
JOIN [Table 2] AS T2
ON T2.A = T1.A
AND T2.B = T1.B
AND T2.C = T1.C
WHERE
T2.D = 2
The latter query will return duplicate rows, so it could need
DISTINCT. Use whichever is faster (hopefully at least one
has no typos).
Steve Kass
Drew University
Scott MacLean wrote:

>I'm attempting to write a query that pulls data from two tables in a really
>complex way, and I can't figure a way around it. Because of the architectur
e
>in use, it has to be done in a single query, without using a stored
>procedure. This is a simplified version of what I'm doing:
>Let's say I have two tables, each with four columns, named A, B, C and D:
>Table 1
>A,B,C,D
>1,2,3,5
>3,5,7,9
>3,9,2,6
>3,6,7,1
>9,8,3,0
>2,5,8,1
>Table 2
>A,B,C,D
>5,3,2,3
>4,8,7,1
>3,5,7,2
>1,4,5,6
>8,9,3,2
>2,6,2,2
>Each of the tables has thousands of records. I need a query that does this:
>For each row in table 1, look up the values in table 2. If a row can be
>found in table 2 whose columns A, B and C match columns A, B and C in table
>1, and whose column D equals 2, then return the row from table 1. Do not
>return a row unless all three columns A, B and C match, and Table2.D equals
>2.
>Using this rule on the table values above, it should be returning only the
>row "3,5,7,9" from Table 1 because there is only one row in Table 2 that ha
s
>columns A,B,C,D equal to "3,5,7,2"
>Here's what I'm trying now:
>SELECT * FROM Table1
>WHERE
> Table1.A IN (SELECT DISTINCT(A) FROM Table2 WHERE Table2.D = 2)
>AND
> Table1.B IN (SELECT DISTINCT(B) FROM Table2 WHERE Table2.D = 2)
>AND
> Table1.C IN (SELECT DISTINCT(C) FROM Table2 WHERE Table2.D = 2)
>This returns 3,5,7,9, 3,9,2,6 and 3,6,7,1 from Table 1, when it should only
>be returning 3,5,7,9. It does this because column A matches (3), Column B
>matches multiple records in Table 2 with a Table2.D=2, and so does Column C
.
>Even though I am specifying AND for the three columns, it is actually actin
g
>as an OR, because each SELECT DISTINCT is separate, and does not take into
>consideration the other SELECTs. The first SELECT DISTINCT pulls a result
>set, and Column A is compared against it. The second SELECT DISTINCT pulls
a
>different result set, and Column B is compared against it. And so on. This
>results in an additive comparison. What I need, in effect, is for the SELEC
T
>DISTINCT(B) to be more like "Take the SELECT DISTINCT(A) result set and par
e
>it down further using the SELECT DISTINCT(B) values". Then The SELECT
>DISTINCT(C) needs to be more like "Take the SELECT DISTINCT(B) result set
>and pare it down even further using the SELECT DISTINCT(C) values".
>I hope this makes sense, and that someone can see what I'm missing. I've
>spent hours trying to figure out a way of doing this, without success.
>
>|||Hello, Scott
This should do the job:
SELECT * FROM Table1 T1
WHERE EXISTS (
SELECT * FROM Table2 T2
WHERE T1.A=T2.A AND T1.B=T2.B AND T1.C=T2.C
AND T2.D=2
)
Razvan|||Thanks Steve and Razvan, this did the trick. Once I saw this I thought; "of
COURSE" (hand slap on forehead).
Thanks again.
"Razvan Socol" <rsocol@.gmail.com> wrote in message
news:1110347824.072514.140200@.z14g2000cwz.googlegroups.com...
> Hello, Scott
> This should do the job:
> SELECT * FROM Table1 T1
> WHERE EXISTS (
> SELECT * FROM Table2 T2
> WHERE T1.A=T2.A AND T1.B=T2.B AND T1.C=T2.C
> AND T2.D=2
> )
> Razvan
>

No comments:

Post a Comment