How to Return Distinct Records Based on Two Different Values

Go To StackoverFlow.com

1

Have the following sample of data in an Oracle table called MY_ATTRIBUTES:

Date        Emp_No  Attribute
-----------------------------------------------------
01/04/2012  1234567 APPLE
01/04/2012  1234567 ORANGE
01/04/2012  1234567 PINEAPPLE
01/04/2012  1234567 BANANA
01/04/2012  8888888 APPLE
01/04/2012  8888888 ORANGE
01/04/2012  2222222 APPLE
01/04/2012  2222222 ORANGE
01/04/2012  2222222 PINEAPPLE

Based on the above sample data, I need to return only the distinct records using only Date and Emp_No where these records contains data that have both the Attributes of ‘APPLE’ and ‘PINEAPPLE’

So based on this criteria, I would expect the following two result records only, i.e.:

01/04/2012  1234567
01/04/2012  2222222

Hopefully someone can assist with an Oracle SQL query that will return this result set based on the criteria described.

Sorry, should have also pointed out that I will need this SQL query to be used as a sub-query off the main SELECT.

Thanks.

2012-04-04 06:13
by tonyf
your question contains the answer...when you distinct records use distinct function...for an example Select distinct from table where.... - chinna_82 2012-04-04 06:22
Understand what you are saying but I wasn't sure how to pull out the records where the attribute was both APPLE and PINEAPPLE - tonyf 2012-04-04 06:32
u can use IN function..or UNION..Select distinct date,emp_no from (select * from table where attribute = apple union select * from table where attribute = apple)... - chinna_82 2012-04-04 06:57


2

try

SELECT DISTINCT A.Date, A.Emp_no 
FROM (SELECT * FROM MY_ATTRIBUTES WHERE ATTRIBUTE = 'APPLE') A 
INNER JOIN (SELECT * FROM MY_ATTRIBUTES WHERE ATTRIBUTE = 'PINEAPPLE' ) B ON B.Date = A.Date AND B.Emp_No = A.Emp_No 
2012-04-04 06:18
by Yahia


2

There may be a more efficient way, but this should work:

SELECT DISTINCT DATE, EMP_NO
FROM   (
    SELECT DATE, EMP_NO
    FROM   MY_ATTRIBUTES
    WHERE  ATTRIBUTE = 'APPLE'
    INTERSECT
    SELECT DATE, EMP_NO
    FROM   MY_ATTRIBUTES
    WHERE  ATTRIBUTE = 'PINEAPPLE'
);
2012-04-04 06:18
by cagcowboy
this all works fine but I am now finding a performance issue doing it this way - any ideas on how to speed/tune the query - tonyf 2012-04-04 07:13
INTERSECT should provide a set of unique combinations, so there is no need for the DISTINCT clause which will incur an additional sort - APC 2012-04-04 09:14
@APC good catch(voted comment) but I think new Oracle optimizers know it, and won't do the sort twice. I don't have time to test it.. - Florin Ghita 2012-04-04 10:47


2

If there are no duplicates (ex. 2 APPLES on the same date and emp) this might be more efficient as there are no joins:

SELECT DATE, EMP_NO 
FROM MY_ATTRIBUTES
WHERE ATTRIBUTE = 'APPLE' OR  ATTRIBUTE = 'PINEAPPLE'
GROUP BY DATE, EMP_NO
HAVING COUNT(*) = 2
2012-04-04 06:24
by barsju
thanks barsju but I am actually using this query requirement as part of a sub-query and unfortunately am not able to use GROUP BY in a sub-query. Would appreciate if there was a workaround if used as a sub-query? Thanks - tonyf 2012-04-04 12:15
Well does it have to be a subquery? Maybe you should post the entire query so we can have a crack at that instead. - barsju 2012-04-04 13:43


2

A variant on a theme:

with cte as 
 ( select date, empno, attribute
   from my_attributes
   where attribute in ('PINEAPPLE', 'APPLE') )
select  *
from (  select date, emp_no
        from cte
        where attribute = 'PINEAPPLE'
        intersect
        select date, emp_no
        from cte
        where attribute = 'APPLE' )

The performance of any of these solutions will depend on the selectivity of ATTRIBUTE. If you only have half a dozen fruits in your bowl you're looking at a full table scain and there are only limited options for tuning those. On the other hand if the column is a greengrocer's full of fruity delights (say 150+ distinct values) then you should get return from an index. But that still depends on distribution and skew: you might not see any benefits from an index if ATTRIBUTE is 90% apples and pineapples: in that case a full table scan is still the better option.

tl;dr

Tuning is hard

2012-04-04 09:30
by APC


0

You can try the following:

SELECT DATE, EMP_NO FROM YOUR_TABLE WHERE ATTRIBUTE = 'APPLE' OR ATTRIBUTE = 'PINEAPPLE' GROUP BY DATE, EMP_NO HAVING COUNT(*) = 2;

2012-04-04 10:17
by harry
this is already answered - Florin Ghita 2012-04-04 10:43
@FlorinGhita answer is one can it be different from the actual answer? yes you may say it to style differentl - harry 2012-04-04 11:25
I don't understant what you are saying. Your answer is the same as barsju's. He already said what is in your response(4 hours before you), except the formatting. It is not useful to have two answers with the same thing - Florin Ghita 2012-04-04 13:01
u need to understand clearly that what answer i have stated that is a correct answer whether it matches with barsju's or with 100 people's answer that does not matter ok? and in stead of playing with me you should provide an answer to this question and get relaxed ok - harry 2012-04-04 13:21


-1

SELECT DISTINCT DATE, EMP_NO
WHERE  ATTRIBUTE IN('APPLE','PINEAPPLE')
2012-04-04 07:01
by Ritesh D
This will provide a result where the DATE, EMP_NO combination matches on either attribute. The OP requires only combinations which have records for both - APC 2012-04-04 09:16


-1

SELECT DISTINCT Emp_No, Date
FROM (
  SELECT *
  FROM MY_ATTRIBUTES m
  WHERE m.Attribute IN ('Apple','Pineapple')
  )
LIMIT 2
ORDER BY Emp_No
2012-04-04 07:02
by baym
This will provide a result where the DATE, EMP_NO combination matches on either attribute. The OP requires only combinations which have records for both. Also, LIMIT is not Oracle syntax but even if it were it would not produce the correct effect - APC 2012-04-04 09:17
Ads