SQL Puzzle - Issue Tracker - with Solution Approach

In past I posted some SQL Puzzle problems and their solutions. I got some suggestion that with solutions I should also describe the approach. So with this post I am trying to describe how I solved a problem.

Let's look at the problem first.

Imagine we have following table for tracking issues.
create table issue_tracker
(
  id           number(10) primary key,
  issue_id     number(10), 
  assigned_to  number(10), 
  status       char(1),
  issue_date   date
);

Data for our "issue_tracker" table is as following.
insert into issue_tracker values(1, 1001, 1234, 'S', to_date('01-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(3, 1001, 1234, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(4, 1001, 1234, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(8, 1001, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(12, 1001, 1234, 'E', to_date('03-Jan-2015','dd-Mon-yyyy'));

insert into issue_tracker values(2, 1002, 2345, 'S', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(5, 1002, 2345, 'U', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(7, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(13, 1002, 2345, 'E', to_date('03-Jan-2015','dd-Mon-yyyy'));

insert into issue_tracker values(6, 1003, 1234, 'S', to_date('02-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(14, 1003, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(15, 1003, 1234, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(20, 1003, 1234, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(23, 1003, 1234, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(27, 1003, 1234, 'E', to_date('05-Jan-2015','dd-Mon-yyyy'));

insert into issue_tracker values(16, 1002, 2345, 'S', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(17, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(18, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(19, 1002, 2345, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(24, 1002, 2345, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(25, 1002, 2345, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(26, 1002, 2345, 'E', to_date('04-Jan-2015','dd-Mon-yyyy'));

insert into issue_tracker values(9, 1004, 1544, 'S', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(10, 1004, 1544, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(11, 1004, 1544, 'U', to_date('03-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(21, 1004, 1544, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));
insert into issue_tracker values(22, 1004, 1544, 'U', to_date('04-Jan-2015','dd-Mon-yyyy'));

STATUS column may have 3 values: S (Start), U (Updates) and E (End). One thing to notice here is that issue 1002 was closed and was reopened again later.

So now here is the problem statement. You need to Find the START_DATE and END_DATE of each ISSUE_ID with number of updates happened. If an issue is reopened, it should have multiple records accordingly. Following is the sample output desired as per above data.

  ISSUE_ID ASSIGNED_TO START_DATE    UPDATES END_DATE 
---------- ----------- ---------- ---------- ---------
      1001        1234 01-JAN-15           3 03-JAN-15
      1002        2345 02-JAN-15           2 03-JAN-15
      1003        1234 02-JAN-15           4 05-JAN-15
      1002        2345 03-JAN-15           5 04-JAN-15
      1004        1544 03-JAN-15           4        


SOLUTION: To resolve this problem, we need to somehow divide rows in multiple groups so that each row of same issue starting from status 'S' to 'E' falls in same group. So I valued STATUS='S' as 1 (one) and used running total logic for this. Let's check first how I grouped each row for each issue.

SELECT 
     issue_id,
     assigned_to,
     status,
     issue_date,
     case when status = 'S' then 1 else 0 end start_issue,
     sum(case when status = 'S' then 1 else 0 end) over (partition by issue_id order by id) running_total
FROM issue_tracker
order by issue_id, id 

  ISSUE_ID ASSIGNED_TO STATUS ISSUE_DATE START_ISSUE RUNNING_TOTAL
---------- ----------- ------ ---------- ----------- -------------
      1001        1234 S      01-JAN-15            1             1
      1001        1234 U      02-JAN-15            0             1
      1001        1234 U      02-JAN-15            0             1
      1001        1234 U      03-JAN-15            0             1
      1001        1234 E      03-JAN-15            0             1
      1002        2345 S      02-JAN-15            1             1
      1002        2345 U      02-JAN-15            0             1
      1002        2345 U      03-JAN-15            0             1
      1002        2345 E      03-JAN-15            0             1
      1002        2345 S      03-JAN-15            1             2
      1002        2345 U      03-JAN-15            0             2
      1002        2345 U      03-JAN-15            0             2
      1002        2345 U      03-JAN-15            0             2
      1002        2345 U      04-JAN-15            0             2
      1002        2345 U      04-JAN-15            0             2
      1002        2345 E      04-JAN-15            0             2
      1003        1234 S      02-JAN-15            1             1
      1003        1234 U      03-JAN-15            0             1
      1003        1234 U      03-JAN-15            0             1
      1003        1234 U      04-JAN-15            0             1
      1003        1234 U      04-JAN-15            0             1
      1003        1234 E      05-JAN-15            0             1
      1004        1544 S      03-JAN-15            1             1
      1004        1544 U      03-JAN-15            0             1
      1004        1544 U      03-JAN-15            0             1
      1004        1544 U      04-JAN-15            0             1
      1004        1544 U      04-JAN-15            0             1

Looks good!!! Now as you can see that we can easily divide above to 5 groups (using ISSUE_ID with RUNNING_TOTAL). Now we just need to find START_DATE, END_DATE and COUNT of UPDATE on each group. Complete solution can be done by following query.

  SELECT issue_id,
         assigned_to,
         MIN (DECODE (status, 'S', issue_date)) start_date,
         COUNT (DECODE (status, 'U', issue_date)) UPDATES,
         MIN (DECODE (status, 'E', issue_date)) end_date
    FROM (SELECT issue_id,
                 assigned_to,
                 status,
                 issue_date,
                 sum(case when status = 'S' then 1 else 0 end) over (partition by issue_id order by id) grp
            FROM issue_tracker)
GROUP BY issue_id, assigned_to, grp
ORDER BY start_date, issue_id;
      
      
  ISSUE_ID ASSIGNED_TO START_DATE    UPDATES END_DATE 
---------- ----------- ---------- ---------- ---------
      1001        1234 01-JAN-15           3 03-JAN-15
      1002        2345 02-JAN-15           2 03-JAN-15
      1003        1234 02-JAN-15           4 05-JAN-15
      1002        2345 03-JAN-15           5 04-JAN-15
      1004        1544 03-JAN-15           4          
5 rows selected.

Quite simple. I hope you have enjoyed this approach on this problem which might help you in similar problems. If you have any other approach or any comments please do post in comment area.

More SQL Puzzles:
- SQL Puzzle - Transpose Rows and Shift Values among columns
- Graph Shortest Path Solution by SQL
- Diamond Shaped Star Pattern by SQL
- Sorting Versions stored in Varchar2 Column
- SQL Puzzle - Grouping Deals
- SQL Puzzle - Consecutive Wins
- SQL Interview Question Answers

10 comments:

  1. Used properly, analytics functions rocks !

    ReplyDelete
  2. The interesting case than we don't have this status mark (S/U/E). Will you interested in solving this case?

    ReplyDelete
    Replies
    1. Tabibitosan method could be useful in your case
      https://community.oracle.com/thread/1007478?tstart=0

      Delete
  3. Good job!
    A note:
    Probably I would always used a CASE statement instead a DECODE function: ;-)
    SELECT issue_id
    , assigned_to
    , MIN(CASE WHEN status = 'S' THEN issue_date END) start_date
    , COUNT(CASE WHEN status = 'U' THEN issue_date END) updates
    , MIN(CASE WHEN status = 'E' THEN issue_date END) end_date
    FROM (SELECT issue_id
    , assigned_to
    , status
    , issue_date
    , CASE WHEN status = 'S' THEN 1 ELSE 0 END start_issue
    , SUM(CASE WHEN status = 'S' THEN 1 ELSE 0 END) OVER (PARTITION BY issue_id ORDER BY id) grp
    FROM issue_tracker
    )
    GROUP BY issue_id, assigned_to, grp
    ORDER BY start_date, issue_id;

    ReplyDelete
  4. Super Sir, Thanks for your Post.. Great Work.

    ReplyDelete
  5. Hi Nimish,

    thanks for the solution..

    Below solution is quite long one -

    select
    issue_id
    ,assigned_to
    ,start_date
    ,updates
    ,end_date
    from (
    select
    z.*,
    min(issue_date) over (partition by frther_grp) START_DATE,
    max(case when status != 'E' then null else issue_date end ) over (partition by frther_grp) END_DATE,
    count(case when status = 'U' then 1 end) over
    (partition by frther_grp ) UPDATES,
    row_number() over (partition by frther_grp order by issue_id, id, issue_date, status) rwnm
    from
    (SELECT y.* ,
    max (grp) over (order by issue_id, id, issue_date, status) frther_grp
    FROM (SELECT x.*
    ,CASE
    WHEN status = 'S' THEN
    row_number() over (ORDER BY 2, 1, 5, 4)
    END grp
    FROM (SELECT id
    ,issue_id
    ,assigned_to
    ,status
    ,issue_date
    FROM issue_tracker
    -- WHERE issue_id in (1004, 1001)
    ORDER BY 2
    ,1
    ,5
    ,4) x) y)z
    )
    where rwnm =1
    order by start_date

    I think there is one another method by which it can be solved i.e. using the match_pattern introduced in 12c

    Regards,
    Amarjot





    ReplyDelete
  6. Id ,Max(rank() over (partition by I'd order by start_date )) ,min(start_date) group by I'd .can we do like that ?

    ReplyDelete
  7. It is almost the same as the model answer
    I will contribute as a result of thinking hard.
    I did not imitate it.
    m(_ _)m

    SELECT ISSUE_ID,ASSIGNED_TO
    ,MIN(CASE WHEN STATUS = 'S' THEN ISSUE_DATE END) START_DATE
    ,SUM(CASE WHEN STATUS = 'U' THEN 1 ELSE 0 END) UPDATES
    ,MAX(CASE WHEN STATUS = 'E' THEN ISSUE_DATE END) END_DATE
    FROM (
    SELECT ID,ISSUE_ID,ASSIGNED_TO,STATUS,ISSUE_DATE
    ,SUM(CASE WHEN STATUS = 'S' THEN 1 ELSE 0 END)OVER(ORDER BY ISSUE_ID,ASSIGNED_TO,ID) WILLGRP
    FROM ISSUE_TRACKER)
    GROUP BY ISSUE_ID,ASSIGNED_TO,WILLGRP
    ORDER BY MIN(ISSUE_DATE),ISSUE_ID

    ReplyDelete