2 votes

Boucle sur la date minimale d'un membre - Problème de performance

J'essaie de boucler chaque fois une date pour chaque membre et de définir un statut pour un membre. Cette requête fonctionne parfaitement mais prend beaucoup de temps. Je ne vois pas d'autre moyen d'obtenir cet ensemble de résultats.

Logique,

For every member, look for minimum date, then add 30 to it, if any date falls under  30, then status is initial_1. If it is greater than 30 and less than 30+15, then Reauth_1_1, if date greater than 30+16 , then it is intial_2.

MemberID    DOS         Status
HHH00031200 7/17/2014   Initial_1
HHH00031200 7/29/2014   Initial_1
HHH00031200 8/21/2014   Re-Auth_1_1
HHH00031200 8/27/2014   Re-Auth_1_1
HHH50000000 5/23/2016   Initial_1
HHH50000000 7/19/2016   Initial_2
HHH50000000 9/13/2016   Initial_3
HHH88844900 9/19/2015   Initial_1
HHH88844900 10/22/2015  Re-Auth_1_1
HHH88844900 11/24/2015  Re-Auth_1_2
HHH88844900 12/10/2015  Re-Auth_1_2
HHH22227700 1/16/2014   Initial_1
HHH22227700 2/21/2014   Re-Auth_1_1
HHH22227700 2/25/2014   Re-Auth_1_1
HHH22227700 3/5/2014    Re-Auth_1_1
HHH22227700 1/1/2015    Initial_2
HHH22227700 1/15/2015   Initial_2
HHH22227700 1/20/2015   Initial_2
HHH22227700 2/10/2015   Re-Auth_2_1
HHH22227700 2/12/2015   Re-Auth_2_1
HHH22227700 2/17/2015   Re-Auth_2_1
HHH22227700 2/19/2015   Re-Auth_2_1
HHH22227700 2/25/2015   Re-Auth_2_1
HHH22227700 2/26/2015   Re-Auth_2_1

Requête :

--drop table #Temp_SO_Check
Select
    *
Into #Temp_SO_Check
From #auth;

CREATE  INDEX IDX_C_Users_UserID ON #Temp_SO_Check(MemberID,DOS,LoopLogic);

While Exists
(
    Select Top 1
        MemberID
    From    #Temp_SO_Check
    Where   LoopLogic Is Null
)
Begin
    Select Top 1
        @ID = MemberID
        , @StartDate = DOS
    From    #Temp_SO_Check
    Where   LoopLogic Is Null
    Order By
        MemberID
        , DOS;

    If @PrevID <> @ID
    Begin
        set @Flag=1;
        Set @LoopLogic ='Initial_'+Cast(@Flag as nvarchar(50)) ;
        Set @PrevID = @ID;
        set @ReauthFlag=0;
    end 
    else 
    Begin
        if((@LookupDate is not null) and (datediff(day,@LookupDate,@StartDate))<14)
        Begin       
            Set @LoopLogic ='Re-Auth_'+Cast(@Flag as nvarchar(50)) ;
        End
        Else if ((@LookupDate is not null) and (datediff(day,@LookupDate,@StartDate))>14)
        Begin
            set @Flag=@Flag+1;
            set @ReauthFlag=0;
            Set @LoopLogic ='Initial_'+Cast(@Flag as nvarchar(50));
        End
    End

    Set @LookupDate = DateAdd(Day, 30, @StartDate);
    if( (@LoopLogic like '%Re-Auth_%') and (@LookupDate<>@LookupDate1))
    Begin
        Set @ReauthFlag=@ReauthFlag+1;
        Set @LoopLogic ='Re-Auth_'+Cast(@Flag as nvarchar(50))+'_'+Cast(@ReauthFlag as nvarchar(50)) ;
    End

    set @LookupDate1=@LookupDate;

    Update
        #Temp_SO_Check
    Set
        LoopLogic = @LoopLogic
    Where
        MemberID = @ID
        And DOS Between @StartDate
                    And     @LookupDate;

End;

La table #auth contient mes données et ensuite les statistiques en boucle. J'ai 766000 enregistrements et cela prend plus d'une heure et demie et fonctionne toujours.

Quelqu'un peut-il m'aider à affiner cette requête ?

0voto

scsimon Points 19203

Vous n'avez pas besoin de boucler du tout... et les boucles dans le serveur SQL sont des tueurs de performance. Voici une solution simple

--determine the minimum date for the table, and add 30 days to it
declare @minDate date = (select dateadd(day,30,min(DOS)) DT from YourTable)

--based on your logic in the question, update the status column
update YouTable
set [Status] = case 
                when DOS < @minDate then 'initial_1'
                when DOS > @minDate and DOS < dateadd(day,15,@minDate) then 'Reauth_1_1'
                else 'intial_2'
               end

Si la date minimale s'applique à chaque employé, je la traiterais dans un CTE.

;with cte as(
select
    MemberID    
    ,DOS         
    ,[Status]
    ,MinDOS = dateadd(day,30,min(DOS) over (partition by MemberID))
from YourTable)

update cte
set [Status] = case 
                when DOS < MinDOS then 'initial_1'
                when DOS > MinDOS and DOS < dateadd(day,15,MinDOS) then 'Reauth_1_1'
                else 'intial_2'
               end

0voto

chits Points 132

Je ne savais pas comment copier cela dans les commentaires et j'ai pensé à répondre à la question. Veuillez me corriger s'il y a une meilleure façon de publier les détails.

Voici l'exigence :

1.For any member, look for minimum DOS, now add +30 to this date. 
2. As a first step we need to slice the data by 30 days window.
3.For example, minimum DOS for  memberid HHH00031200 is 7/17/2014. 
4. Add 30 days to it and it is 8/16/2014. 
5. Now for any date between 7/17/2014-8/16/2014 the status is Initial_1(This is the first window).
6.Now again add + 30 days to 8/16/2018. 
7.But in next window, if any date is within 15 days(i.e.8/16/2014+ 15 days) , it will be Reauth_1_1.if the date is > 15 days, then Initial_2 and this goes on till we loop all the records.
8.For example, Let us calculate the window for HHH88844900.
i. First Window-09/19/2015+30=10/19/2015
ii. Second Window-10/19/2015+30=11/21/2015
iii.third Window-11/21/2015+30=12/24/2015
9.First Loop Logic is always Initial_1.
10.Second Date is 10/22/2015 which is 3 days after the first window and that is why it is Re-Auth_1_1.
11.Third Date is 11/24/2015 which is 3 days after second window and that is why it is Re-Auth_1_2. similarlly last one. 
12.If the dates were > 15 days from the 30 days window, it would be Initial_2 ,for example MemeberId HHH50000000.
13.For memberId HHH22227700 , once we have the Initial_2, then the next dates would be Re-Auth_2_1 and so on.. 

Merci pour votre aide.

0voto

Joe Farrell Points 3172

C'est une question très intéressante. À l'origine, j'espérais trouver une solution en utilisant des fonctions de fenêtre telles que lag() Mais comme il n'y a pas de bon moyen de savoir combien d'enregistrements séparent un enregistrement donné de l'enregistrement précédent qui représente le début du même intervalle, j'ai fini par devoir utiliser la récursivité. Je ne suis pas sûr que cela fonctionnera pour vous sur un très grand ensemble de données, et si vous avez un grand nombre d'entrées par MemberID il se peut que vous deviez ajouter un MAXRECURSION indice de requête mais j'espère qu'elle sera au moins plus performante qu'une solution itérative. Les commentaires de la requête expliquent ce qui se passe.

-- Sample data from the question.
declare @SampleData table (MemberID varchar(32), DOS date);
insert @SampleData values
    ('HHH00031200', '20140717'),
    ('HHH00031200', '20140729'),
    ('HHH00031200', '20140821'),
    ('HHH00031200', '20140827'),
    ('HHH50000000', '20160523'),
    ('HHH50000000', '20160719'),
    ('HHH50000000', '20160913'),
    ('HHH88844900', '20150919'),
    ('HHH88844900', '20151022'),
    ('HHH88844900', '20151124'),
    ('HHH88844900', '20151210'),
    ('HHH22227700', '20140116'),
    ('HHH22227700', '20140221'),
    ('HHH22227700', '20140225'),
    ('HHH22227700', '20140305'),
    ('HHH22227700', '20150101'),
    ('HHH22227700', '20150115'),
    ('HHH22227700', '20150120'),
    ('HHH22227700', '20150210'),
    ('HHH22227700', '20150212'),
    ('HHH22227700', '20150217'),
    ('HHH22227700', '20150219'),
    ('HHH22227700', '20150225'),
    ('HHH22227700', '20150226');

-- First, assign every record from our data set a sequence number, where the
-- record having the earliest DOS for any MemberID has [Sequence] = 1 and the
-- value of [Sequence] increases by 1 for each subsequent DOS.
with OrderedDataCTE as
(
    select
        S.MemberID,
        S.DOS,
        [Sequence] = row_number() over (partition by S.MemberID order by S.DOS)
    from
        @SampleData S
),

-- Here's the recursive CTE, where the real work is done. Its purpose is to
-- split the initial data set into intervals. I use the term "interval" to mean
-- a set of records that have the same MemberID and the same status.
IntervalCTE as
(
    -- Base case: Any record with [Sequence] = 1 is the first record for its
    -- MemberID, so we know this represents the beginning of a new interval
    -- that will ultimately have the status Initial_1.
    select
        O.MemberID,
        O.DOS,
        O.[Sequence],

        -- IntervalStartDate will be the DOS of the record that defines the
        -- start of the current interval. Since every record in the base case
        -- marks a new interval, this is always the DOS.
        IntervalStartDate = O.DOS,

        -- Given that our final record statuses will either be of the form
        -- Initial_X or Re-Auth_X_Y, AuthNumber will represent the value X. At
        -- the start of the first interval for a new MemberID, the value is
        -- always 1, because every member's first status is Initial_1.
        AuthNumber = 1,

        -- ReAuthNumber will represent the value Y (see comments above), and
        -- will be set to 0 for statuses of the form Initial_X. At the start of
        -- the first interval for a new MemberID, the value is always 0.
        ReAuthNumber = 0
    from
        OrderedDataCTE O
    where
        O.[Sequence] = 1

    union all

    -- Recursive case: For each MemberID we find the record in the original
    -- data set having the next-highest [Sequence] number, and compare it to
    -- the record for that MemberID with the preceding [Sequence] number in
    -- order to figure out its status.
    select
        This.MemberID,
        This.DOS,
        This.[Sequence],

        -- If this record occurs within 30 days of the start of the interval
        -- that includes the preceding record, then this record is part of the
        -- same interval and thus has the same interval start date. Otherwise,
        -- this record marks the beginning of a new interval, which starts on
        -- this record's DOS.
        IntervalStartDate = case when datediff(day, Prev.IntervalStartDate, This.DOS) < 30 then Prev.IntervalStartDate else This.DOS end,

        -- If this record occurs within 45 days of the start of the interval
        -- that includes the preceding record, then its AuthNumber is the same
        -- as the preceding record. Otherwise it increments by 1.
        AuthNumber = case when datediff(day, Prev.IntervalStartDate, This.DOS) < 45 then Prev.AuthNumber else Prev.AuthNumber + 1 end,

        -- If this record occurs within 30 days of the start of the interval
        -- that includes the preceding record, then this record is part of the
        -- same interval and thus has the same ReAuthNumber.
        --
        -- If the above is not true but the record occurs within 45 days of the
        -- start of the interval that includes the preceding record, then the
        -- ReAuthNumber increments by 1.
        --
        -- If neither of the above is true, then we know that the AuthNumber
        -- will have increased (above), so the ReAuthNumber resets to 0.
        ReAuthNumber = 
            case
                when datediff(day, Prev.IntervalStartDate, This.DOS) < 30 then Prev.ReAuthNumber 
                when datediff(day, Prev.IntervalStartDate, This.DOS) < 45 then Prev.ReAuthNumber + 1
                else 0
            end
    from
        IntervalCTE Prev
        inner join OrderedDataCTE This on
            Prev.MemberID = This.MemberID and
            Prev.[Sequence] = This.[Sequence] - 1
)

-- At this point we can trivially construct each record's status from the
-- AuthNumbers and ReAuthNumbers assigned above.
select
    I.MemberID,
    I.DOS,
    [Status] =
        case
            when I.ReAuthNumber = 0 then 'Initial_' + convert(varchar, I.AuthNumber)
            else 'Re-Auth_' + convert(varchar, I.AuthNumber) + '_' + convert(varchar, I.ReAuthNumber)
        end
from
    IntervalCTE I
order by
    I.MemberID,
    I.[Sequence];

Résultats :

MemberID      DOS          Status
------------------------------------------------
HHH00031200   2014-07-17   Initial_1
HHH00031200   2014-07-29   Initial_1
HHH00031200   2014-08-21   Re-Auth_1_1
HHH00031200   2014-08-27   Re-Auth_1_1
HHH22227700   2014-01-16   Initial_1
HHH22227700   2014-02-21   Re-Auth_1_1
HHH22227700   2014-02-25   Re-Auth_1_1
HHH22227700   2014-03-05   Re-Auth_1_1
HHH22227700   2015-01-01   Initial_2
HHH22227700   2015-01-15   Initial_2
HHH22227700   2015-01-20   Initial_2
HHH22227700   2015-02-10   Re-Auth_2_1
HHH22227700   2015-02-12   Re-Auth_2_1
HHH22227700   2015-02-17   Re-Auth_2_1
HHH22227700   2015-02-19   Re-Auth_2_1
HHH22227700   2015-02-25   Re-Auth_2_1
HHH22227700   2015-02-26   Re-Auth_2_1
HHH50000000   2016-05-23   Initial_1
HHH50000000   2016-07-19   Initial_2
HHH50000000   2016-09-13   Initial_3
HHH88844900   2015-09-19   Initial_1
HHH88844900   2015-10-22   Re-Auth_1_1
HHH88844900   2015-11-24   Re-Auth_1_2
HHH88844900   2015-12-10   Re-Auth_1_2

Prograide.com

Prograide est une communauté de développeurs qui cherche à élargir la connaissance de la programmation au-delà de l'anglais.
Pour cela nous avons les plus grands doutes résolus en français et vous pouvez aussi poser vos propres questions ou résoudre celles des autres.

Powered by:

X