3 votes

MySQL: grouping rows when a value falls within a range of values

I have searched everywhere and could not find anything on how to handle my problem. I apologize in advance if this is a stupid question, but I really need help.

I have a series of values that are logged at varying intervals. The data looks like this:

 timeStamp           | RPM 
 2012-05-01 01:02:56 | 802
 2012-05-01 01:03:45 | 845
 2012-05-01 01:04:50 | 825
 2012-05-01 01:05:55 | 810
 2012-05-01 01:07:00 | 1000
 2012-05-01 01:08:03 | 1005
 2012-05-01 01:09:05 | 1145
 2012-05-01 01:10:15 | 1110
 2012-05-01 01:11:20 | 800
 2012-05-01 01:12:22 | 812
 2012-05-01 01:13:20 | 820
 2012-05-01 01:14:20 | 820
 2012-05-01 01:15:20 | 1200

The RPM in the example is engine RPM.

I need the start and end timestamps of each period during which the RPM is between 800 and 900, which is considered engine idle. I would also like to return the start and end time of each non-idle period.

The result I am trying to get would look something like this:

Period    | startTime           | endTime             | duration 
 Idle1    | 2012-05-01 01:02:56 | 2012-05-01 01:05:55 | 179 seconds 
 nonIdle1 | 2012-05-01 01:07:00 | 2012-05-01 01:10:15 | 195 seconds 
 idle2    | 2012-05-01 01:11:20 | 2012-05-01 01:14:20 | 180 seconds 

Thanks in advance for your help.

5 votes

Michael Buen Points 20453

Try this: http://www.sqlfiddle.com/#!2/e9372/1

The advantage of doing this on the database side is that you can use the query not only from PHP, but also from Java, C#, Python, and so on. Doing the work in the database is also fast.

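select 
  -- Build the label from whichever counter matches the period's state.
  if(idle_state = 1, 
       concat('Idle ', idle_count), 
       concat('NonIdle ', non_idle_count) ) as Period,
  startTime, endTime, duration
from
(

  select 

    -- Running counters: one numbers the idle periods, the other the non-idle ones.
    @idle_count := @idle_count + if(idle_state = 1,1,0) as idle_count,
    @non_idle_count := @non_idle_count +if(idle_state = 0,1,0) as non_idle_count,

    state_group, idle_state,
    min(timeStamp) as startTime, max(timeStamp) as endTime,
    timestampdiff(second, min(timeStamp), max(timeStamp)) as duration
  from
  (
    -- Relies on the select expressions being evaluated left to right, row by row,
    -- in timeStamp order; MySQL does not guarantee this for user variables.
    select *,        
      -- 1 when the engine is idling (800 to 900 RPM inclusive), 0 otherwise.
      @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
      -- Start a new group whenever the state differs from the previous row's.
      @state_group := @state_group + 
                      if(@idle_state = @prev_state,0,1) as state_group,
      @prev_state := @idle_state
    from (tbl, (select @state_group := 0 as y) as vars)
    order by tbl.timeStamp
  ) as x
  ,(select @idle_count := 0 as y, @non_idle_count := 0 as z) as vars
  group by state_group, idle_state

) as summary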

Output:

|    PERIOD |                  STARTTIME |                    ENDTIME | DURATION |
|-----------|----------------------------|----------------------------|----------|
|    Idle 1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
| NonIdle 1 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|    Idle 2 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
| NonIdle 2 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |

See the query progression here: http://www.sqlfiddle.com/#!2/e9372/1


How it works:

Five steps.

First, separate idle from non-idle:

select *,
  @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state
from (tbl, (select @state_group := 0 as y) as vars)
order by tbl.timeStamp;

Output:

|                  TIMESTAMP |  RPM | Y | IDLE_STATE |
|----------------------------|------|---|------------|
| May, 01 2012 01:02:56-0700 |  802 | 0 |          1 |
| May, 01 2012 01:03:45-0700 |  845 | 0 |          1 |
| May, 01 2012 01:04:50-0700 |  825 | 0 |          1 |
| May, 01 2012 01:05:55-0700 |  810 | 0 |          1 |
| May, 01 2012 01:07:00-0700 | 1000 | 0 |          0 |
| May, 01 2012 01:08:03-0700 | 1005 | 0 |          0 |
| May, 01 2012 01:09:05-0700 | 1145 | 0 |          0 |
| May, 01 2012 01:10:15-0700 | 1110 | 0 |          0 |
| May, 01 2012 01:11:20-0700 |  800 | 0 |          1 |
| May, 01 2012 01:12:22-0700 |  812 | 0 |          1 |
| May, 01 2012 01:13:20-0700 |  820 | 0 |          1 |
| May, 01 2012 01:14:20-0700 |  820 | 0 |          1 |
| May, 01 2012 01:15:20-0700 | 1200 | 0 |          0 |

Second, assign a group number that increments each time the state changes:

select *,  
  @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
  @state_group := @state_group + 
                  if(@idle_state = @prev_state,0,1) as state_group,
  @prev_state := @idle_state

from (tbl, (select @state_group := 0 as y) as vars)
order by tbl.timeStamp;

Output:

|                  TIMESTAMP |  RPM | Y | IDLE_STATE | STATE_GROUP | @PREV_STATE := @IDLE_STATE |
|----------------------------|------|---|------------|-------------|----------------------------|
| May, 01 2012 01:02:56-0700 |  802 | 0 |          1 |           1 |                          1 |
| May, 01 2012 01:03:45-0700 |  845 | 0 |          1 |           1 |                          1 |
| May, 01 2012 01:04:50-0700 |  825 | 0 |          1 |           1 |                          1 |
| May, 01 2012 01:05:55-0700 |  810 | 0 |          1 |           1 |                          1 |
| May, 01 2012 01:07:00-0700 | 1000 | 0 |          0 |           2 |                          0 |
| May, 01 2012 01:08:03-0700 | 1005 | 0 |          0 |           2 |                          0 |
| May, 01 2012 01:09:05-0700 | 1145 | 0 |          0 |           2 |                          0 |
| May, 01 2012 01:10:15-0700 | 1110 | 0 |          0 |           2 |                          0 |
| May, 01 2012 01:11:20-0700 |  800 | 0 |          1 |           3 |                          1 |
| May, 01 2012 01:12:22-0700 |  812 | 0 |          1 |           3 |                          1 |
| May, 01 2012 01:13:20-0700 |  820 | 0 |          1 |           3 |                          1 |
| May, 01 2012 01:14:20-0700 |  820 | 0 |          1 |           3 |                          1 |
| May, 01 2012 01:15:20-0700 | 1200 | 0 |          0 |           4 |                          0 |
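
To make the variable bookkeeping in this step easier to follow, here is a minimal Python sketch of the same idea, not part of the original answer. The names `READINGS`, `is_idle` and `number_state_groups` are made up for illustration; `prev_state` stands in for `@prev_state`, and starting it at `None` assumes a fresh session in which that variable has never been set.

```python
# Sample readings copied from the question: (timestamp, rpm).
READINGS = [
    ("2012-05-01 01:02:56", 802),
    ("2012-05-01 01:03:45", 845),
    ("2012-05-01 01:04:50", 825),
    ("2012-05-01 01:05:55", 810),
    ("2012-05-01 01:07:00", 1000),
    ("2012-05-01 01:08:03", 1005),
    ("2012-05-01 01:09:05", 1145),
    ("2012-05-01 01:10:15", 1110),
    ("2012-05-01 01:11:20", 800),
    ("2012-05-01 01:12:22", 812),
    ("2012-05-01 01:13:20", 820),
    ("2012-05-01 01:14:20", 820),
    ("2012-05-01 01:15:20", 1200),
]


def is_idle(rpm):
    """Same test as `rpm between 800 and 900` (inclusive on both ends)."""
    return 800 <= rpm <= 900


def number_state_groups(readings):
    """Return (timestamp, rpm, idle_state, state_group) rows in time order."""
    rows = []
    state_group = 0
    prev_state = None  # assumption: behaves like an unset @prev_state
    for ts, rpm in sorted(readings):  # ISO timestamps sort chronologically
        idle_state = is_idle(rpm)
        if idle_state != prev_state:  # the state changed, so open a new group
            state_group += 1
        rows.append((ts, rpm, idle_state, state_group))
        prev_state = idle_state
    return rows


groups = [row[3] for row in number_state_groups(READINGS)]
```

With the sample data, `groups` follows the STATE_GROUP column above: four rows in group 1, four in group 2, four in group 3, and the last row alone in group 4.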

Third, group the rows and compute each group's duration:

select 
  state_group, idle_state,
  min(timeStamp) as startTime, max(timeStamp) as endTime,
  timestampdiff(second, min(timeStamp), max(timeStamp)) as duration
from
(
  select *,    
    @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
    @state_group := @state_group + 
                    if(@idle_state = @prev_state,0,1) as state_group,
    @prev_state := @idle_state
  from (tbl, (select @state_group := 0 as y) as vars)
  order by tbl.timeStamp
) as x
group by state_group, idle_state;

Output:

| STATE_GROUP | IDLE_STATE |                  STARTTIME |                    ENDTIME | DURATION |
|-------------|------------|----------------------------|----------------------------|----------|
|           1 |          1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
|           2 |          0 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|           3 |          1 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
|           4 |          0 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |
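
The grouping step can also be cross-checked outside the database. The sketch below is an illustration only, not the answer's method: it uses Python's `itertools.groupby`, which groups consecutive items sharing a key, in place of the `state_group` column. `READINGS`, `parse` and `summarize_runs` are hypothetical names.

```python
from datetime import datetime
from itertools import groupby

# Sample readings copied from the question: (timestamp, rpm).
READINGS = [
    ("2012-05-01 01:02:56", 802),
    ("2012-05-01 01:03:45", 845),
    ("2012-05-01 01:04:50", 825),
    ("2012-05-01 01:05:55", 810),
    ("2012-05-01 01:07:00", 1000),
    ("2012-05-01 01:08:03", 1005),
    ("2012-05-01 01:09:05", 1145),
    ("2012-05-01 01:10:15", 1110),
    ("2012-05-01 01:11:20", 800),
    ("2012-05-01 01:12:22", 812),
    ("2012-05-01 01:13:20", 820),
    ("2012-05-01 01:14:20", 820),
    ("2012-05-01 01:15:20", 1200),
]


def parse(ts):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def summarize_runs(readings):
    """Collapse each run of consecutive same-state readings into one row.

    Returns (idle_state, start, end, duration_in_seconds) tuples.
    """
    ordered = sorted(readings)  # ISO timestamps sort chronologically
    summary = []
    for idle_state, run in groupby(ordered, key=lambda r: 800 <= r[1] <= 900):
        run = list(run)
        start, end = run[0][0], run[-1][0]  # min and max timestamp of the run
        seconds = int((parse(end) - parse(start)).total_seconds())
        summary.append((idle_state, start, end, seconds))
    return summary


summary = summarize_runs(READINGS)
durations = [row[3] for row in summary]
```

For the sample data, `durations` is `[179, 195, 180, 0]`, matching the DURATION column above. The last period contains a single reading, which is why its duration is 0.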

Fourth, number the idle and non-idle periods with two running counters:

select 

  @idle_count := @idle_count + if(idle_state = 1,1,0) as idle_count,
  @non_idle_count := @non_idle_count + if(idle_state = 0,1,0) as non_idle_count,

  state_group, idle_state,
  min(timeStamp) as startTime, max(timeStamp) as endTime,
  timestampdiff(second, min(timeStamp), max(timeStamp)) as duration
from
(
  select *,        
    @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
    @state_group := @state_group + 
                    if(@idle_state = @prev_state,0,1) as state_group,
    @prev_state := @idle_state
  from (tbl, (select @state_group := 0 as y) as vars)
  order by tbl.timeStamp
) as x
,(select @idle_count := 0 as y, @non_idle_count := 0 as z) as vars
group by state_group, idle_state;

Output:

| IDLE_COUNT | NON_IDLE_COUNT | STATE_GROUP | IDLE_STATE |                  STARTTIME |                    ENDTIME | DURATION |
|------------|----------------|-------------|------------|----------------------------|----------------------------|----------|
|          1 |              0 |           1 |          1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
|          1 |              1 |           2 |          0 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|          2 |              1 |           3 |          1 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
|          2 |              2 |           4 |          0 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |
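
As a rough illustration of how the two counters turn into period labels, here is a small Python sketch. The function name `label_periods` is invented for this example; it takes one boolean per period, in time order, and mirrors `@idle_count` and `@non_idle_count`.

```python
def label_periods(idle_states):
    """Label each period using a separate counter per state."""
    idle_count = 0
    non_idle_count = 0
    labels = []
    for idle in idle_states:
        if idle:
            idle_count += 1
            labels.append(f"Idle {idle_count}")
        else:
            non_idle_count += 1
            labels.append(f"NonIdle {non_idle_count}")
    return labels


# The four periods of the sample data: idle, non-idle, idle, non-idle.
labels = label_periods([True, False, True, False])
```

Because each state has its own counter, the numbering stays sensible even when the data starts with a non-idle period: `label_periods([False, True, False])` gives `NonIdle 1`, `Idle 1`, `NonIdle 2`.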

Finally, drop the staging columns and build the period label:

select 
  if(idle_state = 1, 
       concat('Idle ', idle_count), 
       concat('NonIdle ', non_idle_count) ) as Period,
  startTime, endTime, duration
from
(

  select 

    @idle_count := @idle_count + if(idle_state = 1,1,0) as idle_count,
    @non_idle_count := @non_idle_count +if(idle_state = 0,1,0) as non_idle_count,

    state_group, idle_state,
    min(timeStamp) as startTime, max(timeStamp) as endTime,
    timestampdiff(second, min(timeStamp), max(timeStamp)) as duration
  from
  (
    select *,        
      @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
      @state_group := @state_group + 
                      if(@idle_state = @prev_state,0,1) as state_group,
      @prev_state := @idle_state
    from (tbl, (select @state_group := 0 as y) as vars)
    order by tbl.timeStamp
  ) as x
  ,(select @idle_count := 0 as y, @non_idle_count := 0 as z) as vars
  group by state_group, idle_state

) as summary

Output:

|    PERIOD |                  STARTTIME |                    ENDTIME | DURATION |
|-----------|----------------------------|----------------------------|----------|
|    Idle 1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
| NonIdle 1 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|    Idle 2 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
| NonIdle 2 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |

See the query progression here: http://www.sqlfiddle.com/#!2/e9372/1


UPDATE

The query can be shortened: http://www.sqlfiddle.com/#!2/418cb/1

Notice that the period numbers come in pairs (idle, non-idle, idle, non-idle, and so on), so a single counter is enough:

select 

  case when idle_state then
     concat('Idle ', @rn := @rn + 1) 
  else
     concat('Non-idle ', @rn )
  end as Period,

  min(timeStamp) as startTime, max(timeStamp) as endTime,

  timestampdiff(second, min(timeStamp), max(timeStamp)) as duration

from
(
  select *,        
  @idle_state := if(rpm between 800 and 900, 1, 0) as idle_state,
  @state_group := @state_group + if(@idle_state = @prev_state,0,1) as state_group,
  @prev_state := @idle_state
  from (tbl, (select @state_group := 0 as y) as vars)
  order by tbl.timeStamp
) as x,
(select @rn := 0) as rx
group by state_group, idle_state

Output:

|     PERIOD |                  STARTTIME |                    ENDTIME | DURATION |
|------------|----------------------------|----------------------------|----------|
|     Idle 1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
| Non-idle 1 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|     Idle 2 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
| Non-idle 2 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |
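
One thing to keep in mind with the single counter: it is only bumped on idle periods, so the pairing assumes the data starts with an idle period. The Python sketch below models that labelling logic to show the effect; `label_with_single_counter` is a hypothetical name, and the sketch has not been checked against MySQL itself.

```python
def label_with_single_counter(idle_states):
    """Label periods with one counter that only advances on idle periods."""
    rn = 0  # plays the role of @rn
    labels = []
    for idle in idle_states:
        if idle:
            rn += 1
            labels.append(f"Idle {rn}")
        else:
            # Non-idle periods reuse the number of the idle period before them.
            labels.append(f"Non-idle {rn}")
    return labels


starts_idle = label_with_single_counter([True, False, True, False])
starts_non_idle = label_with_single_counter([False, True, False])
```

`starts_idle` reproduces the labels in the table above. `starts_non_idle` comes out as `Non-idle 0`, `Idle 1`, `Non-idle 1`, so if the log can begin while the engine is already above idle, the two-counter version is probably the safer choice.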

1 vote

Michael Buen Points 20453

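This approach works on any RDBMS that supports window functions. http://www.sqlfiddle.com/#!1/320e4/1

The syntax for extracting the time difference varies by RDBMS, though; the `extract(epoch from ...)` form used below is PostgreSQL syntax.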

with a as
(
  select *, (rpm between 800 and 900) as idle_state,
        case when 
            (rpm between 800 and 900) =
            (lag(rpm) over(order by timestamp) between 800 and 900) then 0
        else 
            1
        end as detect_leader
  from tbl   
)
,grp as
(
  select *, sum(detect_leader) over(order by timeStamp) as state_group
  from a
)
,rn as 
(
  select state_group, idle_state,

    min(timeStamp) as startTime, max(timeStamp) as endTime,  
    extract(epoch from ( max(timeStamp) - min(timeStamp) ) ) as duration

  from grp
  group by state_group, idle_state
  order by state_group
)
select 
   case when idle_state then 'Idle ' else 'Non-idle ' end 
      || (row_number() over(order by state_group) + 1) / 2  as Period,

   rn.startTime, rn.endTime, rn.duration
from rn;

Output:

|     PERIOD |                  STARTTIME |                    ENDTIME | DURATION |
|------------|----------------------------|----------------------------|----------|
|     Idle 1 | May, 01 2012 01:02:56-0700 | May, 01 2012 01:05:55-0700 |      179 |
| Non-idle 1 | May, 01 2012 01:07:00-0700 | May, 01 2012 01:10:15-0700 |      195 |
|     Idle 2 | May, 01 2012 01:11:20-0700 | May, 01 2012 01:14:20-0700 |      180 |
| Non-idle 2 | May, 01 2012 01:15:20-0700 | May, 01 2012 01:15:20-0700 |        0 |

See the query progression here: http://www.sqlfiddle.com/#!1/320e4/1
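
For readers less used to window functions, the two building blocks of this query (`lag` to spot the first row of each run, then a running `sum` to number the runs) can be sketched in Python. This is an illustration under assumptions: `RPMS`, `leader_flags`, `state_groups` and `period_number` are invented names, and the list is taken to be already sorted by timestamp.

```python
from itertools import accumulate

# RPM values from the question, already in timestamp order.
RPMS = [802, 845, 825, 810, 1000, 1005, 1145, 1110, 800, 812, 820, 820, 1200]


def leader_flags(rpms):
    """Return 1 where a row's state differs from the previous row's, else 0."""
    states = [800 <= rpm <= 900 for rpm in rpms]
    previous = [None] + states[:-1]  # what lag() yields; None on the first row
    return [0 if state == prev else 1 for state, prev in zip(states, previous)]


def state_groups(rpms):
    """Running total of the leader flags, like sum(...) over (order by ...)."""
    return list(accumulate(leader_flags(rpms)))


def period_number(row_number):
    """Integer-divide so that rows 1, 2 map to 1 and rows 3, 4 map to 2."""
    return (row_number + 1) // 2


flags = leader_flags(RPMS)
groups = state_groups(RPMS)
numbers = [period_number(n) for n in range(1, 5)]
```

Here `flags` marks rows 1, 5, 9 and 13 as leaders, `groups` numbers the runs 1 to 4, and `numbers` is `[1, 1, 2, 2]`. As with the single-counter MySQL variant, the `(row_number + 1) / 2` numbering pairs each non-idle period with the idle period before it, so it assumes the data starts idle.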
