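First version:
http://blog.itpub.net/29254281/viewspace-2157111/
Second version:
http://blog.itpub.net/29254281/viewspace-2157209/
The second version takes about 33 seconds.
With a small change on top of it, performance improves by roughly a third.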
select query_time, d, max(ts) ts from (
    select t2.query_time, ts, rn, round(rn/total,10) percent,
        case
            when 0.71>=round(rn/total,10) then 0.71
            when 0.81>=round(rn/total,10) then 0.81
            when 0.91>=round(rn/total,10) then 0.91
        end d
    from (
        select query_time, ts,
            case when @gid=query_time then @rn:=@rn+1 when @gid:=query_time then @rn:=1 end rn
        from (
            select * from t ,(select @gid:='',@rn:=0) vars order by query_time,ts
        ) t1
    ) t2 inner join (
        select query_time,count(*) total from t group by query_time
    ) t3 on(t2.query_time=t3.query_time)
    where round(rn/total,10)>=0.71
) t6
where d is not null
group by query_time,d
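The added line is:
where round(rn/total,10)>=0.71
That is, the rows are first filtered by the smallest percentile that is defined, and only then passed to group by.
With this change the query time drops to as low as 20.531 s.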
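To make the filter-then-group step easier to follow, here is a minimal Python sketch of the same logic for a single query_time group. It is an illustration only, not the author's code: the function name `bucketed_max` and the `prefilter` flag are hypothetical, and `Decimal` is used on the assumption that MySQL evaluates literals such as 0.71 as exact decimals.

```python
from decimal import Decimal, ROUND_HALF_UP

# The three percentile points used in the SQL, smallest first.
THRESHOLDS = [Decimal("0.71"), Decimal("0.81"), Decimal("0.91")]


def bucketed_max(ts_values, prefilter=True):
    """Mirror the CASE / WHERE / GROUP BY logic for one query_time group.

    Returns {d: max(ts)} for every bucket d that received at least one row.
    """
    ordered = sorted(ts_values)
    total = len(ordered)
    result = {}
    for rn, ts in enumerate(ordered, start=1):
        # round(rn/total, 10)
        percent = (Decimal(rn) / Decimal(total)).quantize(
            Decimal("1e-10"), rounding=ROUND_HALF_UP
        )
        # where round(rn/total,10) >= 0.71  (the added filter)
        if prefilter and percent < THRESHOLDS[0]:
            continue
        # case when 0.71 >= percent then 0.71 ... end d
        d = next((t for t in THRESHOLDS if t >= percent), None)
        # where d is not null
        if d is None:
            continue
        result[d] = max(result.get(d, ts), ts)
    return result


hundred = bucketed_max(range(1, 101))
ten_filtered = bucketed_max(range(1, 11), prefilter=True)
ten_unfiltered = bucketed_max(range(1, 11), prefilter=False)
```

With 100 rows the filtered and unfiltered variants agree, because row 71 has rn/total equal to exactly 0.71. With 10 rows no row lands exactly on 0.71, so in this sketch the filtered variant returns no 0.71 bucket while the unfiltered one does. It may be worth checking whether the same holds for the real data before relying on the filter.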
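Of course, this SQL can still be improved further.
The position of a given percentile can be computed with the following formula:
loc = 1 + (n-1)*p, where n is the number of elements and p is the percentile point. loc always lies between 1 and n.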
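As a quick worked example of the formula, the Python sketch below computes the position for the three percentile points used in this post. The helper name `percentile_position` is hypothetical. The result is floored to a whole row number, as the optimized SQL does with floor(...), and `Decimal` is used to avoid binary floating-point surprises.

```python
from decimal import Decimal
import math


def percentile_position(n, p):
    """Return floor(1 + (n - 1) * p), a 1-based row position between 1 and n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    p = Decimal(str(p))
    if not (0 <= p <= 1):
        raise ValueError("p must be between 0 and 1")
    return math.floor(1 + (n - 1) * p)


# For n = 10: 1 + 9*0.71 = 7.39, 1 + 9*0.81 = 8.29, 1 + 9*0.91 = 9.19
positions = [percentile_position(10, p) for p in ("0.71", "0.81", "0.91")]
print(positions)  # [7, 8, 9]
```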
The SQL can then be optimized as follows:
select t5.query_time, t5.ts, t2.v from (
    select query_time, total, v, floor(1+(total-1)*v) rn
    from (
        select query_time,count(*) total from t group by query_time
    ) t3, (select 0.71 v,1 seq union all select 0.81,2 union all select 0.91,3) t4
) t2 inner join (
    select
        query_time,
        case when @gid=query_time then @rn:=@rn+1 when @gid:=query_time then @rn:=1 end rn,
        ts
    from (
        select * from t ,(select @gid:='',@rn:=0) vars order by query_time,ts
    ) t1
) t5 on (t2.query_time=t5.query_time and t2.rn=t5.rn )
Besides making the SQL itself simpler, this also brings the query time down to about 15 seconds.
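One way to sanity-check the rank-join query on a small sample is to reproduce it outside the database. The Python sketch below is such a cross-check under stated assumptions: `percentiles_by_group` is a hypothetical helper, rows are plain (query_time, ts) tuples, and the output columns follow the select list (query_time, ts, v).

```python
from collections import defaultdict
from decimal import Decimal
import math


def percentiles_by_group(rows, points=("0.71", "0.81", "0.91")):
    """For each query_time, pick the ts at row floor(1 + (total - 1) * v)."""
    groups = defaultdict(list)
    for query_time, ts in rows:
        groups[query_time].append(ts)

    out = []
    for query_time in sorted(groups):
        ordered = sorted(groups[query_time])  # order by query_time, ts
        total = len(ordered)                  # count(*) total
        for v in points:
            rn = math.floor(1 + (total - 1) * Decimal(v))
            out.append((query_time, ordered[rn - 1], v))  # rn is 1-based
    return out


# Group "a" has ts 1..10, group "b" has a single row.
rows = [("a", ts) for ts in range(1, 11)] + [("b", 5)]
result = percentiles_by_group(rows)
for line in result:
    print(line)
```

Each group is sorted and indexed directly instead of being numbered row by row and then joined, which is the same idea as joining t2.rn to t5.rn in the SQL.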