排名

用户解题统计

过去一年提交了

勋章 ①金银铜:在竞赛中获得第一二三名;②好习惯:自然月10天提交;③里程碑:解决1/2/5/10/20/50/100/200题;④每周打卡挑战:完成每周5题,每年1月1日清零。

收藏

收藏日期 题目名称 解决状态
2025-05-23 夜深与专车比例的相关系数  已解决
2025-05-23 晚时段专车比例  已解决
2025-05-23 找出所有类别组合的最热门路线  已解决
2025-05-23 查询所有起点和终点都属于餐饮类别的行程  已解决
2025-05-19 滴滴面试真题(4)未完成订单在第二天继续呼叫的比例  已解决
2025-05-19 滴滴面试真题(3)UTC转化后的本地时间呼叫高峰期  已解决
2025-05-18 会员与非会员的日均观看视频数量  已解决
2025-05-18 每个视频类型的T+3留存率  已解决
2025-05-15 钢铁直男的观看记录  已解决
2025-05-13 哔哩哔哩面试真题(2)计算春节周会员收入  已解决
2025-05-12 各地区包含“中国”或“中”字开头的公司数量及比例  已解决
2025-05-09 小丑竟是我自己  已解决
2025-05-09 海王:1小时发3个红包  已解决
2025-05-09 计算每个用户的受欢迎程度  已解决
2025-05-09 计算每个用户的受欢迎程度  已解决
2025-05-08 统计各省红包金额的总金额、平均数、中位数、众数、标准差以及75分位数  已解决
2025-05-06 红包金额中位数  已解决
2025-05-06 比较男女收发红包的差异  已解决
2025-05-06 餐厅的客单价增长趋势  已解决
2025-05-06 得物面试真题(1)每周iPhone用户和非iPhone用户  已解决
2025-05-06 整体的点击率  已解决
2025-05-06 互相发过红包的好友关系对  已解决
2025-05-04 各金额区间红包拒收率  已解决
2025-05-04 连续登录3天及以上  已解决
2025-05-04 抖音面试真题(4)T+1月留存  已解决
2025-05-04 抖音面试真题(5)新用户的T+1月留存  已解决
2025-05-04 抖音面试真题(3)一次性计算T+3、T+7、T+14留存率(2)  已解决
2025-05-04 抖音面试真题(1)T+1日留存率  已解决
2025-05-04 10月1日后再也没活跃过的用户  已解决
2025-05-04 全量用户标签表  已解决
2025-05-03 德州扑克起手牌-同花两高张  已解决
2025-05-03 从商品角度统计收藏到购买的转化率  未解决
2025-04-29 给商品打四类标签(列)  已解决
2025-04-29 给商品打四类标签(行)  已解决
2025-04-28 对各品牌购买贡献度最高的三个关键词  已解决
2025-04-27 深圳气温异常年份  已解决
2025-04-26 只买iPhone的用户  已解决
2025-04-26 好友步数排名-考虑反向好友关系  已解决
2025-04-25 餐厅的用户复购率  已解决
2025-04-22 每日新增用户  已解决
2025-04-16 上月活跃用户数  已解决
2025-02-16 抖音面试真题(2)一次性计算T+1、T+3、T+7、T+14留存率  已解决
2025-02-16 好友步数排名-不考虑反向好友关系  已解决
2025-02-11 数学成绩分段统计(1)  已解决
2025-02-11 小结-行转列,展开学生成绩(1)  已解决
2025-02-11 HAVING-每次成绩都不低于80分的学生  已解决
2025-02-10 优异物理成绩的分布  已解决
2025-02-10 S1年级物理成绩前10名(1)  已解决
2025-02-10 大结局(😊)渣男9月爽翻天,罪证送他去西天  已解决
2025-02-10 小结(2)越花越多是死罪,按月统计Substr  已解决
2025-02-06 窗口函数(5)越来越喜欢召妓,窗口函数用累计(3)  已解决
2025-02-06 窗口函数(4)越来越喜欢召妓,窗口函数用累计(2)  已解决
2025-02-06 窗口函数(3)越来越喜欢召妓,窗口函数用累计(1)  已解决
2025-02-06 窗口函数(2)所有前一和每类前一,继续搞懂排序窗口函数  已解决
2025-02-03 窗口函数(1)年度前三和每月前三,搞懂排序窗口函数  已解决
2025-02-03 时间日期(4)阶段综合-按月统计日花费,一天都不要浪费  已解决

评论笔记

评论日期 题目名称 评论内容 站长评论
2025-05-23 查询所有起点和终点都属于餐饮类别的行程 
我一开始想着用exist,可能是没掌握具体语法,导致没输出出来,后面懂逻辑了;想问一下,不关注代码量,只关注性能的话,exist是不是比join的消耗资源要少很多啊
SELECT 
    t1.*
FROM 
    didi_sht_rcd t1
WHERE 
    EXISTS (
        SELECT 1 
        FROM loc_nm_ctg t2 
        WHERE t2.loc_nm = t1.start_loc AND t2.loc_ctg = '餐饮'
    )
    AND 
    EXISTS (
        SELECT 1 
        FROM loc_nm_ctg t3 
        WHERE t3.loc_nm = t1.end_loc AND t3.loc_ctg = '餐饮'
    )
牛啊。这题居然能想到用exists来解答。恭喜能力又上升了一个层次。三个原因决定了这题用exists也合适。​​
1、只需判断存在性,无需返回关联表字段​​
目标是根据地址分类过滤行程,不需要获取 loc_nm_ctg 表中的其他字段,EXISTS 更符合语义。
​​2、潜在性能优势​​
如果 loc_nm_ctg.loc_nm 有索引,EXISTS 在找到第一个匹配项后立即停止扫描,效率更高。
JOIN 需要两次表连接(起点和终点各一次),可能生成中间结果集,增加计算开销。
​​3、避免重复数据​​
JOIN 可能因多对多关系产生重复行(如一个地址名对应多个分类),需额外使用 DISTINCT;而 EXISTS 天然去重。
2025-05-18 会员与非会员的日均观看视频数量 
我现在对这个日均观看视频数量的计算逻辑还是不太懂,第一个是算总的观看次数/天数,第二种事先算每天的平均,然后在算总平均
这两个有区别。所以题干标注了,是使用 COUNT 统计每个用户每天的观看视频数量,再使用 AVG 计算日均观看视频数量。

类似的评价计算,都需要有清晰的定义。
2025-05-18 多类别观看指数计算 
我这个表格跟输出示例是一样的呀,答案里面没有保留2位小数
with total_typ_cnt as
(select
	v_typ
    ,count(distinct usr_id) as total_user
from
	bilibili_t20 as bt2
    join
    bilibili_t3 as bt3
    on
    bt2.v_id=bt3.v_id
group by
	1),
usr_see_typ as (
	select
  		usr_id
  		,count(distinct v_typ) as typ_cnt
	from
  		bilibili_t20 as bt2
  		join
  		bilibili_t3 as bt3
  		on 
  		bt2.v_id=bt3.v_id
  	group by
  		1
)
select
    t.v_typ,
    count(distinct case when typ_cnt >= 3 then u.usr_id end) as multi_category_viewers
    , t.total_user as total_viewers,
    round(count(distinct case when typ_cnt >= 3 then u.usr_id end)/ t.total_user*100,2) as mcv_index
from
    bilibili_t20 as bt2
    join bilibili_t3 as bt3 on bt2.v_id = bt3.v_id
    join usr_see_typ as u on bt2.usr_id = u.usr_id
    join total_typ_cnt as t on bt3.v_typ = t.v_typ
group by
    t.v_typ, t.total_user
order by
	4 desc
嗯,以你的为准。
2025-05-15 哔哩哔哩面试真题(6)全年会员收入分类 
老师我觉得按照时间来划分梯队比较符合业务逻辑,但是最后计算出来的数字和答案不一样,我也不知道为什么
with userpurchase as
(select
	user_id
    ,begin_date
    ,end_date
    ,pay_amount
    ,datediff(end_date,begin_date)+1 as duration
    ,case when datediff(end_date,begin_date)+1 between 20 and 32 then '月卡'
    	  when datediff(end_date,begin_date)+1 between 80 and 99 then '季卡'
          when datediff(end_date,begin_date)+1 between 170 and 190 then '半年卡'
          when datediff(end_date,begin_date)+1 >200 then '年卡'
          end  as typ
from
	bilibili_m1),
time_2021 as (
	select
  		m_date
  		,y_m
  	from
  		bilibili_m2
  	where
  		year(m_date)=2021
),
typ_income as
(select
	user_id
    ,u.begin_date
    ,u.end_date
    ,pay_amount
    ,duration
    ,pay_amount/duration as daily_income
    ,typ
    ,t.m_date
    ,y_m
from
	time_2021 as t
    join
    userpurchase as u
    on
    t.m_date between u.begin_date and u.end_date)
select
	round(sum(case when typ='月卡' then daily_income end),2) as yue
    ,round(sum(case when typ='季卡' then daily_income end),2) as ji
    ,round(sum(case when typ='半年卡' then daily_income end),2) as bannian
    ,round(sum(case when typ='年卡' then daily_income end),2) as nian
    ,round(sum(daily_income),2) as total
from
	typ_income
答案输出结果:yearly_card	half_yearly_card	quarterly_card	monthly_card	total_income
3168.8370	1558.5947	506.6324	514.8078	5748.87 

你的输出结果:
yue	ji	bannian	nian	total
514.78	506.64	1558.59	3169.18	5749.19。

你看下Refrain的评论,是小数点的问题。
另外,手动点赞,你用天数来定义月季年会员,很棒!
2025-05-12 各地区包含“中国”或“中”字开头的公司数量及比例 
select
	area
	,count(*) as total_companies
    ,count(case when (name like '%中国%' or name like '中%') then ts_code else null end) as chinese_named_companies
    ,round(count(case when (name like '%中国%' or name like '中%') then ts_code else null end )/count(*),3) as proportion
from
	stock_info
group by
	1
order by
	3 desc
limit
	5
这个地方的计算逻辑,我有点不太明白,代码就是计数统计那边有区别,结果差的很大
select
	area
	,count(*) as total_companies
    ,count(case when (name like '%中国%' or name like '中%') then ts_code else null end) as chinese_named_companies
    ,round(count(case when (name like '%中国%' or name like '中%') then ts_code else null end )/count(*),3) as proportion
from
	stock_info
group by
	1
order by
	4 desc ===按比例排序哦,你写成第3列了。
limit
	5
2025-05-09 按交易所统计软件服务、银行上市公司数量 
我想问一下,这个地方题目说,要有六行,但是结果只有五行。如果完成题目六行的表,我是不是要新链接一个表,包含六行数据,然后临时表 左连接 参考答案里面的表,空值替换为0
不可能有6行了。因为不存在bj上市的银行。
2025-05-06 互相发过红包的好友关系对 
select
	t2.snd_usr_id as user_a
    ,t2.rcv_usr_id as user_b
from
	tx_red_pkt_rcd as t1
    left join
    tx_red_pkt_rcd as t2
    on
    t2.snd_usr_id=t1.rcv_usr_id
where
	t2.rcv_usr_id=t1.snd_usr_id
    and
    DATE(t1.snd_datetime) = '2021-02-13'
    AND DATE(t2.snd_datetime) = '2021-02-13'
    and
    	t1.snd_usr_id<t1.rcv_usr_id
group by
	1,2
输出结果不对
你代码写错了,and t1.snd_usr_id 这是个啥条件??
2025-05-04 计算红包接收率 
哦哦,我知道了
啥也没说
2025-05-04 计算红包接收率 
这个代码应该没有错呀

select
	count(case when date(snd_datetime)='2021-02-13' then snd_usr_id end ) as
total_sent
    ,count(case when date(rcv_datetime)='1900-01-01' then snd_usr_id end ) as received_count
    ,100-round(count(case when date(rcv_datetime)='1900-01-01' then snd_usr_id end )/nullif(count(case when date(snd_datetime)='2021-02-13' then snd_usr_id end ),0)*100,2) as receive_rate
from
	tx_red_pkt_rcd
啥也没说
2025-05-04 连续登录3天及以上 
不知道为什么错
WITH user_login_days AS (
select
	usr_id
    ,date(login_time) as login_date
from
	user_login_log
where
	login_time >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)),
rnk_user as(
	select
  		usr_id
  		,login_date
  		,row_number()over(partition by usr_id order by login_date) as rnk 
	from
  		user_login_days
),
rnk_sub as(
	select
  		usr_id
  		,login_date
  		,rnk
  		,date_sub(login_date,interval rnk day) as date_rnk
  	from
		rnk_user
),
all_usr as(
select
	usr_id
    ,date_rnk
    ,count(*) as 3_day_active
from
	rnk_sub
group by
	1,2
having
	3_day_active>=3)
select
	usr_id
    ,date_rnk
    ,date_add(date_rnk,interval 3_day_active day) as end_date
    ,3_day_active
from
	all_usr
一天登陆了多次的该怎么办呢?也要去重的呀
2025-05-03 从商品角度统计收藏到购买的转化率 
	select
		f.mch_id
        ,count(distinct	f.cust_uid)
        ,count(distinct p.cust_uid)
        ,round(count(distinct	p.cust_uid)/count(distinct f.cust_uid)*100,2)
  	from
  		xhs_fav_rcd as f
  		left join
  		xhs_pchs_rcd as p
  		on	p.cust_uid=f.cust_uid
  		and p.mch_id=f.mch_id
		and datediff(f.fav_tm,p.pchs_tm)<0
 	group by	
    	1
    order by
    	4 desc
还是有点迷糊,这一块,我的第一列应该没问题,但是第二列的问题不知道怎么解决
datediff(f.fav_tm,p.pchs_tm)<0,这一句,同一天9点收藏的, 12点下单的。请问是否小于0?
2025-05-03 先收藏后购买的用户数 
select
	count(distinct f.cust_uid) as 
conversion_user_count
from
	xhs_fav_rcd as f
    left join
    xhs_pchs_rcd as p
    on f.cust_uid=p.cust_uid
    and f.mch_id=p.mch_id
    and datediff(f.fav_tm,p.pchs_tm)<0
这个地方如果在表连接里面使用时间判断,左表有数据,右表没有数据,但是最后这条记录还是会保存下来对吧
是的。左链接跟链接条件没有关系哦。左链接:左有就有
2025-04-28 整体的点击率 
这个地方的实际计算公式我可不可以理解成为:一段时间内,(总共浏览的几个page_id/总的浏览次数)
不是哦。这里的点击率是指搜索到点击的点击率。比如你在京东搜索了【男士护肤】,出现了可以无限下滑的商品列表,你点击了【巴黎欧莱雅】、【馥佩】2个商品列表,那这次转化率就是200%。这里的转化率是可以大于1的。
2025-04-27 计算每天的有效订单完成率 
我没有懂这里算每天的总订单量,为什么还是要用cty这个字段,那跟上一题有什么区别。如果不用cty,用日期进行分组,你获得的数字又不对
已修正,你自己写出答案了么
2025-04-26 得物面试真题(1)每周iPhone用户和非iPhone用户 
就是我刚刚的代码输出不了202350这一周的数据
但我又不知道为什么
啥也没说
2025-04-26 得物面试真题(1)每周iPhone用户和非iPhone用户 


select
	concat(year(purchase_time),lpad(week(purchase_time),2,'0')) as week_number
    ,case when product_type='iPhone' then product_type
    		else 'Not iPhone' end as category
    ,count(distinct user_id) as user_count
from
	apple_pchs_rcd
group by
	1,2
order by
	1,2
跟答案只有输出星期那一列有区别,但为什么数量差别这么大
yearweek\week是不同的函数哦。通过取第一天加深对该函数的认识,2023-01-01、2024-01-01、2025-01-01会有不同的答案,去试试,用两个函数都试一下
2025-04-25 餐厅的客单价增长趋势 
这个地方为什么又是avg,而不是sum(trx_amt)/count(distinct cust_id)了
客单价是指每个用户在某个时间段内的【平均每笔消费金额】。我这里标红了呀。客单价在不同口径下会有不同的定义,分母可以是用户数或订单数哦
2025-04-22 每日新增用户 
不懂我的代码为什么运行不出来答案那个结果
select
	date(login_time)
    ,count(*) as cnt
from
	user_login_log
where
	date(login_time) between '2024-09-01' and '2024-09-30'
    and (usr_id,login_time)in
	(select
		usr_id
    	,min(login_time)
	from
		user_login_log
	group by
		1)
group by
	1
order by
	1
已修改。
2025-04-22 通勤、午休、临睡个时间段活跃人数分布 
我知道要输出各个时间段的人,而不是登录用户数量,但是我有个疑问,通勤是两个时间段,那么早上有通勤记录,跟晚上有通勤记录,应该算作两次吧(单纯从理解这块的入手的话)
with login_typ  as (
SELECT
    usr_id,
    CASE
        WHEN TIME(login_time) BETWEEN '07:30:00' AND '09:30:00' OR TIME(login_time) BETWEEN '18:30:00' AND '20:30:00'
            THEN 'commute'
        WHEN TIME(login_time) BETWEEN '11:30:00' AND '14:00:00'
            THEN 'break'
        WHEN TIME(login_time) BETWEEN '22:30:00' AND '23:59:59' OR TIME(login_time) BETWEEN '00:00:00' AND '01:00:00'
            THEN 'bedtime'
        ELSE NULL
    END AS login_typ
FROM
    user_login_log
WHERE
    EXTRACT(YEAR FROM login_time) = EXTRACT(YEAR FROM CURRENT_DATE)
    AND EXTRACT(MONTH FROM login_time) = EXTRACT(MONTH FROM CURRENT_DATE - INTERVAL 1 MONTH)    )
 
select
	sum(case when	login_typ='commute' then 1 else 0 end ) as commute
    ,sum(case when	login_typ='break' then 1 else 0 end ) as lunch_break
    ,sum(case when	login_typ='bedtime' then 1 else 0 end ) as bedtime
from
	login_typ
感觉你还是没理解透。月活、日活代指活跃用户数,要count distinct user_id;你代码写的是次数。 饿了么月活用户1000万,下单5000万单,一个是count distinct,一个是count 1,注意分辨哦
2025-04-15 德州扑克起手牌-同花 
这里的组合就是card1跟card2牌型不一样也是一种结果,所以前两个输出的都要除以2对吧
是的,打牌只看组合,不看排列。给你先发A再发K、跟先发K再发A是一样的。
2025-03-25 每年地产与软件服务上市公司对比 
明白了
啥也没说
2025-03-19 每年地产与软件服务上市公司对比 
老师我想知道怎么才能不漏掉两类股票都没上市的年份。难道是我还需要自己生成一个新表表述2000-2024年吗
with fillter_stock as (
select
	ts_code
    ,symbol
    ,name
   	,area
    ,case when
    	industry='软件服务' then industry
        else '地产'
        end as industry
    ,list_date
    
from
	stock_info
where
	(year(list_date) between '2000' and '2024')
    and industry rlike '地产|软件服务'),
done_stock as(
	select
  		year(list_date) as Y
  		,industry
  		,count(industry) as stock_cnt
  	from
  		fillter_stock
  	group by
  		year(list_date)
  		,industry
)
select
	Y
    ,max(case when	industry='地产' then stock_cnt else 0 end) as '地产'
    ,max(case when	industry='软件服务' then stock_cnt else 0 end) as '软件服务'
from
	done_stock
group by
	1
order by
	1
以2005年为例,两类都是0。因为你先过滤了,所以你的表里不会出现2005,而答案的解法是不过滤,用case when去计数。如果你就是想过滤、又要求出现2005,那只能把
完整年份表放左边。
2025-02-14 一元一次函数形成的等腰三角形 
c如果是0了,那就是一个点啊。
select
	*
from
	numbers_for_fun
where
	a=0
    and c!=0
    and (b=1 or b=-1)
order by	
	id
啥也没说
2025-02-13 窗口函数(4)越来越喜欢召妓,窗口函数用累计(2) 
我其实不是特别理解有两个点:
1、我这个代码输出的结果里面,第一行是null。但是我临时表1是输出的不同的2023月份,第二章合并表用的inner join,第三个临时表只是新增一列,应该不会出现null值
2、我第一个表中distinct 月份,然后第三章表进行左连接,4月份还掉了?按理来说date_table应该是完整的月份呀,不会出现这么大的误差的。
with date_summary as (
	select	
  		distinct date_format(date_value,'%Y-%m') as trx_mon
  	from
  		date_table
  	where	
  		year(date_value)=2023
),
monthly_sum_trx_amt as (
	select	
  		substr(trx_time,1,7) as trx_mon
  		,sum(trx_amt) as month_trx_amt
  	from
  		cmb_usr_trx_rcd as u
  		inner join
  		cmb_mch_typ as m
  		using (mch_nm)
  	where
  		usr_id=5201314520
 		and mch_typ='休闲娱乐'
  		and year(trx_time)=2023
  	group by
  		substr(trx_time,1,7)
),
total_sum_trx_amt as (
	select
  		m1.trx_mon
  		,coalesce(month_trx_amt,0) as month_trx_amt
  		,sum(month_trx_amt)over(order by m1.trx_mon ) as trx_amt
  	from
  		date_summary as d1
  		left join
  		monthly_sum_trx_amt as m1
  		on d1.trx_mon=m1.trx_mon
)

select
	trx_mon
    ,trx_amt
from
	total_sum_trx_amt
order by
	trx_mon
如电话沟通,只需改动两个字母即可通过运行。
同学你表链接没学好,罚你重写5道left join的题。
2025-02-11 数学成绩分段统计(1) 
with math_rank as (
select
	student_id
    ,case when
    	score>=110 then '[110, 120]'
        when 
        	score between 90 and 109 then '[90, 110)'
        when
        	score between 60 and 89 then '[60, 90)'
  		else '[0, 60)'
        end as score_range
from
	scores
where	
	exam_date='2024-06-30'
    and subject='数学')

select
	score_range
    ,count(*) as num_students
from
	math_rank
group by
	score_range
这个地方我输出的结果也是一样的,但是跟他排序不同。我想知道在这种情况下需要怎么排序,才能排成答案这种。
加ORDER BY score_range DESC就好啦
2025-02-11 HAVING-语数英优异的学生 
我一开始以为要算所有科目的总分
先看需求,胸有成表;没有成表再看输出示例~
2025-02-10 条件过滤-符合条件的班主任 
select	
	name
    ,subject
    ,class_code
    ,qualification
from
	teachers
where
	fir_degr rlike '北京大学|清华大学'
	and head_teacher is not null
order by
	name asc
我用正则表达式进行匹配为什么不对
复制你的代码,试了可以通过呀。|、| 是不是把竖线输入错啦
2025-02-10 化学老师的教学成果 
我不太懂这个去重标准,比如姜建华,他考了四次,都没及格,然后我把他归为不及格,但如果有一个人四次考试及格一次,那我是算作及格还是不及格
这里的标准是:凡是有过不及格记录的,都算作不及格。
2025-02-10 S1年级物理成绩前10名(1) 
select
	st.student_id
    ,st.name
    ,sc.score
    ,row_number()over(order by sc.score desc) as rnk
from
	students as st
    inner join 
    scores	as sc
    on st.student_id=sc.student_id
where
	st.grade_code='S1'
    and sc.subject='物理'
ORDER BY 
    rnk,st.student_id
limit
	10
就我这个也能输出排名前十,但是跟答案不一样,我筛选的条件也是s1学物理的
纯按分数来,前十名有很多个。邱博、蔡帆、张海燕、黃兵都是97分。

这道题用来启发思考的,如果想把分数排前10名(不管是12个同学还是15个同学)所有同学都找出来,应该用什么排序排序窗口函数、是否要用limit
2025-02-10 大结局(😊)渣男9月爽翻天,罪证送他去西天 
上一题中代码里面用的count计数,然后在临时表里没有group by,后面调用的时候进行分组。但是我在这个题中我在第二个计算各个类型频次表里,用count代替sum计数,然后我对日期和这几个count分组,输出不了结果,我不太明白为什么。
with quarterly_transactions as (
    select
        *,
        concat(year(trx_time), '-', 'Q', quarter(trx_time)) as trx_quarter, /* 提取交易日期的年份和季度 */
        count(case when trx_amt = 288 then trx_amt else null end) over (order by concat(year(trx_time), '-', 'Q', quarter(trx_time))) as withhand, /* 计算金额为 288 的交易次数,并按季度累积 */
        count(case when trx_amt = 888 then trx_amt else null end) over (order by concat(year(trx_time), '-', 'Q', quarter(trx_time))) as doi /* 计算金额为 888 的交易次数,并按季度累积 */
    from
        cmb_usr_trx_rcd
    where
        usr_id = 5201314520 /* 限制查询结果只包含特定用户的交易记录 */ and mch_nm='红玫瑰按摩保健休闲'
and year(trx_time) in (2023,2024)
)
select
    trx_quarter,
    withhand,
    doi
from
    quarterly_transactions
group by
    trx_quarter,
    withhand,
    doi /* 按季度分组,并选择累积的交易次数 */
order by
    trx_quarter; /* 按季度排序 */
sum case when 和count case when绝大多数情况下都是可以等价互换的,你再试试是不是其他条件漏写啦
2025-02-06 十大恩客你排第一,给钱金主数你多 
在2022-2023两年期间,谁在红浪漫消费的金额最多?

输出客户id、总金额,取前5行。
题目要的跟代码给的不一样啊
改了改了😁
2025-02-03 时间日期(1)按月统计日花费,一天都不要浪费 
select
	substr(DATE_FORMAT(trx_time, '%Y-%m-%d'),1,7) as trx_mon
    ,max(last_day(trx_time)) as last_day
from
	cmb_usr_trx_rcd
where
	year(trx_time) in (2023,2024)
group by
	substr(DATE_FORMAT(trx_time, '%Y-%m-%d'),1,7)
order by
	trx_mon
我想问一个语法方面的问题,就是有了group分组之后,是不是输出的列也是要聚合好的。我一开始lastday那里没有加max,就是无法输出。
select a,b from t group by a会报错的;select a, max(b) from t group by a不会报错。你的语句不加max,就是犯了第一种错
2025-01-31 输出特定日期上市的银行 
select
	*
from
	stock_info
where 
	 (list_date between '2026-06-01' and '2006-09-01')
     and industry ='银行'
     and area='北京'
order by
	list_date
不懂为啥没有输出
你写成2026了哥哥
2024-12-26 小结(1)大数据早就能扫黄,找足证据不慌张 
明白了,between 2 and4 里面只有2、3
between 2 and 5 里面有 2、3、4
别闹。。 between and 是前闭后闭的 234;2345
2024-12-26 小结(1)大数据早就能扫黄,找足证据不慌张 
第一种跟第二种输出答案不一样
只有一个答案是正确的
2024-12-24 分组与聚合函数(5)五花八门的项目,其实都有固定套路(2) 
select
	trx_amt
	,count(trx_amt) as total_trx_cnt
    ,count(distinct usr_id) as unique_usr_cnt
    ,count(trx_amt)/count(distinct usr_id) as avg_trx_per_user
from cmb_usr_trx_rcd
where mch_nm='红玫瑰按摩保健休闲'
	and year(trx_time)=2023 and month(trx_time) in (1,2,3,4,5,6,7,8,9,10,11,12)
	or year(trx_time)=2024 and month(trx_time) in (1,2,3,4,5,6)
group by trx_amt
order by avg_trx_per_user desc
limit 5

我输出的结果中total_trx_cnt 数量不对,288.888,1288都多了两个
啥也没说
2024-12-23 分组与聚合函数(2)擦边营收怎么样,聚合函数可看出 
我想问问为什么selet中trx_time不能用date_format(trx_time,%y-%m-%d) as trx_date
当然可以,但你大小写没写对呀。

提交记录

提交日期 题目名称 提交代码
2025-05-25 夜深与专车比例的相关系数 
WITH TimeExtracted AS (
    SELECT 
        cust_uid,
        start_loc,
        end_loc,
        CAST(SUBSTR(start_tm, 4, 2) AS SIGNED) AS hour,
        car_cls
    FROM didi_sht_rcd
),
FilteredData AS (
    SELECT *
    FROM TimeExtracted
    WHERE hour BETWEEN 18 AND 23
),
HourlyCounts AS (
    SELECT
        hour,
        COUNT(*) AS total_orders,
        SUM(CASE WHEN car_cls = 'A' THEN 1 ELSE 0 END) AS A_orders
    FROM FilteredData
    GROUP BY hour
),
PercentageAOrders AS (
    SELECT
        hour,
        (A_orders * 100.0 / total_orders) AS percentage_A_orders
    FROM HourlyCounts
),
AggregatedData AS (
    SELECT
        COUNT(*) AS n,
        SUM(hour) AS sum_x,
        SUM(percentage_A_orders) AS sum_y,
        SUM(hour * percentage_A_orders) AS sum_xy,
        SUM(hour * hour) AS sum_x2,
        SUM(percentage_A_orders * percentage_A_orders) AS sum_y2
    FROM PercentageAOrders
)
SELECT
    cast((n * sum_xy - sum_x * sum_y) / 
    (SQRT(n * sum_x2 - sum_x * sum_x) * SQRT(n * sum_y2 - sum_y * sum_y)) as decimal(10,2)) AS R
FROM AggregatedData;
2025-05-23 晚时段专车比例 
WITH TimeExtracted AS (
    SELECT 
        cust_uid,
        start_loc,
        end_loc,
        SUBSTR(start_tm, 1, 2) AS hour,
        car_cls
    FROM didi_sht_rcd
),
FilteredData AS (
    SELECT *
    FROM TimeExtracted
    WHERE hour BETWEEN '18' AND '23'
),
HourlyCounts AS (
    SELECT
        hour,
        COUNT(*) AS total_orders,
        SUM(CASE WHEN car_cls = 'A' THEN 1 ELSE 0 END) AS A_orders
    FROM FilteredData
    GROUP BY hour
)
SELECT
    hour,
    A_orders,
    total_orders,
    cast((A_orders * 100.0 / total_orders) as decimal(10,2)) AS percentage_A_orders
FROM HourlyCounts
ORDER BY hour;
2025-05-23 找出所有类别组合的最热门路线 
with all_cnt as(
select
d1.start_loc
,d1.end_loc
,l1.loc_ctg
,l2.loc_ctg as ctg2
,count(*) as cnt
from
didi_sht_rcd as d1
join
loc_nm_ctg as l1
on
d1.start_loc=l1.loc_nm
join
loc_nm_ctg as l2
on
d1.end_loc=l2.loc_nm
group by
3,4,1,2
),
rank_typ as(
select
*
,dense_rank()over(partition by loc_ctg,ctg2 order by cnt desc) as rnk
from
all_cnt
)
select
start_loc
,end_loc
,loc_ctg
,ctg2
,cnt
from
rank_typ
where
rnk=1
order by
cnt desc,
loc_ctg, ctg2;
2025-05-23 找出所有类别组合的最热门路线 
with all_cnt as(
select
d1.start_loc
,d1.end_loc
,l1.loc_ctg
,l2.loc_ctg as ctg2
,count(*) as cnt
from
didi_sht_rcd as d1
join
loc_nm_ctg as l1
on
d1.start_loc=l1.loc_nm
join
loc_nm_ctg as l2
on
d1.end_loc=l2.loc_nm
group by
3,4,1,2
),
rank_typ as(
select
*
,row_number()over(partition by loc_ctg,ctg2 order by cnt desc) as rnk
from
all_cnt
)
select
start_loc
,end_loc
,loc_ctg
,ctg2
,cnt
from
rank_typ
where
rnk=1
ORDER BY 
cnt DESC,
loc_ctg, ctg2;
2025-05-23 找出所有类别组合的最热门路线 
with all_cnt as(
	select
		d1.start_loc
		,d1.end_loc
		,l1.loc_ctg
		,l2.loc_ctg as ctg2
		,count(*) as cnt
	from
		didi_sht_rcd as d1
		join
		loc_nm_ctg as l1
		on
		d1.start_loc=l1.loc_nm
		join
		loc_nm_ctg as l2
		on
		d1.end_loc=l2.loc_nm
		group by
			3,4,1,2
		order by
			5 desc
),
rank_typ as(
	select
		*
		,row_number()over(partition by loc_ctg,ctg2 order by cnt desc) as rnk
	from
		all_cnt
)
select
	start_loc
,end_loc
,loc_ctg
,ctg2
,cnt
from
	rank_typ
where
	rnk=1
2025-05-23 找出所有以酒店为起点或终点的类别组合的最热门路线 
with all_typ as
(select
	d1.start_loc
,d1.end_loc
,l1.loc_ctg
,l2.loc_ctg as ctg2
,count(*) as cnt
from
	didi_sht_rcd as d1
join
loc_nm_ctg as l1
on
d1.start_loc=l1.loc_nm
join
loc_nm_ctg as l2
on
d1.end_loc=l2.loc_nm
where
	l1.loc_ctg='酒店'
or
l2.loc_ctg='酒店'
group by
	3,4,1,2),
rank_typ as
(select
	*
,row_number()over(partition by loc_ctg,ctg2 order by cnt desc) as rnk
from
	all_typ
order by
	cnt desc)
select
	start_loc,
end_loc,
loc_ctg,
ctg2 as end_ctg
,cnt
from
	rank_typ
where
	rnk=1
2025-05-23 找出所有以酒店为起点的类别组合的最热门路线 
with typ_cnt as(
	select	
		d1.start_loc
		,d1.end_loc
		,l2.loc_ctg
		,count(*) as cnt
	from
		didi_sht_rcd as d1
		join
		loc_nm_ctg as l1
		on
		d1.start_loc=l1.loc_nm
		join
		loc_nm_ctg as l2
		on
		d1.end_loc=l2.loc_nm
	where	
		l1.loc_ctg='酒店'
	group by
		1,3,2
	order by
		4 desc
),
rnk_canguan as(
	select
		*
		,row_number()over(partition by loc_ctg order by cnt desc) as rnk
	from
		typ_cnt
)
select
	start_loc
,end_loc
,loc_ctg
,cnt
from
	rnk_canguan
where
	rnk=1
2025-05-23 找出所有以酒店为起点的类别组合的最热门路线 
with typ_cnt as(
	select	
		d1.start_loc
		,d1.end_loc
		,l2.loc_ctg
		,count(*) as cnt
	from
		didi_sht_rcd as d1
		join
		loc_nm_ctg as l1
		on
		d1.start_loc=l1.loc_nm
		join
		loc_nm_ctg as l2
		on
		d1.end_loc=l2.loc_nm
	where	
		l1.loc_ctg='酒店'
	group by
		1,3,2
	order by
		4 desc
),
rnk_canguan as(
	select
		*
		,row_number()over(partition by start_loc,loc_ctg order by cnt desc) as rnk
	from
		typ_cnt
)
select
	start_loc
,end_loc
,loc_ctg
,cnt
from
	rnk_canguan
where
	rnk=1
2025-05-23 找出所有以酒店为起点的类别组合的最热门路线 
WITH hotel_routes AS (
    SELECT r.start_loc, r.end_loc, l_end.loc_ctg, COUNT(*) AS trip_count
    FROM didi_sht_rcd r
    JOIN loc_nm_ctg l_start ON r.start_loc = l_start.loc_nm
    JOIN loc_nm_ctg l_end ON r.end_loc = l_end.loc_nm
    WHERE l_start.loc_ctg = '酒店'
    GROUP BY r.start_loc, r.end_loc, l_end.loc_ctg
),
ranked_routes AS (
    SELECT start_loc, end_loc, loc_ctg, trip_count,
           ROW_NUMBER() OVER (PARTITION BY loc_ctg ORDER BY trip_count DESC) AS route_rank
    FROM hotel_routes
)
SELECT start_loc, end_loc, loc_ctg, trip_count
FROM ranked_routes
WHERE route_rank = 1
ORDER BY trip_count DESC;
2025-05-23 找出所有以酒店为起点的类别组合的最热门路线 
WITH hotel_to_cafe AS (
    SELECT r.start_loc, r.end_loc, COUNT(*) AS trip_count
    FROM didi_sht_rcd r
    JOIN loc_nm_ctg l_start ON r.start_loc = l_start.loc_nm
    WHERE l_start.loc_ctg = '酒店' 
    GROUP BY r.start_loc, r.end_loc
)
SELECT start_loc, end_loc, trip_count
FROM hotel_to_cafe
ORDER BY trip_count DESC
LIMIT 1;
2025-05-23 找出酒店-餐饮的最热门路线 
SELECT 
	r.start_loc
,r.end_loc
,count(1) as cnt
FROM didi_sht_rcd r
JOIN loc_nm_ctg l_start ON r.start_loc = l_start.loc_nm
JOIN loc_nm_ctg l_end ON r.end_loc = l_end.loc_nm
WHERE l_start.loc_ctg = '酒店'
  AND l_end.loc_ctg = '餐饮'
group by
	1,2
order by
	3 desc
limit
	1
2025-05-23 查询所有以住宅区为起点且以写字楼为终点的行程 
select
	cust_uid
,d1.start_loc
,d1.end_loc
,start_tm
,car_cls
from
	didi_sht_rcd as d1
join
loc_nm_ctg as t1
on
d1.start_loc=t1.loc_nm
join
loc_nm_ctg as t2
on
d1.end_loc=t2.loc_nm
where
	t1.loc_ctg='住宅'
and
t2.loc_ctg='写字楼'
2025-05-23 查询所有起点和终点都属于餐饮类别的行程 
SELECT 
t1.*
FROM 
didi_sht_rcd t1
WHERE 
EXISTS (
SELECT 1 
FROM loc_nm_ctg t2 
WHERE t2.loc_nm = t1.start_loc AND t2.loc_ctg = '餐饮'
)
AND 
EXISTS (
SELECT 1 
FROM loc_nm_ctg t3 
WHERE t3.loc_nm = t1.end_loc AND t3.loc_ctg = '餐饮'
);
2025-05-19 滴滴面试真题(4)未完成订单在第二天继续呼叫的比例 
WITH ConvertedTime AS (
    SELECT 
        order_id,
        cust_uid,
        DATE_ADD(call_time, INTERVAL -3 HOUR) AS local_call_time,
        grab_time,
        cancel_time,
        finish_time
    FROM 
        didi_order_rcd
),
CallDates AS (
    SELECT 
        order_id,
        cust_uid,
        local_call_time,
        DATE(local_call_time) AS call_date
    FROM 
        ConvertedTime
),
NextDayCalls AS (
    SELECT 
        c1.order_id AS order_id_1,
        c1.call_date AS call_date_1,
        c2.order_id AS order_id_2,
        c2.call_date AS call_date_2
    FROM 
        CallDates c1
    JOIN 
        CallDates c2
    ON 
        c2.call_date = DATE_ADD(c1.call_date, INTERVAL 1 DAY)
    AND 
        c1.cust_uid = c2.cust_uid
  AND 
        c1.order_id = c2.order_id
),
NextDayCallCount AS (
    SELECT 
        COUNT(DISTINCT order_id_1) AS next_day_call_count
    FROM 
        NextDayCalls
),
TotalOrderCount AS (
    SELECT 
        COUNT(order_id) AS total_order_count
    FROM 
        didi_order_rcd
  where finish_time = '1970-01-01 00:00:00'
)
SELECT 
    ncc.next_day_call_count,
    toc.total_order_count,
    CONCAT(FORMAT((ncc.next_day_call_count * 1.0 / toc.total_order_count) * 100, 2), '%') AS next_day_call_ratio
FROM 
    NextDayCallCount ncc,
    TotalOrderCount toc;
2025-05-19 滴滴面试真题(3)UTC转化后的本地时间呼叫高峰期 
select hour(local_call_time) as local_hour, count(1) as cnt
from 
(
    SELECT 
        order_id,
        cust_uid,
        DATE_ADD(call_time, INTERVAL -3 HOUR) AS local_call_time,
        grab_time,
        cancel_time,
        finish_time
    FROM 
        didi_order_rcd
)t
group by hour(local_call_time) order by 2 desc
2025-05-19 滴滴面试真题(2)打车订单呼叫应答时间 
select
	avg(timestampdiff(second,call_time,grab_time)) as avg_response_time_seconds
from
	didi_order_rcd
where
	grab_time!='1970-01-01 00:00:00'
2025-05-19 滴滴面试真题(1)-打车订单应答率 
select
	count(distinct order_id) as total_call
,count(distinct case when grab_time!='1970-01-01 00:00:00' THEN order_id end ) as grab_cnt 
,concat(round(count(distinct case when grab_time!='1970-01-01 00:00:00' THEN order_id end )/count(distinct order_id)*100,2),'%') as answer_rate
from
	didi_order_rcd
where
	date(call_time)='2021-05-03'
2025-05-18 查找没有观看记录的用户 
select
	distinct bt1.usr_id
from
	bilibili_t100 bt1
left join
bilibili_t20 bt2
on 
bt1.usr_id=bt2.usr_id
where
	v_id is null
2025-05-18 查找没有观看记录的用户 
select
	bt1.usr_id
from
	bilibili_t100 bt1
left join
bilibili_t20 bt2
on 
bt1.usr_id=bt2.usr_id
where
	v_id is null
2025-05-18 会员与非会员的日均观看视频数量 
SELECT 
    m_flg,
    round(AVG(daily_views),2) AS avg_daily_views
FROM (
    SELECT 
        t100.usr_id,
        m_flg,
        DATE(v_tm) AS view_date,
        COUNT(v_id) AS daily_views
    FROM 
        bilibili_t100 t100
    JOIN 
        bilibili_t20 t20 ON t100.usr_id = t20.usr_id
    WHERE 
        DATE(v_tm) >= '2021-02-01' AND DATE(v_tm) <= '2021-02-28'
    GROUP BY 
        t100.usr_id, m_flg, view_date
) AS subquery
GROUP BY 
    m_flg;