排名

用户解题统计

过去一年提交了

勋章 ①金银铜:在竞赛中获得第一二三名;②好习惯:自然月10天提交;③里程碑:解决1/2/5/10/20/50/100/200题;④每周打卡挑战:完成每周5题,每年1月1日清零。

收藏

收藏日期 题目名称 解决状态
2026-03-27 计算每个用户的受欢迎程度  已解决
2025-04-21 客群分布直方图  已解决
2025-04-21 统计每一类RFM用户的数量  已解决
2025-04-21 4分及以下客户去的最后一家餐厅  已解决
2025-04-21 9分客户最爱去的Top3餐厅  已解决
2025-04-19 直观对比两种频率计算的差异(F)  已解决
2025-04-15 得物面试真题(4)首单Mac二单iPhone的客户  已解决
2025-04-15 得物面试真题(1)每周iPhone用户和非iPhone用户  已解决
2025-04-15 各商品漏斗转化率  已解决
2025-04-15 从商品角度统计收藏到购买的转化率  已解决
2025-04-14 好友步数排名-考虑反向好友关系  已解决
2025-04-14 好友步数排名-不考虑反向好友关系  已解决
2025-04-13 会员与非会员的日均观看视频数量  已解决
2025-04-13 每个视频类型的T+3留存率  已解决
2025-04-13 钢铁直男的观看记录  已解决
2025-04-13 只观看放映厅的用户  已解决
2025-04-13 每天新增用户的会员转化比例  已解决
2025-04-13 每天新增会员  已解决
2025-04-13 哔哩哔哩面试真题(6)全年会员收入分类  已解决
2025-04-13 哔哩哔哩面试真题(4)每周分摊会员收入  已解决
2025-04-13 哔哩哔哩面试真题(2)计算春节周会员收入  已解决
2025-04-13 频道重合指数  已解决
2025-04-13 多类别观看指数计算  已解决
2025-04-12 绘制小时进入人数曲线  已解决
2025-04-12 快手面试真题(3)同时在线人数峰值时点  已解决
2025-04-11 快手面试真题(2)同时在线人数峰值  已解决
2025-04-10 播放量最高的标签  未解决
2025-04-10 计算完播率(按人数)  已解决
2025-04-09 夜深与专车比例的相关系数  已解决
2025-04-09 晚时段专车比例  已解决
2025-04-09 滴滴面试真题(4)未完成订单在第二天继续呼叫的比例  已解决
2025-04-09 滴滴面试真题(2)打车订单呼叫应答时间  已解决
2025-04-09 滴滴面试真题(1)-打车订单应答率  已解决
2025-04-09 连续登录3天及以上  已解决
2025-04-09 上月活跃用户数  已解决
2025-04-09 7月之后再也没活跃过的用户  已解决
2025-04-09 10月1日后再也没活跃过的用户  已解决
2025-04-08 抖音面试真题(4)T+1月留存  已解决
2025-04-08 抖音面试真题(5)新用户的T+1月留存  已解决
2025-04-08 抖音面试真题(3)一次性计算T+3、T+7、T+14留存率(2)  已解决
2025-04-08 抖音面试真题(2)一次性计算T+1、T+3、T+7、T+14留存率  已解决
2025-04-07 抖音面试真题(1)T+1日留存率  已解决
2025-04-07 统计每个城市各状态的单量  未解决
2025-04-07 计算每个城市的有效订单完成率  已解决
2025-03-10 基于共同兴趣爱好的餐厅推荐(6)-好基友(5)  已解决
2025-03-10 多云天气天数  已解决
2025-01-26 小丑竟是我自己  未解决
2025-01-25 海王:1小时发3个红包  已解决
2025-01-20 德州扑克起手牌-同花两高张  未解决
2025-01-20 通勤、午休、临睡个时间段活跃人数分布  已解决

评论笔记

评论日期 题目名称 评论内容 站长评论
2025-02-03 餐厅的客单价增长趋势 
//自己记录用
**考察coalesce用法,计算同环比场景,被做对比的数据为0时,取0
**考察nullif用法,计算比例,同环比,上期或同期为0, 做分母,避免被除数为0,nullif判断被除数是否为0,为0则取null
**考察round用法,计算同环比,保留小数,比值先乘以100,再四舍五入保留小数部位
**考察开窗函数
啥也没说
2025-01-26 小丑竟是我自己 
比例是 (1-A/B)*100 吗
是的~,你再试下应该能通过了
2025-01-26 小丑竟是我自己 
知识点:
rcv_datetime != '1900-01-01 00:00:00'  1900-01-01代表没有列值的记录,此处用作判断没有被接收的红包的记录
啥也没说
2025-01-21 德州扑克起手牌-同花两高张 
select count(1) as cnt,
			(select count(1) from hand_permutations) as ttl_cnt,
            round(count(1)/(select count(1) from hand_permutations),4) as p
from 
(select distinct card1,card2
from hand_permutations 
where right(card1,1)=right(card2,1)
and (left(card1,length(card1)-1) between 'A' and 'K') 
and (left(card2,length(card2)-1) between 'A' and 'K')) t1
啥也没说
2025-01-21 文科潜力股 
SELECT *
 FROM scores 
 where ((subject ='历史' and score>90)
 or (subject ='地理' and score>90)
 or (subject ='政治' and score>90))
 and exam_date='2024-06-30'
 order by score desc,student_id,subject
啥也没说
2025-01-21 找出所有以酒店为起点或终点的类别组合的最热门路线 
注意关联条件的列名,以及题解是求最热门的路线组合,需要再次分区(分组)排名
啥也没说
2025-01-21 至少两门科目大于等于110分的学生 
select *
 from subject_score
 where (chinese>=110 and math>=110)
 or (chinese>=110 and english>=110)
 or (math>=110 and english>=110)
 order by student_id

select a1.*
from subject_score a1
inner join 
(select student_id,
			(c1+m1+e1) as tll
from
 (select student_id,
 			case when chinese>=110 then 1 else 0 end as c1,
            case when math>=110 then 1 else 0 end as m1,
            case when english>=110 then 1 else 0 end as e1
 from subject_score) t 
) a2 
on a1.student_id=a2.student_id
where a2.tll>=2


select *
from subject_score
where if(chinese>=110,1,0)+if(math>=110,1,0)+if(english>=110,1,0) >=2
啥也没说
2025-01-21 渣男腰子可真行,端午中秋干不停 
多个where筛选条件,注意and和or的运算优先级,括号的使用
啥也没说
2025-01-21 分组与聚合函数(5)五花八门的项目,其实都有固定套路(2) 
计算客单数和消费人数,人均客单数的区别
啥也没说
2025-01-21 条件过滤(3)Hour函数很给力,组合条件要仔细 
注意输出排序
现在不用注意了😄
2025-01-21 通勤、午休、临睡个时间段活跃人数分布 
使用变量?
跟变量木有关系,本网站所有题都不需要使用变量。
2025-01-21 通勤、午休、临睡个时间段活跃人数分布 
select 
(select count(distinct usr_id)
from user_login_log
where 
 ((substr(login_time,12,8) between '07:30:00' and '09:30:00')
or (substr(login_time,12,8) between '18:30:00' and '20:30:00'))
and (substr(login_time,1,10) between date_add(current_date()-day(current_date())+1,interval -1 month) and date_add(last_day(current_date()),interval -1 month))) as commute,
(select count(distinct usr_id)
from user_login_log
where (substr(login_time,12,8) between '11:30:00' and '14:00:00')
and (substr(login_time,1,10) between date_add(current_date()-day(current_date())+1,interval -1 month) and date_add(last_day(current_date()),interval -1 month))) as lunch_break,
(select count(distinct usr_id)
from user_login_log
where (substr(login_time,12,8) between '22:30:00' and '01:00:00')
and (substr(login_time,1,10) between date_add(current_date()-day(current_date())+1,interval -1 month) and date_add(last_day(current_date()),interval -1 month))) as bedtime
啥也没说
2025-01-21 性别已知的听歌用户 
在听歌记录表取值,关联用户信息表
啥也没说
2025-01-20 各地区包含“中国”或“中”字开头的公司数量及比例 
和上题例不同,公司名称条件不用去除以“中国”结尾的,且包含“中”字的公司不需要限定以中字开头。
注意order by之后不接聚合函数,order by和聚合函数有不同的语义和执行顺序。

输出结果proportiton的小数位为4位,结果未通过。
啥也没说
2025-01-20 每个行业最早上市的公司 
只考虑那些名称以“中国”开头或包含“中”字的公司:
1. 以“中国”开头的公司,需要同时剔除以“中国”结尾的公司
2. 包含“中”字的公司,题解为以“中”字开头的公司,不是名称出现“中”字的公司
这里有一些误解
啥也没说
2025-01-19 条件过滤(1)异地男友有异常,数分闺蜜来帮忙 
使用date_format函数可以吗
可以的。实际业务中,如果用于过滤,怎么方便怎么来;如果是用于输出,要严格遵守输出要求,因为后续就要导入到pythonexcel啥的,会有格式要求,那就推荐date_format。😁

提交记录

提交日期 题目名称 提交代码
2026-03-27 快手面试真题(3)同时在线人数峰值时点 
with flag as (
select usr_id,live_id,1 as flag,enter_time as time1
from ks_live_t1
union all
select usr_id,live_id,-1 as flag,leave_time as time1
from ks_live_t1
),
base as(
select live_id,time1,sum(flag) over(partition by live_id order by time1) as cum
from flag
order by live_id,time1),
peak as(
select live_id,max(cum) as peak_cum
from base 
group by 1)
select b.live_id,k.live_nm,p.peak_cum as max_online_users,min(b.time1) as first_peak_time,
max(b.time1) as last_peak_time
from base b
inner join peak p on b.live_id=p.live_id and b.cum=p.peak_cum
inner join ks_live_t2 k on b.live_id=k.live_id
group by 1,2,3
order by 3 desc
2026-03-27 计算每个用户的受欢迎程度 
WITH friendship AS (
    SELECT
        r1.snd_usr_id AS usr_id,
        r1.rcv_usr_id AS friend_id
    FROM
        tx_red_pkt_rcd r1
    UNION
    SELECT
        r2.rcv_usr_id AS usr_id,
        r2.snd_usr_id AS friend_id
    FROM
        tx_red_pkt_rcd r2
),
user_friend_count AS (
    SELECT
        u.usr_id,
        COUNT(DISTINCT f.friend_id) AS friend_count
    FROM
        tx_usr_bas_info u
    LEFT JOIN
        friendship f ON u.usr_id = f.usr_id
    GROUP BY
        u.usr_id
),
total_users AS (
    SELECT
        COUNT(DISTINCT usr_id) AS total_user_count
    FROM
        tx_usr_bas_info
)
SELECT
    ufc.usr_id,
    ufc.friend_count,
    tu.total_user_count,
    round(ufc.friend_count / tu.total_user_count,4) AS popularity
FROM
    user_friend_count ufc,
    total_users tu
ORDER BY
    popularity DESC;
2026-03-27 计算每个用户的受欢迎程度 
with base as(
select snd_usr_id as u1, rcv_usr_id as u2
from tx_red_pkt_rcd 
union 
select rcv_usr_id as u1, snd_usr_id as u2
from tx_red_pkt_rcd 
)
select u1 as usr_id, count(u2) as friend_count,
(select count(distinct usr_id) from tx_usr_bas_info) as total_user_count, 
round(count(u2)/(select count(distinct usr_id) from tx_usr_bas_info),4)as popularity
from base
group by u1 
order by 4 desc
2026-03-27 计算每个用户的受欢迎程度 
with base as(
select snd_usr_id as u1, rcv_usr_id as u2
from tx_red_pkt_rcd 
union 
select rcv_usr_id as u1, snd_usr_id as u2
from tx_red_pkt_rcd 
)
select u1 as usr_id, count(u2) as friend_count,
(select count(distinct usr_id) from tx_usr_bas_info) as total_user_count, 
count(u2)/(select count(distinct usr_id) from tx_usr_bas_info)as popularity
from base
group by u1 
order by 4 desc
2026-03-27 计算每个用户的受欢迎程度 
with base as(
select snd_usr_id as u1, rcv_usr_id as u2
from tx_red_pkt_rcd where date(snd_datetime) = '2021-02-13' and date(rcv_datetime)!='1900-01-01'
union 
select rcv_usr_id as u1, snd_usr_id as u2
from tx_red_pkt_rcd where date(snd_datetime) = '2021-02-13' and date(rcv_datetime)!='1900-01-01'
)
select u1 as usr_id, count(u2) as friend_count,
(select count(distinct usr_id) from tx_usr_bas_info) as total_user_count, 
count(u2)/(select count(distinct usr_id) from tx_usr_bas_info)as popularity
from base
group by u1 
order by 4 desc
2026-03-27 快手面试真题(2)同时在线人数峰值 
with dt as (
select live_id, enter_time dt ,1 as flag from ks_live_t1
union all 
select live_id,leave_time dt,-1 as flag from ks_live_t1
order by 1),
 base as (
select dt.live_id,dt2.live_nm,sum(flag) over(partition by live_id order by dt) as cum 
from dt 
join ks_live_t2 dt2 on dt.live_id = dt2.live_id)
select live_id,live_nm,max(cum)
from base
group by 1,2
order by 3 desc
2026-03-26 深圳气温异常年份 
with avg_h as (
SELECT year(dt) as yr, avg(tmp_h) as avg_h
from weather_rcd_china
where city='shenzhen'
and dt between '2010-01-01' and '2022-12-31'
group by 1
order by 1),
flag as (
select yr,avg_h,lag(avg_h) over(order by yr) as l_avg_h,
 coalesce(abs(avg_h-lag(avg_h) over(order by yr)),0) as var,
 case when coalesce(abs(avg_h-lag(avg_h) over(order by yr)),0)>=1 then 1 else 0 end as flag
from avg_h
order by yr) 
select yr as year, round(avg_h,2) as avg_tmp_h, case when flag =1 then 'Yes' ELSE 'No' end as significant_change
from flag 
order by 1
2026-03-19 用户的去重IP地址数量 
select user_id,count(distinct ip_address) as cnt from user_submit_record group by user_id having count(distinct ip_address)>=2 order by 2 desc,1 asc
2026-03-19 用户的去重IP地址数量 
select user_id,count(ip_address) as cnt from user_submit_record group by user_id having count(ip_address)>=2 order by 2 desc,1 asc
2026-03-19 用户的去重IP地址数量 
select user_id,count(ip_address) as cnt from user_submit_record group by user_id having count(ip_address)>=2 order by user_id
2025-04-26 Halo出行-通勤活跃用户标签开发 
with start_end_loc_map as (
select 
id
,user_id
,start_time
,end_time
,start_loc
,end_loc
from hello_bike_riding_rcd
where (start_loc in (select loc_nm from gd_loc_map where loc_type = '写字楼') and end_loc in (select loc_nm from gd_loc_map where loc_type = '地铁站')) 
or (start_loc in (select loc_nm from gd_loc_map where loc_type = '地铁站') and end_loc in (select loc_nm from gd_loc_map where loc_type = '写字楼')) 
),
start_end_time as (
select
id
,user_id
,date_format(start_time,'%Y%m') as start_time
,date_format(end_time,'%Y%m') as end_time
,start_loc
,end_loc
from start_end_loc_map
),
riding_times_count as (
select
user_id
,start_time
,count(start_time) as riding_times_count
from start_end_time
group by
user_id
,start_time
having count(start_time) >= 5
order by
user_id
,start_time
),
continous_record as (
select
user_id
,start_time
,lag(start_time,1) over (partition by user_id order by start_time ) as previous_one_record
,lag(start_time,2) over (partition by user_id order by start_time ) as previous_two_record
from riding_times_count
),
active_customer as (
select
 distinct user_id
 ,period_diff (start_time, previous_one_record) as monthdifference1
 ,period_diff (start_time, previous_two_record) as monthdifference2
from continous_record
where period_diff (start_time, previous_one_record) = 1 and period_diff (start_time, previous_two_record) = 2
)
select
distinct user_id
,case when hello_bike_riding_rcd.user_id in (select user_id from active_customer) then 1
else 0
 end as active_tag
from hello_bike_riding_rcd
order by user_id;
2025-04-26 Halo出行-通勤活跃用户标签开发 
select user_id,
			case when MaxConsecutiveMonth <3 then 0 else 1 end as active_tag
from (
select user_id, 
		 max(ConsecutiveMonth) as MaxConsecutiveMonth
from (
select user_id,
		 grp,
 count(month1) as ConsecutiveMonth
from (
select user_id,
		 month as month1,
 row_number() over(partition by user_id order by month) as rn,
 month - interval row_number() over(partition by user_id order by month) month as grp
from (
select user_id,
		 date_format(start_time,'%Y-%m-01') as month,
 count(start_time) as monthcommutecnt
from hello_bike_riding_rcd 
where year(start_time) between 2020 and 2024
and (
(start_loc in (select loc_nm from gd_loc_map where loc_type='地铁站') and 
end_loc in (select loc_nm from gd_loc_map where loc_type='写字楼')
 ) 
		or (start_loc in (select loc_nm from gd_loc_map where loc_type='写字楼') and 
end_loc in (select loc_nm from gd_loc_map where loc_type='地铁站')
 )
)
group by 1,2
having count(start_time)>=5 
order by 1,2) t1 
) t2
group by user_id,grp 
order by user_id,grp ) t3
group by user_id 
order by user_id 
) t4
2025-04-26 小宇宙电台的同期群分析 
with UserFirstLog as (
select usr_id,
			min(date(login_time)) as first_login_date
from user_login_log
where year(login_time)=2024
group by usr_id
),
		DailyAllUserCnt as(
select date(login_time) as date,
			count(distinct usr_id) as dailyttlcnt
from user_login_log
where year(login_time)=2024
group by 1
order by 1),
		DailyUserLog as(
select distinct usr_id,date(login_time) as date,1 as tag 
from user_login_log
where date(login_time) between '2023-12-28' and '2024-12-31'
order by 1,2),
DailyNewUserCnt as(
select first_login_date,count(usr_id) as dailyNewUserCnt
from UserFirstLog
group by 1
order by 1),
		RetentUserCnt as(
select date1,count(distinct usr_id1) as retentusercnt
from(
select u1.date as date1,
			u1.usr_id as usr_id1,
u2.date as date2,
u2.usr_id as usr_id2
from DailyUserLog u1
left join DailyUserLog u2 
	on u1.usr_id=u2.usr_id and u2.date between u1.date - interval 3 day and u1.date - interval 1 day
where year(u1.date)=2024
order by u1.date,u1.usr_id) t1 
where usr_id2 is not null
group by date1),
		BackUserCnt as(
select date1, 
		count(distinct usr_id1) as BackUserCnt
from (
select u1.date as date1,
			u1.usr_id as usr_id1,
u2.date as date2,
u2.usr_id as usr_id2
from DailyUserLog u1
left join DailyUserLog u2 
	on u1.usr_id=u2.usr_id and u2.date < u1.date - interval 4 day
where year(u1.date)=2024
and u2.usr_id is not null
order by u1.date,u1.usr_id) t2 
group by date1) 
select date as login_date,
			concat(NewUserRate,', ',RetenUserRate,', ',BackUserRate) as pct
from (
select dauc.date,
dauc.dailyttlcnt as ttlUserCnt,
			coalesce(dnuc.dailyNewUserCnt,0) as NewUserCnt,
round(coalesce(dnuc.dailyNewUserCnt,0)/dauc.dailyttlcnt*100,2) as NewUserRate,
coalesce(ruc.retentusercnt,0) as retenUserCnt,
round(coalesce(ruc.retentusercnt,0)/dauc.dailyttlcnt*100,2) as RetenUserRate,
coalesce(buc.BackUserCnt,0) as backUserCnt,
round(coalesce(buc.BackUserCnt,0)/dauc.dailyttlcnt*100,2) as BackUserRate
from DailyAllUserCnt dauc
left join DailyNewUserCnt dnuc on dauc.date=dnuc.first_login_date
left join RetentUserCnt ruc on dauc.date=ruc.date1
left join BackUserCnt buc on dauc.date=buc.date1) t3
2025-04-25 招建银行信用卡中心客户挽留-电商平台分类 
with MchOrderList as(
select distinct(mch_nm) as merchant_name, substring_index(mch_nm,'-',-1) as MchAbbr
from ccb_trx_rcd)
select merchant_name,
			case when MchAbbr like '%拼多多%' or MchAbbr like '%寻梦%' then '拼多多'
		when MchAbbr like '%京东%' then '京东' 
when MchAbbr like '%淘宝%'or MchAbbr like '%天猫%' or MchAbbr like '%阿里巴巴%' then '淘宝'
when MchAbbr like '%抖音%' or MchAbbr like '%字节跳动%' then '抖音'
when MchAbbr like '%小红书%' or MchAbbr like '%行吟%' then '小红书'
else '其他'
end as platform
from MchOrderList
order by 2 desc
2025-04-25 招建银行信用卡中心客户挽留-电商平台分类 
with DistinctMchNm as (
select distinct(substring_index(mch_nm,'-',-1)) as mch_nm
from ccb_trx_rcd)
select mch_nm as merchant_name,
			case when (mch_nm like '%寻梦%' or mch_nm like '%拼多多%') then '拼多多'
		when mch_nm like '%京东%' then '京东' 
when (mch_nm like '%淘宝%'ormch_nm like '%天猫%' or mch_nm like '%阿里巴巴%') then '淘宝'
when (mch_nm like '%抖音%' or mch_nm like '%字节跳动%') then '抖音'
when (mch_nm like '%小红书%' or mch_nm like '%行吟%') then '小红书'
else '其他'
end as platform
from DistinctMchNm
order by 2 desc
2025-04-21 高价值客户及其最常访问的商户类型 
WITH RecentOrder AS (
    SELECT cust_uid, DATEDIFF(CURRENT_DATE(), MAX(trx_dt)) AS recency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
ActiveDaysFrequency AS (
    SELECT cust_uid, COUNT(DISTINCT DATE(trx_dt)) AS frequency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
AverageSpending AS (
    SELECT cust_uid, AVG(trx_amt) AS avg_monetary
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
CalculateRecencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY recency_days DESC) AS recency_score 
    FROM RecentOrder
),
CalculateFrequencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY frequency_days) AS frequency_score
    FROM ActiveDaysFrequency
),
CalculateMonetaryScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY avg_monetary) AS monetary_score
    FROM AverageSpending
),
UserRfmScores AS (
    SELECT 
        crs.user_id,
        crs.recency_score + cfs.frequency_score + cms.monetary_score AS total_score
    FROM CalculateRecencyScores crs
    JOIN CalculateFrequencyScores cfs ON crs.user_id = cfs.user_id
    JOIN CalculateMonetaryScores cms ON crs.user_id = cms.user_id
),
HighValueUsers AS (
    SELECT user_id
    FROM UserRfmScores
    WHERE total_score >= 7
),
FavoriteMerchantTypes AS (
    SELECT 
        mtrf.mch_typ2,
        COUNT(*) AS visit_count
    FROM mt_trx_rcd_f mtrf
    WHERE mtrf.cust_uid IN (SELECT user_id FROM HighValueUsers)
    GROUP BY mtrf.mch_typ2
)
SELECT 
    mch_typ2,
    visit_count
FROM FavoriteMerchantTypes
ORDER BY visit_count DESC;
2025-04-21 按商户类型统计流失风险客户 
WITH RecentOrder AS (
    SELECT cust_uid, DATEDIFF(CURRENT_DATE(), MAX(trx_dt)) AS recency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
ActiveDaysFrequency AS (
    SELECT cust_uid, COUNT(DISTINCT DATE(trx_dt)) AS frequency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
CalculateRecencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY recency_days DESC) AS recency_score
    FROM RecentOrder
),
CalculateFrequencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY frequency_days) AS frequency_score
    FROM ActiveDaysFrequency
),
UserRfmScores AS (
    SELECT 
        crs.user_id,
        crs.recency_score,
        cfs.frequency_score
    FROM CalculateRecencyScores crs
    JOIN CalculateFrequencyScores cfs ON crs.user_id = cfs.user_id
),
HighRiskUsers AS (
    SELECT user_id
    FROM UserRfmScores
    WHERE recency_score = 1 AND frequency_score <= 2
),
MerchantTypeDistribution AS (
    SELECT 
        mtrf.mch_typ2,
        COUNT(*) AS cnt
    FROM mt_trx_rcd_f mtrf
    WHERE mtrf.cust_uid IN (SELECT user_id FROM HighRiskUsers)
    GROUP BY mtrf.mch_typ2
)
SELECT 
    mch_typ2,
    cnt
FROM MerchantTypeDistribution
ORDER BY cnt DESC;
2025-04-21 客群分布直方图 
WITH RecentOrder AS (
    SELECT cust_uid, DATEDIFF(CURRENT_DATE(), MAX(trx_dt)) AS recency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
ActiveDaysFrequency AS (
    SELECT cust_uid, COUNT(DISTINCT DATE(trx_dt)) AS frequency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
AverageSpending AS (
    SELECT cust_uid, AVG(trx_amt) AS avg_monetary
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
CalculateRecencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY recency_days DESC) AS recency_score 
    FROM RecentOrder
),
CalculateFrequencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY frequency_days) AS frequency_score
    FROM ActiveDaysFrequency
),
CalculateMonetaryScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY avg_monetary) AS monetary_score
    FROM AverageSpending
),
UserRfmScores AS (
    SELECT 
        crs.user_id,
        crs.recency_score + cfs.frequency_score + cms.monetary_score AS total_score
    FROM CalculateRecencyScores crs
    JOIN CalculateFrequencyScores cfs ON crs.user_id = cfs.user_id
    JOIN CalculateMonetaryScores cms ON crs.user_id = cms.user_id
),
RfmScoreDistribution AS (
    SELECT 
        CASE 
                WHEN total_score < 5 THEN '低价值用户'
                WHEN total_score BETWEEN 6 AND 8 THEN '中价值用户'
                ELSE '高价值用户'
             END AS score_range,
        COUNT(*) AS cnt
    FROM UserRfmScores
    GROUP BY CASE 
                WHEN total_score < 5 THEN '低价值用户'
                WHEN total_score BETWEEN 6 AND 8 THEN '中价值用户'
                ELSE '高价值用户'
             END
)
SELECT 
    score_range,
    CONCAT(REPEAT('■', cnt), ' ', cnt) AS user_distribution
FROM RfmScoreDistribution
2025-04-21 找出流失风险客户(R=1且F<=2) 
WITH RecentOrder AS (
    SELECT cust_uid, DATEDIFF(CURRENT_DATE(), MAX(trx_dt)) AS recency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
ActiveDaysFrequency AS (
    SELECT cust_uid, COUNT(DISTINCT DATE(trx_dt)) AS frequency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
AverageSpending AS (
    SELECT cust_uid, AVG(trx_amt) AS avg_monetary
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
CalculateRecencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY recency_days DESC) AS recency_score 
    FROM RecentOrder
),
CalculateFrequencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY frequency_days) AS frequency_score
    FROM ActiveDaysFrequency
),
CalculateMonetaryScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY avg_monetary) AS monetary_score
    FROM AverageSpending
),
UserRfmScores AS (
    SELECT 
        crs.user_id,
        crs.recency_score,
        cfs.frequency_score,
        cms.monetary_score
    FROM CalculateRecencyScores crs
    JOIN CalculateFrequencyScores cfs ON crs.user_id = cfs.user_id
    JOIN CalculateMonetaryScores cms ON crs.user_id = cms.user_id
)
SELECT 
    user_id,
    recency_score,
    frequency_score,
    monetary_score
FROM UserRfmScores
WHERE recency_score = 1 AND frequency_score <= 2
order by user_id
2025-04-21 统计每一类RFM用户的数量 
WITH RecentOrder AS (
    SELECT cust_uid, DATEDIFF(CURRENT_DATE(), MAX(trx_dt)) AS recency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
ActiveDaysFrequency AS (
    SELECT cust_uid, COUNT(DISTINCT DATE(trx_dt)) AS frequency_days
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
AverageSpending AS (
    SELECT cust_uid, AVG(trx_amt) AS avg_monetary
    FROM mt_trx_rcd_f
    GROUP BY cust_uid
),
CalculateRecencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY recency_days DESC) AS recency_score 
    FROM RecentOrder
),
CalculateFrequencyScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY frequency_days) AS frequency_score
    FROM ActiveDaysFrequency
),
CalculateMonetaryScores AS (
    SELECT 
        cust_uid AS user_id,
        NTILE(3) OVER (ORDER BY avg_monetary) AS monetary_score
    FROM AverageSpending
),
UserRfmScores AS (
    SELECT 
        crs.user_id,
        CONCAT('R', crs.recency_score, 'F', cfs.frequency_score, 'M', cms.monetary_score) AS rfm_category
    FROM CalculateRecencyScores crs
    JOIN CalculateFrequencyScores cfs ON crs.user_id = cfs.user_id
    JOIN CalculateMonetaryScores cms ON crs.user_id = cms.user_id
),
RfmCategoryCount AS (
    SELECT 
        rfm_category,
        COUNT(*) AS user_count
    FROM UserRfmScores
    GROUP BY rfm_category
)
SELECT 
    rfm_category,
    user_count
FROM RfmCategoryCount
ORDER BY rfm_category;