Practical Guide | Load Balancing the ASR Engine for Ctrip's Customer Service Bots
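About the author
Yuxiu is a technical expert at Ctrip who focuses on telephony audio/video communication and intelligent customer service bots.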
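User side: intelligent online chatbot (IM); intelligent voice navigation, intelligent voice customer service bot, and intelligent review-invitation plugin (telephone)
Agent side: intelligent ticketing and scheduling system, intelligent quality inspection system, intelligent customer resource management system, intelligent service channels
System infrastructure: intelligent platform deployment, intelligent business monitoring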
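MRCP client: the originator of RTP and SIP/MRCP traffic, such as FreeSWITCH (hereafter FS)
MRCP server: handles MRCP/SIP signaling, and receives and forwards RTP
ASR engine: parses RTP, converts speech to text, and returns the result to the MRCP server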
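First, when the FS and AX devices are relatively fixed, the IP four-tuple of the SIP request (source IP, source port, destination IP, destination port) never changes, because the client and server IP/port are specified in the MRCP profile that FS uses to connect to the MRCP server. As a result, the AX assigns the same MRCP server to FS every time, which clearly defeats the purpose of load balancing.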
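To make the first problem concrete, here is a minimal sketch that assumes the balancer picks a backend by hashing the four-tuple. The article does not describe the AX device's actual algorithm, so the hashing scheme and the backend names are illustrative assumptions only.

```python
import hashlib

def pick_backend(four_tuple, backends):
    """Map (src_ip, src_port, dst_ip, dst_port) to one backend by hashing it."""
    key = "|".join(str(part) for part in four_tuple).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# Hypothetical backend names; the tuple mirrors a fixed FS -> virtual IP pairing.
backends = ["mrcp-server-1", "mrcp-server-2", "mrcp-server-3"]
fs_to_vip = ("192.168.1.99", 5102, "192.168.1.18", 5070)

# A thousand requests with the same four-tuple all land on one backend.
choices = {pick_backend(fs_to_vip, backends) for _ in range(1000)}
print(len(choices))  # 1
```

Because the key never changes, no amount of traffic spreads across the other servers.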
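Second, in telephony scenarios there may be no further SIP exchange for up to half an hour after the 200 OK is received. During that time, MRCP and RTP flow directly between the MRCP client and the MRCP server and bypass the AX device entirely. The AX device's default session persistence timeout is 120 seconds; once it expires, the AX closes the SIP channel, so subsequent SIP messages can no longer be delivered.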
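The second problem can be sketched as an idle-timeout table. This is a simplified model under the assumption that the balancer forgets a session once no SIP message has passed through it for the timeout period; the class and method names are hypothetical.

```python
class PersistenceTable:
    """Toy model of a balancer that drops sessions idle longer than a timeout."""

    def __init__(self, idle_timeout_s=120):
        self.idle_timeout_s = idle_timeout_s
        self.last_seen = {}

    def touch(self, call_id, now_s):
        # Record that a SIP message for this call passed through at now_s.
        self.last_seen[call_id] = now_s

    def is_alive(self, call_id, now_s):
        seen = self.last_seen.get(call_id)
        return seen is not None and now_s - seen <= self.idle_timeout_s

table = PersistenceTable(idle_timeout_s=120)
table.touch("call-1", now_s=0)     # INVITE / 200 OK / ACK all happen around t=0
bye_time_s = 30 * 60               # the BYE arrives after a 30-minute call

print(table.is_alive("call-1", 60))          # True
print(table.is_alive("call-1", bye_time_s))  # False
```

MRCP and RTP never touch the table, so nothing refreshes the entry during the call.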
Option A: implement it with FreeSWITCH's distributor module
Option B: implement it with OpenSIPS
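Option | Advantages | Disadvantages
Option A (FreeSWITCH distributor) | 1. No dependency on a third-party load-balancing component | 1. Configuration is tedious and complex 2. Adding or removing an MRCP server node requires changing the FS configuration files, and the change can only be loaded when no ASR traffic is running 3. High port consumption (each MRCP server needs its own port range) 4. The load-balancing strategy is limited: only proportional allocation is supported, and a single node's minimum share cannot be less than 0
Option B (OpenSIPS) | 1. Simple configuration 2. Adding or removing an MRCP server node only requires updating the OpenSIPS database; this can be done while ASR calls are in progress and takes effect immediately 3. Low port consumption (only one MRCP profile is needed, and multiple MRCP servers share one port range) 4. Many load-balancing strategies are available, including proportional and round-robin | 1. Depends on the third-party load-balancing component OpenSIPS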
FreeSWITCH modules: mod_unimrcp, mod_distributor, mod_dptools (play_and_detect_speech)
OpenSIPS modules: Load Balancer Module, Dispatcher Module, Dialplan Module
Under /usr/local/freeswitch/conf/mrcp_profiles/, configure the profiles that FS uses to connect to the MRCP servers:
tree /usr/local/freeswitch/conf/mrcp_profiles
├── mrcp1.xml
└── mrcp2.xml
<include>
  <profile name="mrcp1" version="2"> <!-- each MRCP server gets a different name here, and a gateway with the same name must exist -->
    <param name="client-ip" value="192.168.1.99"/>
    <param name="client-port" value="client-port-1"/>
    <param name="server-ip" value="server-ip-1"/>
    <param name="server-port" value="8060"/>
    <param name="sip-transport" value="tcp"/> <!-- UDP also works -->
    <param name="rtp-ip" value="192.168.1.99"/>
    <param name="rtp-port-min" value="min-port-1"/>
    <param name="rtp-port-max" value="max-port-1"/>
    <param name="ua-name" value="FreeSWITCH"/>
  </profile>
</include>
Under /usr/local/freeswitch/conf/sip_profiles/external/, configure the gateway files:
tree /usr/local/freeswitch/conf/sip_profiles/external
├── mrcp1.xml
└── mrcp2.xml
<gateway name="mrcp1"> <!-- the gateway name must match the profile name in the mrcp_profile file, or be derivable from it by some rule -->
  <param name="username" value=""/>
  <param name="proxy" value="mrcp1-server-ip:8060"/> <!-- the port may of course differ -->
  <param name="realm" value="mrcp1-server-ip"/>
  <param name="register" value="false"/>
  <param name="rtp-autofix-timing" value="false"/>
  <param name="caller-id-in-from" value="true"/>
  <param name="ping" value="10"/> <!-- interval at which FS probes the proxy address -->
  <param name="ping-max" value="5"/>
  <param name="ping-min" value="2"/>
</gateway>
vim /usr/local/freeswitch/conf/autoload_configs/distributor.conf.xml
<configuration name="distributor.conf" description="Distributor Configuration">
  <lists>
    <list name="mrcp"> <!-- equal weights: the two MRCP servers share traffic 1:1 -->
      <node name="mrcp1" weight="5"/> <!-- the node name must equal the SIP gateway name -->
      <node name="mrcp2" weight="5"/>
    </list>
  </lists>
</configuration>
freeswitch@LPT0596> sofia profile external gwlist down [list the gateways that are down]
mrcp2
freeswitch@LPT0596> expand eval ${distributor mrcp ${sofia profile external gwlist down}} [exclude the down gateways, then get the assigned SM node]
mrcp1
freeswitch@LPT0596> expand eval ${distributor mrcp ${sofia profile external gwlist down}} [if mrcp2 were not down, this would return mrcp2]
mrcp1
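The console session above boils down to "pick a node by weight, skipping the gateways that are down". Below is a rough sketch of that idea. It uses random weighted choice, which is an assumption made for brevity and not necessarily how mod_distributor sequences its nodes; `pick_node` is a hypothetical helper.

```python
import random

def pick_node(weights, down, rng=random):
    """Pick one node by weight, ignoring nodes listed as down."""
    candidates = {name: w for name, w in weights.items() if name not in down and w > 0}
    if not candidates:
        return None  # every node is down or has no weight
    names = list(candidates)
    return rng.choices(names, weights=[candidates[n] for n in names], k=1)[0]

weights = {"mrcp1": 5, "mrcp2": 5}   # same 1:1 weights as distributor.conf.xml
print(pick_node(weights, down={"mrcp2"}))  # mrcp1
```

With both nodes up, each call would get mrcp1 or mrcp2 in roughly equal proportion.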
Run the ASR command play_and_detect_speech with the MRCP server profile name obtained above:
/usr/local/freeswitch/sounds/ivr_prompt_voice.wav detect:unimrcp:mrcp1
{start-input-timers=false,no-input-timeout=10000,recognition-timeout=10000}ahlt_ats
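reload mod_unimrcp : reloads the module so that changes to the FS-to-MRCP-server profiles take effect [only possible when no ASR operation is in progress]
sofia profile external rescan : reloads the FS gateway configuration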
Solution to problem 1
INVITE sip:192.168.1.18:5070 SIP/2.0
Via: SIP/2.0/UDP 192.168.1.99:5102;rport;branch=z9hG4bKQ21yZS46ytrgF
Max-Forwards: 70
From: <sip:192.168.1.99:5102>;tag=4B8SvQe66FNvc
To: <sip:192.168.1.18:5070>
Call-ID: ed9f5f6b-0673-123b-199a-fa163e72d95e
CSeq: 47770741 INVITE
Contact: <sip:192.168.1.99:5102>
User-Agent: FreeSWITCH
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, PRACK, MESSAGE, SUBSCRIBE, NOTIFY, REFER, UPDATE
Supported: timer, 100rel
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 306
v=0
o=FreeSWITCH 2480643166757753319 6144298267054033408 IN IP4 192.168.1.99
s=-
c=IN IP4 192.168.1.99
t=0 0
m=application 9 TCP/MRCPv2 1
a=setup:active
a=connection:existing
a=resource:speechrecog
a=cmid:1
m=audio 31799 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=sendonly
a=mid:1
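By the source IP of the INVITE: not feasible, because the same source IP may send INVITEs for several purposes; FS, for example, may be requesting ASR or placing a call to a mobile phone. Even if it were feasible, source IPs are awkward to maintain.
By the destination IP of the INVITE: not feasible, because this value is identical for every INVITE.
By the User-Agent header of the INVITE: feasible. OpenSIPS reads it through $ua. The UA header cannot be customized per INVITE, but the profile FS uses to connect to the MRCP server can specify one uniform User-Agent, which defaults to FreeSWITCH.
By the 'm' line in the SDP of the INVITE: feasible. OpenSIPS reads it through $(rb{sdp.line,m}). In the message above, "m=application 9 TCP/MRCPv2 1" contains MRCPv2, which identifies the request as ASR.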
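The two feasible signals can be illustrated outside OpenSIPS with a small parser. This is only a sketch: the function name, the shortened sample INVITE, and the "ASR_MRCP_CLIENT" prefix check are assumptions for illustration, not the production routing logic.

```python
def classify_invite(raw_message):
    """Report which of the two feasible signals mark this INVITE as an ASR request."""
    head, _, body = raw_message.partition("\r\n\r\n")
    user_agent = ""
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "user-agent":
            user_agent = value.strip()
    media_lines = [line for line in body.split("\r\n") if line.startswith("m=")]
    return {
        "by_user_agent": user_agent.startswith("ASR_MRCP_CLIENT"),
        "by_sdp_media": any("MRCPv2" in line for line in media_lines),
    }

# A shortened INVITE; only the parts relevant to classification are kept.
invite = "\r\n".join([
    "INVITE sip:192.168.1.18:5070 SIP/2.0",
    "User-Agent: ASR_MRCP_CLIENT_FS_ALI",
    "Content-Type: application/sdp",
    "",
    "v=0",
    "m=application 9 TCP/MRCPv2 1",
    "m=audio 16416 RTP/AVP 0 8",
])
print(classify_invite(invite))  # {'by_user_agent': True, 'by_sdp_media': True}
```

With the default User-Agent of FreeSWITCH, only the SDP check would still identify the request.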
Solution to problem 2
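Module | Advantages | Disadvantages
load_balancer module | Can cap the maximum concurrency of each MRCP server; can monitor the real-time concurrency assigned to each MRCP server | Limited allocation strategies: only idle-first and proportional allocation are supported, and round-robin with memory is not. When a new member joins the MRCP server cluster, all traffic is sent to the new machine, which may face a sudden surge in load
dispatcher module | Many allocation strategies, such as round-robin with memory and hash-based allocation | Cannot cap the maximum concurrency of each MRCP server, so there is a risk of cascading failure when call volume spikes; cannot monitor the real-time concurrency assigned to each MRCP server (although this can be built with other OpenSIPS modules)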
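The "new machine absorbs all traffic" effect of an idle-first strategy can be seen in a small simulation. The node names and load figures are made up for illustration, and both functions are simplified stand-ins for the real modules.

```python
from itertools import cycle

def assign_least_loaded(loads, calls):
    """Idle-first: send each new call to the node with the lowest current load."""
    current = dict(loads)
    assigned = {node: 0 for node in loads}
    for _ in range(calls):
        node = min(current, key=current.get)
        current[node] += 1
        assigned[node] += 1
    return assigned

def assign_round_robin(nodes, calls):
    """Round-robin: rotate through the nodes regardless of their load."""
    assigned = {node: 0 for node in nodes}
    for node, _ in zip(cycle(nodes), range(calls)):
        assigned[node] += 1
    return assigned

# Two busy servers and one freshly added, empty server.
loads = {"old-1": 40, "old-2": 40, "new-1": 0}
print(assign_least_loaded(loads, 30))       # {'old-1': 0, 'old-2': 0, 'new-1': 30}
print(assign_round_robin(list(loads), 30))  # {'old-1': 10, 'old-2': 10, 'new-1': 10}
```

Idle-first sends all 30 new calls to the new server, while round-robin spreads them evenly but ignores how loaded each server already is.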
Solution to problem 3
baidu_mrcp_lb.xml (only the settings specific to this setup are shown; the rest are omitted)
<include>
  <profile name="baidu_mrcp_lb" version="2"> <!-- for the Ali profile, the name is ali_mrcp_lb -->
    <param name="server-ip" value="opensips-ip"/>
    <param name="ua-name" value="ASR_MRCP_CLIENT_FS_BAIDU"/> <!-- for the Ali profile, ua-name is ASR_MRCP_CLIENT_FS_ALI -->
    <param name="sdp-origin" value="FS_MRCP"/>
  </profile>
</include>
When OpenSIPS load-balances the MRCP servers, it relies on the dialplan module to choose which module actually performs the load balancing.
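Database initialization
The attrs column of dialplan is given a special purpose: it carries group_id:method:node_type, where method is DS (dispatcher) or LB (load_balancer)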
INSERT INTO `dialplan`(`dpid`,`pr`,`match_op`,`match_exp`,`match_flags`,`subst_exp`,`repl_exp`,`timerec`,`disabled`,`attrs`) VALUES
(90,1000,1,'^ASR_MRCP_CLIENT_CTRIP_FS_ALI$',0,NULL,NULL,NULL,0,'90:DS:ASR_MRCP_SERVER_CTRIP_ALI'),
(90,1000,1,'^ASR_MRCP_CLIENT_CTRIP_FS_BAIDU$',0,NULL,NULL,NULL,0,'91:LB:ASR_MRCP_SERVER_BAIDU');
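To show how such an attrs value can be decoded, here is a minimal sketch that splits it into its three fields and derives the backup group by adding 10000, the same offset the routing script uses. The function name and the returned dictionary layout are hypothetical.

```python
def parse_attrs(attrs):
    """Split 'group_id:method:node_type' and derive the backup group id."""
    group_id, method, node_type = attrs.split(":", 2)
    if method not in ("DS", "LB"):
        raise ValueError(f"unknown load-balancing method: {method!r}")
    return {
        "group_id": int(group_id),
        "backup_group_id": int(group_id) + 10000,  # backup groups use id + 10000
        "method": method,
        "node_type": node_type,
    }

print(parse_attrs("90:DS:ASR_MRCP_SERVER_CTRIP_ALI"))
print(parse_attrs("91:LB:ASR_MRCP_SERVER_BAIDU"))
```

The first call yields group 90 with backup group 10090 via the dispatcher; the second yields group 91 with backup group 10091 via the load balancer.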
The attrs column of dispatcher serves no practical purpose here
INSERT INTO `dispatcher` (`setid`, `destination`, `state`, `weight`, `priority`, `attrs`, `description`) VALUES
(90, 'sip:192.168.1.190:8060', 0, 1, 100, 'pstn=100', 'IDC_A:ASR_MRCP_SERVER_ALI'),
(90, 'sip:192.168.1.191:8060', 0, 1, 100, 'pstn=100', 'IDC_A:ASR_MRCP_SERVER_ALI'),
(10090, 'sip:192.168.2.198:8060', 0, 1, 100, 'pstn=100', 'IDC_B:ASR_MRCP_SERVER_ALI');
The resources column of load_balancer can cap the maximum concurrency
INSERT INTO `load_balancer`(`group_id`,`dst_uri`,`resources`,`probe_mode`,`description`) VALUES
(91,'sip:192.168.1.180:8060','pstn=50',2,'IDC_A:ASR_MRCP_SERVER_BAIDU'),
(91,'sip:192.168.1.181:8060','pstn=50',2,'IDC_A:ASR_MRCP_SERVER_BAIDU'),
(10091,'sip:192.168.2.188:8060','pstn=50',2,'IDC_B:ASR_MRCP_SERVER_BAIDU');
sudo /usr/local/opensips/sbin/opensipsctl fifo lb_list
Destination:: sip:192.168.1.180:8060 id=1 group=90 enabled=yes auto-reenable=on
Resources::
Resource:: pstn max=50 load=50
Destination:: sip:192.168.1.180:8060 id=2 group=90 enabled=no auto-reenable=on
Resources::
Resource:: pstn max=50 load=0
Destination:: sip:192.168.2.188:8060 id=3 group=90 enabled=yes auto-reenable=on
Resources::
Resource:: pstn max=50 load=10
sudo /usr/local/opensips/sbin/opensipsctl fifo ds_list
PARTITION:: default
SET:: 90
URI:: sip:192.168.1.190:5080 state=Active first_hit_counter=8
attr:: pstn=500
URI:: sip:192.168.1.191:5080 state=Inactive first_hit_counter=0
attr:: pstn=500
SET:: 10090
URI:: sip:192.168.2.198:8060 state=Active first_hit_counter=0
attr:: pstn=100
OpenSIPS script implementation
route{
    # (much code omitted...)
    # check sip INVITE message source ip and port
    if (is_method("INVITE")) {
        xlog("ua = $ua , callid = $ci, fu = $fu , tu = $tu , ru = $ru , du =$du src:$si, $(rb{sdp.line,m})");
        $var(dlgPingTag) = "Pp";
        if ( $ua == "ASR_MRCP_CLIENT_FS" ) { #to_asr_mrcp_server
            $var(dlgPingTag) = ""; # the SIP channel used for ASR must not be probed with OPTIONS
        }
        if ( !create_dialog("$var(dlgPingTag)") ) {
            route(PRINT_LOG, "create_dialog error : Internal Server Error");
            send_reply("500","SM Internal Server Error");
            exit();
        }
        if ( $ua =~ "^ASR_MRCP_CLIENT_CTRIP_FS*" ) { #to_asr_mrcp_server; requires <param name="ua-name" value="ASR_MRCP_CLIENT_FS..."/> in the FS MRCP client profile
            if ( dp_translate("90", "$ua/$avp(dest)", "$var(attrs)") ) { # dialplan lookup
                # attrs is "group_id:method:node_type"; exeLb expects node_type before lb_method
                route(exeLb, $(var(attrs){s.int}), "pstn", $(var(attrs){s.select,2,:}), $(var(attrs){s.select,1,:}));
            }
        } else { # handle other call types, such as calls to mobile phones
            # (much code omitted...)
        }
    }
    exit();
}
#usage : route(exeLb, lb_group_id, resource_type, node_type, lb_method)
#e.g. route(exeLb, 90, "pstn", "ASR_MRCP_SERVER_ALI", "LB")
#e.g. route(exeLb, 90, "pstn", "ASR_MRCP_SERVER_ALI", "DS")
route[exeLb]{
    $var(lb_group_id) = $param(1);
    $var(lb_group_id_bak) = $param(1) + 10000;
    $var(resource_type) = $param(2);
    $var(node_type) = $param(3);
    $var(lb_method) = $param(4);
    xlog("[$fU->$rU] Route $rU to '$var(node_type)' by load_balancer group_id : '$var(lb_group_id)' [back_group_id:'$var(lb_group_id_bak)'], resource_type : '$var(resource_type)', node_type : '$var(node_type)' [ci:$ci] [xcid:$hdr(X-CID)]");
    $var(lbRst) = 0;
    if( $var(lb_method) == "DS" ) {
        $var(lbRst) = ds_select_dst("$var(lb_group_id)", "4");
        if($var(lbRst) == -1) {
            xlog("[exeLb4CM] [$fU->$rU] Failed --->lbRst=$var(lbRst) Route $rU to '$var(node_type)' by dispatcher group_id : '$var(lb_group_id)', resource_type : '$var(resource_type)' [ci:$ci]");
            $var(lbRst) = ds_select_dst("$var(lb_group_id_bak)", "4");
            if($var(lbRst) < 0) {
                xlog("[exeLb4CM] [$fU->$rU] Failed ===>lbRst=$var(lbRst) Route $rU to '$var(node_type)' by dispatcher [back_group_id:'$var(lb_group_id_bak)'], resource_type : '$var(resource_type)' [ci:$ci]");
            }
        }
    } else {
        $var(lbRst) = lb_start_or_next("$var(lb_group_id)", "$var(resource_type)", "s");
        if( $var(lbRst) < 0) {
            xlog("[$fU->$rU] Failed --->lbRst=$var(lbRst) Route $rU to '$var(node_type)' by load_balancer group_id : '$var(lb_group_id)', resource_type : '$var(resource_type)' [ci:$ci]");
            $var(lbRst) = lb_start("$var(lb_group_id_bak)", "$var(resource_type)", "s");
            if( $var(lbRst) < 0) {
                xlog("[$fU->$rU] Failed ===>lbRst=$var(lbRst) Route $rU to '$var(node_type)' by load_balancer [back_group_id:'$var(lb_group_id_bak)'], resource_type : '$var(resource_type)' [ci:$ci]");
            }
        }
    }
    if ( $var(lbRst) > 0) {
        if ( $rU == null ) { # for an MRCP INVITE sent by FS, $rU is null; leaving it unset makes the load balancer fail, so a value must be initialized
            xlog("[$fU->$rU] rU is null, then initialize to 'Null2Sm' [ci:$ci] [xcid:$hdr(X-CID)]");
            #$rU = "Null2SM";
            $ru = "sip:" + $(du{uri.host}) + ":" + $dp;
        } else {
            $ru = "sip:" + $rU + "@" + $(du{uri.host}) + ":" + $dp;
        }
        xlog("[$fU->$rU] Route to '$var(node_type)' --> [$du] [ci:$ci] [xcid:$hdr(X-CID)]");
        route(relay);
    } else {
        xlog("[$fU->$rU] No available '$var(node_type)' now [ci:$ci] [xcid:$hdr(X-CID)]");
        t_reply("480", "$var(node_type) Unavailable");
        exit();
    }
}
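The selection logic of the exeLb route (try the primary group, fall back to the backup group at id + 10000, otherwise reject with 480) can be modeled as follows. This is a simplified sketch: the data layout and `select_destination` are hypothetical, and picking the destination with the most free capacity is an assumption standing in for the module's own algorithm.

```python
def select_destination(groups, group_id, backup_offset=10000):
    """Pick a destination from the primary group, else from the backup group."""
    for gid in (group_id, group_id + backup_offset):
        usable = [d for d in groups.get(gid, []) if d["enabled"] and d["load"] < d["max"]]
        if usable:
            chosen = max(usable, key=lambda d: d["max"] - d["load"])  # most free capacity
            chosen["load"] += 1
            return chosen["uri"]
    return None  # the script would answer 480 at this point

# Primary group 91 is unusable: one node is full, the other is disabled.
groups = {
    91: [
        {"uri": "sip:192.168.1.180:8060", "max": 50, "load": 50, "enabled": True},
        {"uri": "sip:192.168.1.181:8060", "max": 50, "load": 0, "enabled": False},
    ],
    10091: [
        {"uri": "sip:192.168.2.188:8060", "max": 50, "load": 10, "enabled": True},
    ],
}
print(select_destination(groups, 91))  # sip:192.168.2.188:8060
```

Capping each node at its max value is what prevents a traffic spike from overloading a single MRCP server.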
Feb 12 22:27:35 fat5410 /usr/local/opensips/sbin/opensips[3710]: ERROR:core:parse_uri: bad char '@' in state 0 parsed: <sip:> (4) / <sip:@192.168.1.190:8060> (20)
Feb 12 22:27:35 fat5410 /usr/local/opensips/sbin/opensips[3710]: ERROR:core:parse_sip_msg_uri: bad uri <sip:@192.168.1.190:8060>
Feb 12 22:27:35 fat5410 /usr/local/opensips/sbin/opensips[3710]: ERROR:core:pv_get_ruri_attr: failed to parse the R-URI
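The parse errors above show an R-URI of the form sip:@host:port, that is, one built with an empty user part. A minimal sketch of building the URI safely is shown below; `build_ruri` is a hypothetical helper that mirrors the null check in the routing script.

```python
def build_ruri(user, host, port):
    """Build a SIP request URI, omitting 'user@' when there is no user part."""
    if user:
        return f"sip:{user}@{host}:{port}"
    return f"sip:{host}:{port}"

print(build_ruri(None, "192.168.1.190", 8060))       # sip:192.168.1.190:8060
print(build_ruri("Null2SM", "192.168.1.190", 8060))  # sip:Null2SM@192.168.1.190:8060
```

An MRCP INVITE from FS carries no callee number, so the first form is the one that applies.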
FS sends the INVITE to OpenSIPS
2022-02-13 13:50:53 +0800 : 192.168.1.99:5221 -> 192.168.1.18:5070
INVITE sip:192.168.1.18:5070 SIP/2.0 [note that there is no callee number here, so $rU is null once the request reaches OpenSIPS]
Via: SIP/2.0/UDP 192.168.1.99:5221;rport;branch=z9hG4bKFUXeX5r0Q0gXp
Max-Forwards: 70
From: <sip:192.168.1.99:5221>;tag=3pU7FrrQBQ1NS
To: <sip:192.168.1.18:5070>
Call-ID: 5c97aeaf-64b4-123a-02b4-fa163ea03f01
CSeq: 38878534 INVITE
Contact: <sip:192.168.1.99:5221>
User-Agent: ASR_MRCP_CLIENT_FS_ALI
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, PRACK, MESSAGE, SUBSCRIBE, NOTIFY, REFER, UPDATE
Supported: timer, 100rel
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 299
v=0
o=FS_MRCP 1321904497415698834 1659019553944433241 IN IP4 192.168.1.99
s=-
c=IN IP4 192.168.1.99
t=0 0
m=application 9 TCP/MRCPv2 1
a=setup:active
a=connection:new
a=resource:speechrecog
a=cmid:1
m=audio 16416 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=sendonly
a=mid:1
OpenSIPS forwards the INVITE to the MRCP server
2022-02-13 13:50:53 +0800 : 192.168.1.18:5070 -> 192.168.1.190:8060
INVITE sip:192.168.1.18:5070 SIP/2.0 [if $rU were modified, this would read INVITE sip:Null2SM@192.168.1.18:5070 SIP/2.0]
Record-Route: <sip:192.168.1.18:5070;lr;did=de7.2517abb1>
Via: SIP/2.0/UDP 192.168.1.18:5070;branch=z9hG4bK9443.c1265465.0
Via: SIP/2.0/UDP 192.168.1.99:5221;received=192.168.1.99;rport=5221;branch=z9hG4bKFUXeX5r0Q0gXp
Max-Forwards: 69
From: <sip:192.168.1.99:5221>;tag=3pU7FrrQBQ1NS
To: <sip:192.168.1.18:5070>
Call-ID: 5c97aeaf-64b4-123a-02b4-fa163ea03f01
CSeq: 38878534 INVITE
Contact: <sip:192.168.1.99:5221>
User-Agent:ASR_MRCP_CLIENT_FS_ALI
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, PRACK, MESSAGE, SUBSCRIBE, NOTIFY, REFER, UPDATE
Supported: timer, 100rel
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 299
X-UUI: &XCID=0dc0196031626864653EXCIDEND
X-CID: 0dc0196031626864653EXCIDEND
v=0
o=FS_MRCP 1321904497415698834 1659019553944433241 IN IP4 192.168.1.99
s=-
c=IN IP4 192.168.1.99
t=0 0
m=application 9 TCP/MRCPv2 1
a=setup:active
a=connection:new
a=resource:speechrecog
a=cmid:1
m=audio 16416 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=sendonly
a=mid:1
The MRCP server replies with 200 OK and returns the real address that will receive RTP
2022-02-13 13:50:53 +0800 : 192.168.1.190:8060 -> 192.168.1.18:5070
SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.1.18:5070;branch=z9hG4bK9443.c1265465.0
Via: SIP/2.0/UDP 192.168.1.99:5221;received=192.168.1.99;rport=5221;branch=z9hG4bKFUXeX5r0Q0gXp
Record-Route: <sip:192.168.1.18:5070;lr;did=de7.2517abb1>
From: <sip:192.168.1.99:5221>;tag=3pU7FrrQBQ1NS
To: <sip:192.168.1.18:5070>;tag=45D4K1DvpQQXK
Call-ID: 5c97aeaf-64b4-123a-02b4-fa163ea03f01
CSeq: 38878534 INVITE
Contact: <sip:192.168.1.190:8060>
User-Agent: BaiduSpeech SofiaSIP 1.5.0
Accept: application/sdp
Allow: INVITE, ACK, BYE, CANCEL, OPTIONS, PRACK, MESSAGE, SUBSCRIBE, NOTIFY, REFER, UPDATE
Supported: timer, 100rel
Session-Expires: 600;refresher=uac
Min-SE: 120
Content-Type: application/sdp
Content-Disposition: session
Content-Length: 303
v=0
o=BaiduSpeechServer 8512797916186481341 4497985761629564802 IN IP4 192.168.1.190 [IP that receives RTP]
s=-
c=IN IP4 192.168.1.190
t=0 0
m=application 1544 TCP/MRCPv2 1
a=setup:passive
a=connection:new
a=channel:b250b76cea1011eb@speechrecog
a=cmid:1
m=audio 18380 RTP/AVP 0 [port that receives RTP]
a=rtpmap:0 PCMU/8000
a=recvonly
a=mid:1
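Since the 200 OK tells the client where to send audio, a short sketch can show how that address is read from the SDP. It only handles the simple layout seen in this trace (one session-level c= line and one m=audio line); `rtp_target` is a hypothetical helper, not part of FS or OpenSIPS.

```python
def rtp_target(sdp):
    """Return (ip, port) for RTP from a simple SDP body."""
    ip, port = None, None
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            ip = line.split()[2]          # c=IN IP4 <address>
        elif line.startswith("m=audio "):
            port = int(line.split()[1])   # m=audio <port> <proto> <formats>
    return ip, port

answer_sdp = "\n".join([
    "v=0",
    "c=IN IP4 192.168.1.190",
    "m=application 1544 TCP/MRCPv2 1",
    "m=audio 18380 RTP/AVP 0",
    "a=recvonly",
])
print(rtp_target(answer_sdp))  # ('192.168.1.190', 18380)
```

The address belongs to the MRCP server itself, which is why RTP flows directly to it and never passes through the load balancer.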
FS sends the ACK to OpenSIPS
2022-02-13 13:50:53 +0800 : 192.168.1.99:5221 -> 192.168.1.18:5070
ACK sip:192.168.1.190:8060 SIP/2.0
Via: SIP/2.0/UDP 192.168.1.99:5221;rport;branch=z9hG4bKFUXeX5r0Q0gXp
Route: <sip:192.168.1.18:5070;lr;did=355.53f8e331>
Max-Forwards: 70
From: <sip:192.168.1.99:5221>;tag=3pU7FrrQBQ1NS
To: <sip:192.168.1.18:5070>
Call-ID: 5c97aeaf-64b4-123a-02b4-fa163ea03f01
CSeq: 38878534 ACK
Contact: <sip:192.168.1.99:5221>
Content-Length: 0
Finally, OpenSIPS forwards the ACK to the MRCP server
5. Conclusion