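1. Problem description

  While working on a big-data project recently, decoding the client's request parameters threw the following exception:

  java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern

The full error output is as follows: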

java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern
    at java.net.URLDecoder.decode(URLDecoder.java:187)
    at com.z.transformer.util.LoggerUtil.handleRequestBody(LoggerUtil.java:122)
    at com.z.transformer.util.LoggerUtil.handleLogText(LoggerUtil.java:59)
    at com.z.transformer.mr.etl.AnalysisDataMapper.map(AnalysisDataMapper.java:47)
    at com.z.transformer.mr.etl.AnalysisDataMapper.map(AnalysisDataMapper.java:20)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2019-04-16 18:49:28,087 WARN [main] com.z.transformer.util.LoggerUtil: 解码失败:tt=%E6%A8%AA%E5%BA%97%E5%A5%BD%E5%9C%A8%E5%93%AA%E9%87%8C%EF%BC%9F%E4%B8%BA%E4%BB%80%E4%B9%88%E5%BE%88%E5%A4%9A%E6%98%8E%E6%98%9F%E9%83%BD%E5%8E%BB%E6%A8%AA%E5%BA%97%E6%8B%8D%E6%88%8F_%E9%A9%B4%E5%A6%88%E5%A6%88%E6%97%85%E6%B8%B8%E8%B5%84%

java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern

2019-04-16 18:49:31,067 WARN [main] com.z.transformer.util.LoggerUtil: 解码失败:p_url=http%3A%2F%2Fwww.baidu.com%2Finfo%2

java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern

The relevant code is shown below; the exception is thrown by the URLDecoder.decode calls inside the try block:

/**
 * Parses the request parameters.
 * The result is stored in the clientInfo map.
 *
 * @param clientInfo
 *            map that holds the final user-behavior data
 * @param requestBody
 *            user-behavior data from the request parameters, in the format:
 *            u_nu=1&u_sd=6D4F89C0-E17B-45D0-BFE0-059644C1878D&c_time=
 *            1450569596991&ver=1&en=e_l&pl=website&sdk=js&b_rst=1440*900&
 *            u_ud=4B16B8BB-D6AA-4118-87F8-C58680D22657&b_iev=Mozilla%2F5.0%
 *            20(Windows%20NT%205.1)%20AppleWebKit%2F537.36%20(KHTML%2C%
 *            20like%20Gecko)%20Chrome%2F45.0.2454.101%20Safari%2F537.36&l=
 *            zh-CN&bf_sid=33cbf257-3b11-4abd-ac70-c5fc47afb797_11177014
 */
private static void handleRequestBody(Map<String, String> clientInfo, String requestBody) {
    // Split the request body on '&'
    String[] parameters = requestBody.split("&");
    for (String parameter : parameters) {
        // Each parameter has the form c_time=1450569596991; '=' appears only once
        String[] params = parameter.split("=");
        String key, value = null;
        try {
            // Decode as UTF-8
            key = URLDecoder.decode(params[0].trim(), "UTF-8");
            value = URLDecoder.decode(params[1].trim(), "UTF-8");
            // Add the pair to the result map
            clientInfo.put(key, value);
        } catch (Exception e) {
            // "解码失败" means "decoding failed"; it is the message seen in the log above
            logger.warn("解码失败:" + parameter, e);
        }
    }
}

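The exception is raised while decoding the parameter value, params[1].trim(). The values sent by the browser are percent-encoded, for example:

b_iev=Mozilla%2F5.0%20(Windows%20NT%205.1)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F45.0.2454.101%20Safari%2F537.36

The root cause is that % is a special character in URL encoding: it introduces an escape and must be followed by two hexadecimal digits, so a literal % has to be escaped as %25. In the failing log lines above, the values end in a dangling "%" or "%2" (they appear to have been cut off in the middle of an escape), which URLDecoder cannot decode.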
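The failure can be reproduced in isolation. The sketch below is only an illustration: the class name IncompleteEscapeDemo and the helper failsToDecode are made up for this example and are not part of the project code. It feeds URLDecoder a complete value, the truncated p_url value from the log, and a value with a lone percent sign.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of the original project.
public class IncompleteEscapeDemo {

    // Returns true if URLDecoder rejects the input with IllegalArgumentException.
    public static boolean failsToDecode(String encoded) {
        try {
            URLDecoder.decode(encoded, "UTF-8");
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown for a '%' that is not followed by two hex digits
            return true;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Complete escape sequences decode fine
        System.out.println(failsToDecode("http%3A%2F%2Fwww.baidu.com%2Finfo%2F")); // false
        // Value from the log above, cut off after "%2"
        System.out.println(failsToDecode("http%3A%2F%2Fwww.baidu.com%2Finfo%2"));  // true
        // A lone trailing percent sign
        System.out.println(failsToDecode("100%"));                                 // true
    }
}
```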

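2. Solution

  Before calling decode, we only need to run the incoming parameter string through replaceAll so that stray percent signs become %25.

  One point to stress: many fixes found online replace every percent sign, i.e. replaceAll("%", "%25").

  That approach is wrong, because it also replaces the percent signs that belong to valid escape sequences.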
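To see why replacing every percent sign goes wrong, consider this small sketch. The class NaiveReplaceDemo and the method naiveDecode are hypothetical names used only for illustration. Every %XX escape becomes %25XX, so decoding merely restores the original encoded text instead of decoding it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of the original project.
public class NaiveReplaceDemo {

    // The naive approach: escape ALL percent signs, then decode.
    public static String naiveDecode(String encoded) {
        try {
            return URLDecoder.decode(encoded.replaceAll("%", "%25"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = "Mozilla%2F5.0%20(Windows)";
        // No exception, but nothing is decoded either: the output equals the input
        System.out.println(naiveDecode(encoded)); // Mozilla%2F5.0%20(Windows)
    }
}
```

The exception disappears, but so does the decoding, which defeats the purpose of calling decode at all.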

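  The key point is that only the percent signs that stand alone need to be replaced, not all of them.

  The correct fix:

  replaceAll("%(?![0-9a-fA-F]{2})", "%25")

  A word on %(?![0-9a-fA-F]{2}): this is a regular expression with a negative lookahead. It matches a % only when it is not followed by two hexadecimal digits (0-9, a-f, A-F). Valid escapes such as %2F are therefore left alone, and only the stray percent signs are matched and replaced.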
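Put together, the fix might look like the following minimal sketch. The class SafeUrlDecode and the helper safeDecode are assumed names for this example; in the original code the same replaceAll call would simply be applied to params[0] and params[1] before URLDecoder.decode.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper class, not part of the original project.
public class SafeUrlDecode {

    // A '%' that is NOT followed by two hex digits (negative lookahead)
    private static final String STRAY_PERCENT = "%(?![0-9a-fA-F]{2})";

    // Escape stray '%' as "%25", then decode as UTF-8.
    public static String safeDecode(String encoded) {
        String repaired = encoded.replaceAll(STRAY_PERCENT, "%25");
        try {
            return URLDecoder.decode(repaired, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Truncated value from the log: the dangling "%2" is kept as literal text
        System.out.println(safeDecode("http%3A%2F%2Fwww.baidu.com%2Finfo%2"));
        // prints http://www.baidu.com/info%2

        // Well-formed escapes are still decoded normally
        System.out.println(safeDecode("Mozilla%2F5.0%20(Windows%20NT%205.1)"));
        // prints Mozilla/5.0 (Windows NT 5.1)

        // A lone percent sign survives as a literal '%'
        System.out.println(safeDecode("100%"));
        // prints 100%
    }
}
```

Note that this only prevents the exception. If the value was truncated upstream, the missing characters are still lost; the dangling fragment is just preserved as literal text.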

3. Result

  [Screenshot: output after applying the fix]

4. References

  https://blog.csdn.net/u013066244/article/details/79552732
