You Might Fall Into This Simple String Trap Too
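Reposted from 程序猿石头 (author: 石头哥)
Author | 程序猿石头
Editor | 晋兆雨
Header image | Licensed download from 视觉中国 (Visual China Group)
About the author: 程序猿石头 (ID: tangleithu), currently a technical expert at Alibaba, a self-described underachiever from Tsinghua, and former backend lead at DJI.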
/**
 * @param status
 * @param result the size must not exceed 1000 bytes
 * @throws RuntimeException if result is larger than 1000 bytes
 */
public XXResult(boolean status, String result) {
    // getBytes() without arguments encodes with the platform default charset
    if (result != null && result.getBytes().length > 1000) {
        throw new RuntimeException("result size more than 1000 bytes!");
    }
    ......
}
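The limit in this constructor is counted in bytes, not characters, and the byte count depends on the charset used for encoding. The following is a minimal sketch of how far the two counts can diverge; the class ByteLengthDemo and its helper byteLength are illustrative names, not part of the original API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares the char count of a string with its encoded byte count.
class ByteLengthDemo {
    // Number of bytes the string occupies when encoded with the given charset.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "WeChat:tangleithu";
        String chinese = "程序猿石头";

        System.out.println(ascii.length());                                   // 17
        System.out.println(byteLength(ascii, StandardCharsets.UTF_8));        // 17
        System.out.println(chinese.length());                                 // 5
        System.out.println(byteLength(chinese, StandardCharsets.UTF_8));      // 15
        System.out.println(byteLength(chinese, StandardCharsets.UTF_16BE));   // 10
    }
}
```

For ASCII text the two counts coincide, which is why a check like this can pass every test written with English strings and still surprise you with Chinese input.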
/**
 * Trims the given string to the specified size.
 * @param input
 * @param trimTo the length in bytes to trim to
 * @return the trimmed String
 */
public static String trimAsByte(String input, int trimTo) {
    if (Objects.isNull(input)) {
        return null;
    }
    // encodes with the platform default charset
    byte[] bytes = input.getBytes();
    if (bytes.length > trimTo) {
        byte[] subArray = Arrays.copyOfRange(bytes, 0, trimTo);
        // decodes with the platform default charset
        return new String(subArray);
    }
    return input;
}
Take trimAsByte("WeChat:tangleithu", 8): the input WeChat:tangleithu is too long, so it is trimmed down to 8 bytes. The byte array goes from [87,101,67,104,97,116,58,116,97,110,103,108,101,105,116,104,117] to [87,101,67,104,97,116,58,116], the string becomes WeChat:t, and the result is correct.
Now consider trimAsByte("程序猿石头", 8)
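To see what that second call does, here is a minimal sketch, assuming UTF-8 throughout. Both naiveTrim (the same logic as trimAsByte, with the charset pinned) and safeTrim (one possible fix that cuts only on character boundaries) are hypothetical helpers written for illustration, not the author's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; neither method comes from the original article.
class TrimDemo {
    // Same logic as trimAsByte, but with UTF-8 fixed explicitly
    // so the outcome does not depend on the platform default.
    static String naiveTrim(String input, int trimTo) {
        if (input == null) {
            return null;
        }
        byte[] bytes = input.getBytes(StandardCharsets.UTF_8);
        if (bytes.length <= trimTo) {
            return input;
        }
        return new String(Arrays.copyOfRange(bytes, 0, trimTo), StandardCharsets.UTF_8);
    }

    // Keeps whole code points only: stops before the first character
    // whose UTF-8 encoding would push the total past maxBytes.
    static String safeTrim(String input, int maxBytes) {
        if (input == null) {
            return null;
        }
        int used = 0;
        int end = 0;
        while (end < input.length()) {
            int charCount = Character.charCount(input.codePointAt(end));
            int size = input.substring(end, end + charCount)
                    .getBytes(StandardCharsets.UTF_8).length;
            if (used + size > maxBytes) {
                break;
            }
            used += size;
            end += charCount;
        }
        return input.substring(0, end);
    }

    public static void main(String[] args) {
        String s = "程序猿石头";
        // 8 bytes cut the third character (3 bytes each in UTF-8) in the middle,
        // so the decoded result ends with the replacement character U+FFFD.
        System.out.println(naiveTrim(s, 8));
        System.out.println(safeTrim(s, 8));                    // 程序
        System.out.println(safeTrim("WeChat:tangleithu", 8));  // WeChat:t
    }
}
```

The safe variant may return fewer bytes than the limit (6 instead of 8 here), which is usually the right trade-off when the goal is to stay under a size cap without corrupting the text.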
/**
 * Constructs a new {@code String} by decoding the specified array of bytes
 * using the platform's default charset. The length of the new {@code
 * String} is a function of the charset, and hence may not be equal to the
 * length of the byte array.
 *
 * <p> The behavior of this constructor when the given bytes are not valid
 * in the default charset is unspecified. The {@link
 * java.nio.charset.CharsetDecoder} class should be used when more control
 * over the decoding process is required.
 *
 * @param bytes
 *        The bytes to be decoded into characters
 *
 * @since JDK1.1
 */
public String(byte bytes[]) {
    // this(bytes, 0, bytes.length); the delegated constructor is shown inlined here,
    // so offset is 0 and length is bytes.length
    checkBounds(bytes, offset, length);
    this.value = StringCoding.decode(bytes, offset, length);
}
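The Javadoc above points to CharsetDecoder for cases that need more control over decoding. A minimal sketch of that route follows, assuming UTF-8 input; the class StrictDecodeDemo and the helper decodeUtf8Strict are illustrative names. With CodingErrorAction.REPORT the decoder throws on invalid bytes instead of quietly producing a string.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decodes UTF-8 strictly, rejecting malformed input.
class StrictDecodeDemo {
    static String decodeUtf8Strict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // Two complete 3-byte characters followed by the first 2 bytes of a third.
        byte[] truncated = {-25, -88, -117, -27, -70, -113, -25, -116};
        try {
            System.out.println(decodeUtf8Strict(truncated));
        } catch (CharacterCodingException e) {
            System.out.println("invalid UTF-8 input: " + e);
        }
    }
}
```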
[-25,-88,-117,-27,-70,-113,-25,-116,-65,-25,-97,-77,-27,-92,-76]
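As a quick check on this byte array, the sketch below decodes it twice with an explicitly named charset. DecodeDemo is an illustrative class name, and ISO-8859-1 is chosen here only as a contrasting charset that is guaranteed to be available; it does not appear in the original article.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: the same bytes mean different things under different charsets.
class DecodeDemo {
    static final byte[] BYTES = {
        -25, -88, -117, -27, -70, -113, -25, -116, -65, -25, -97, -77, -27, -92, -76
    };

    public static void main(String[] args) {
        String asUtf8 = new String(BYTES, StandardCharsets.UTF_8);
        String asLatin1 = new String(BYTES, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8);             // 程序猿石头
        System.out.println(asUtf8.length());    // 5
        System.out.println(asLatin1.length());  // 15, one char per byte, unreadable as text
    }
}
```

Naming the charset at every getBytes and new String call removes the dependency on whatever default the machine happens to have.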
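Let's keep experimenting with this byte array. Interpreted as UTF-8, it means the Chinese text "程序猿石头". As the places marked 1, 2, and 3 above show, no charset was specified, so the system default encoding, UTF-8, was used.
Summary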