Exporting Millions of MySQL Rows with Spring Boot Without Running Out of Memory
Reference: https://grokonez.com/spring-framework/spring-boot/excel-file-download-from-springboot-restapi-apache-poi-mysql
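Exporting a very large data set by loading all of it into memory can end in an OutOfMemoryError. Before building such a feature, it is worth asking a few questions:
Why do we need to export this much data? Who is going to read a file this large, and is the design reasonable?
How will access be controlled? Are you sure that exporting millions of rows will not leak business secrets?
If millions of rows really have to be exported, why not hand the job to the big-data team or a DBA and deliver the result by email?
Why implement it in back-end logic at all, given the time and bandwidth costs?
If the export were paginated, with each click exporting only 20,000 rows, would batched export not meet the business need?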
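If the export still has to be built, two points matter most:
Do not load the full data set into memory at once.
Use CSV instead of Excel.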
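To make the two points concrete, here is a minimal sketch of writing CSV rows one at a time from an iterator, so that only the current row is held by the writer loop. The class name StreamingCsvSketch and the method writeRows are illustrative assumptions, and the sketch does no CSV escaping.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

// Hypothetical helper, for illustration only.
final class StreamingCsvSketch {

    /**
     * Writes each row as one comma-separated line and returns the row count.
     * Rows are pulled from the iterator one by one, so the full data set
     * never has to exist as a single in-memory collection.
     */
    static long writeRows(Iterator<String[]> rows, Writer out) throws IOException {
        long count = 0;
        while (rows.hasNext()) {
            out.write(String.join(",", rows.next()));
            out.write("\n");
            count++;
        }
        out.flush();
        return count;
    }
}
```

The iterator can be backed by anything that produces rows lazily, such as a database cursor; the approaches below obtain that laziness from the JDBC driver.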
Exporting Millions of Rows with JPA
For the details of this approach, see: http://knes1.github.io/blog/2015/2015-10-19-streaming-mysql-results-using-java8-streams-and-spring-data.html
The corresponding project: https://github.com/knes1/todo
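The key annotations go on the Repository method. Declare the method's return type as Stream, and set the fetch-size query hint (HINT_FETCH_SIZE, Hibernate's org.hibernate.fetchSize) to Integer.MIN_VALUE, which tells the MySQL JDBC driver to return rows one at a time instead of buffering the whole result set.
@QueryHints(value = @QueryHint(name = HINT_FETCH_SIZE, value = "" + Integer.MIN_VALUE))
@Query(value = "select t from Todo t")
Stream<Todo> streamAll();
In addition, annotate the method that consumes the Stream with @Transactional(readOnly = true) so that the transaction is read-only. Also inject a javax.persistence.EntityManager and call detach on each entity once it has been written, so that objects already processed are removed from memory.
@Transactional(readOnly = true)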
public void exportTodosCSV(HttpServletResponse response) {
    response.addHeader("Content-Type", "application/csv");
    response.addHeader("Content-Disposition", "attachment; filename=todos.csv");
    response.setCharacterEncoding("UTF-8");
    try (Stream<Todo> todoStream = todoRepository.streamAll()) {
        PrintWriter out = response.getWriter();
        todoStream.forEach(rethrowConsumer(todo -> {
            String line = todoToCSV(todo);
            out.write(line);
            out.write("\n");
            entityManager.detach(todo);
        }));
        out.flush();
    } catch (IOException e) {
        log.info("Exception occurred " + e.getMessage(), e);
        throw new RuntimeException("Exception occurred while exporting results", e);
    }
}
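The export method above calls rethrowConsumer, which this article does not show. Its job is to let a lambda that may throw a checked exception be passed where a java.util.function.Consumer is expected. A minimal sketch of such a helper follows, assuming the widely used "sneaky throw" pattern; the names LambdaExceptionUtil and ThrowingConsumer are illustrative and may differ from the referenced project.

```java
import java.util.function.Consumer;

// Hypothetical helper, for illustration only.
final class LambdaExceptionUtil {

    /** Like Consumer, but accept may throw a checked exception of type E. */
    @FunctionalInterface
    interface ThrowingConsumer<T, E extends Exception> {
        void accept(T t) throws E;
    }

    /**
     * Wraps a throwing consumer as a plain Consumer. The "throws E" clause
     * makes the caller handle the checked exception at the call site.
     */
    static <T, E extends Exception> Consumer<T> rethrowConsumer(
            ThrowingConsumer<T, E> consumer) throws E {
        return t -> {
            try {
                consumer.accept(t);
            } catch (Exception e) {
                throwAsUnchecked(e);
            }
        };
    }

    // Rethrows the original exception unchanged; the unchecked generic cast
    // lets it pass the compiler's checked-exception analysis.
    @SuppressWarnings("unchecked")
    private static <E extends Throwable> void throwAsUnchecked(Exception e) throws E {
        throw (E) e;
    }
}
```

The original exception type is preserved, which is why the surrounding try block in the export method can still catch IOException.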
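Exporting Millions of Rows with MyBatis
To fetch rows one at a time with MyBatis, define a custom ResultHandler, then add fetchSize="-2147483648" to the corresponding select statement in the mapper.xml file.
A Complete MyBatis Example
Below is a complete sample project for MyBatis stream export. We compare the memory usage of the streaming export with that of the traditional export to verify that streaming is effective.
First, define a utility class, DownloadProcessor. It wraps an HttpServletResponse object and writes objects to the CSV output.
public class DownloadProcessor {
    private final HttpServletResponse response;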
    public DownloadProcessor(HttpServletResponse response) {
        this.response = response;
        String fileName = System.currentTimeMillis() + ".csv";
        this.response.addHeader("Content-Type", "application/csv");
        this.response.addHeader("Content-Disposition", "attachment; filename=" + fileName);
        this.response.setCharacterEncoding("UTF-8");
    }
    public <E> void processData(E record) {
        try {
            // For CSV output, the record must override toString() and join its fields with ","
            response.getWriter().write(record.toString());
            response.getWriter().write("\n");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Next, implement org.apache.ibatis.session.ResultHandler to define our own ResultHandler. It receives each Java object and passes it to the DownloadProcessor above, which writes it to the file:
public class CustomResultHandler implements ResultHandler {
    private final DownloadProcessor downloadProcessor;
    public CustomResultHandler(
            DownloadProcessor downloadProcessor) {
        super();
        this.downloadProcessor = downloadProcessor;
    }
    @Override
    public void handleResult(ResultContext resultContext) {
        // Called once per row; hand the mapped object straight to the writer
        Authors authors = (Authors) resultContext.getResultObject();
        downloadProcessor.processData(authors);
    }
}
The entity class:
public class Authors {
    private Integer id;
    private String firstName;
    private String lastName;
    private String email;
    private Date birthdate;
    private Date added;
    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String getFirstName() {
        return firstName;
    }
    public void setFirstName(String firstName) {
        this.firstName = firstName == null ? null : firstName.trim();
    }
    public String getLastName() {
        return lastName;
    }
    public void setLastName(String lastName) {
        this.lastName = lastName == null ? null : lastName.trim();
    }
    public String getEmail() {
        return email;
    }
    public void setEmail(String email) {
        this.email = email == null ? null : email.trim();
    }
    public Date getBirthdate() {
        return birthdate;
    }
    public void setBirthdate(Date birthdate) {
        this.birthdate = birthdate;
    }
    public Date getAdded() {
        return added;
    }
    public void setAdded(Date added) {
        this.added = added;
    }
    // Joins the fields with "," so that DownloadProcessor can write one CSV line per record
    @Override
    public String toString() {
        return this.id + "," + this.firstName + "," + this.lastName + "," + this.email + "," + this.birthdate + "," + this.added;
    }
}
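One caveat with joining fields by "," in toString: a value that itself contains a comma, a double quote, or a line break will corrupt the CSV line, and null fields are written as the text "null". A minimal sketch of quoting in the style of RFC 4180 follows; the class CsvEscapeSketch and its methods are illustrative assumptions and are not part of the sample project.

```java
// Hypothetical helper, for illustration only.
final class CsvEscapeSketch {

    /** Returns the field as CSV text: null becomes empty, special characters force quoting. */
    static String escape(Object value) {
        if (value == null) {
            return "";
        }
        String s = value.toString();
        boolean needsQuotes = s.contains(",") || s.contains("\"")
                || s.contains("\n") || s.contains("\r");
        if (!needsQuotes) {
            return s;
        }
        // Inside a quoted field, a double quote is written as two double quotes
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    /** Joins the escaped fields with commas into a single CSV line (no trailing newline). */
    static String toCsvLine(Object... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(fields[i]));
        }
        return sb.toString();
    }
}
```

With such a helper, the toString method could return CsvEscapeSketch.toCsvLine(id, firstName, lastName, email, birthdate, added) instead of concatenating the raw values.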
The mapper interface:
public interface AuthorsMapper {
    List<Authors> selectByExample(AuthorsExample example);
    List<Authors> streamByExample(AuthorsExample example); // fetches data from MySQL as a stream
}
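The core fragments of the mapper XML follow. The only difference between the two select statements is that the streaming one carries an extra attribute, fetchSize="-2147483648":
<select id="selectByExample" parameterType="com.alphathur.mysqlstreamingexport.domain.AuthorsExample" resultMap="BaseResultMap">
    select
    <if test="distinct">
        distinct
    </if>
    'false' as QUERYID,
    <include refid="Base_Column_List" />
    from authors
    <if test="_parameter != null">
        <include refid="Example_Where_Clause" />
    </if>
    <if test="orderByClause != null">
        order by ${orderByClause}
    </if>
</select>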
<select id="streamByExample" fetchSize="-2147483648" parameterType="com.alphathur.mysqlstreamingexport.domain.AuthorsExample" resultMap="BaseResultMap">
    select
    <if test="distinct">
        distinct
    </if>
    'false' as QUERYID,
    <include refid="Base_Column_List" />
    from authors
    <if test="_parameter != null">
        <include refid="Example_Where_Clause" />
    </if>
    <if test="orderByClause != null">
        order by ${orderByClause}
    </if>
</select>
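The value -2147483648 is Integer.MIN_VALUE. MySQL Connector/J treats a forward-only, read-only statement with this fetch size as a request to stream the result set row by row. As a rough illustration of what the fetchSize attribute amounts to at the JDBC level, here is a minimal plain-JDBC sketch. The class name, the export method, and the two-column query are assumptions made for this example and are not part of the sample project.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical sketch, for illustration only.
final class JdbcStreamingSketch {

    // Same value as fetchSize="-2147483648" in the mapper XML
    static final int STREAMING_FETCH_SIZE = Integer.MIN_VALUE;

    /** Streams two columns of the authors table to the writer and returns the row count. */
    static long export(Connection connection, Writer out) throws SQLException, IOException {
        long rows = 0;
        try (PreparedStatement ps = connection.prepareStatement(
                "select id, email from authors",
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY)) {
            // Signals the MySQL driver to stream rows instead of buffering them all
            ps.setFetchSize(STREAMING_FETCH_SIZE);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    out.write(rs.getInt("id") + "," + rs.getString("email") + "\n");
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

While a streaming result set is open, the same connection cannot run other statements, so the result set should be read to the end or closed before the connection is reused.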
The core service that fetches the data follows. The streamDownload method is the streaming implementation: it reads from MySQL and writes the file with very low memory usage. The traditionDownload method is the traditional approach: it loads all rows in one batch and then writes each object to the file.
@Service
public class AuthorsService {
    private final SqlSessionTemplate sqlSessionTemplate;
    private final AuthorsMapper authorsMapper;
    public AuthorsService(SqlSessionTemplate sqlSessionTemplate, AuthorsMapper authorsMapper) {
        this.sqlSessionTemplate = sqlSessionTemplate;
        this.authorsMapper = authorsMapper;
    }
    /**
     * Streaming approach: read rows and write them to the file as they arrive.
     * @param httpServletResponse
     * @throws IOException
     */
    public void streamDownload(HttpServletResponse httpServletResponse)
            throws IOException {
        AuthorsExample authorsExample = new AuthorsExample();
        authorsExample.createCriteria();
        HashMap<String, Object> param = new HashMap<>();
        param.put("oredCriteria", authorsExample.getOredCriteria());
        param.put("orderByClause", authorsExample.getOrderByClause());
        CustomResultHandler customResultHandler = new CustomResultHandler(new DownloadProcessor(httpServletResponse));
        // MyBatis invokes the handler once per row instead of building a List
        sqlSessionTemplate.select(
                "com.alphathur.mysqlstreamingexport.mapper.AuthorsMapper.streamByExample", param, customResultHandler);
        httpServletResponse.getWriter().flush();
        httpServletResponse.getWriter().close();
    }
    /**
     * Traditional approach: load every row into a List, then write the file.
     * @param httpServletResponse
     * @throws IOException
     */
    public void traditionDownload(HttpServletResponse httpServletResponse)
            throws IOException {
        AuthorsExample authorsExample = new AuthorsExample();
        authorsExample.createCriteria();
        List<Authors> authors = authorsMapper.selectByExample(authorsExample);
        DownloadProcessor downloadProcessor = new DownloadProcessor(httpServletResponse);
        authors.forEach(downloadProcessor::processData);
        httpServletResponse.getWriter().flush();
        httpServletResponse.getWriter().close();
    }
}
The controller that serves as the download entry point:
@RestController
@RequestMapping("download")
public class HelloController {
    private final AuthorsService authorsService;
    public HelloController(AuthorsService authorsService) {
        this.authorsService = authorsService;
    }
    @GetMapping("streamDownload")
    public void streamDownload(HttpServletResponse response)
            throws IOException {
        authorsService.streamDownload(response);
    }
    @GetMapping("traditionDownload")
    public void traditionDownload(HttpServletResponse response)
            throws IOException {
        authorsService.traditionDownload(response);
    }
}
The table definition for the entity:
CREATE TABLE `authors` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `first_name` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
    `last_name` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
    `email` varchar(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
    `birthdate` date NOT NULL,
    `added` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=10095 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
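One way to create a large volume of test rows in MySQL quickly is a stored procedure combined with INSERT ... SELECT statements. A generated test data set is also available for download:
Link: https://pan.baidu.com/s/1hqnWU2JKlL4Tb9nWtJl4sw (extraction code: nrp0)
With the test data loaded, start the project and open jconsole.exe, which ships in the JDK bin directory, to watch memory usage.
First, test the traditional download by opening http://localhost:8080/download/traditionDownload in a browser.
Then test the streaming download at http://localhost:8080/download/streamDownload. Once the download starts, memory usage also rises noticeably, but it peaks at only about 500 MB. Compared with the traditional approach, memory usage drops by a full 80%.
Thanks for reading; I hope this helps :)
Source: blog.csdn.net/haohao_ding/article/details/123164771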