使用 Spring Boot 应用集成 ClickHouse 打造数据分析利器
使用 Spring Boot 应用集成 ClickHouse 打造数据分析利器
在当今竞争激烈的商业环境中,企业对于数据的依赖程度与日俱增,数据分析已成为决策制定和业务优化的关键驱动力。Spring Boot 作为一款备受青睐的 Java 框架,以其简洁高效的特性为构建强大的应用提供了坚实基础。而将 ClickHouse 与 Spring Boot 集成,能够打造出功能强大的数据分析利器。
ClickHouse 是一款专为联机分析处理(OLAP)工作负载而设计的高性能列式数据库管理系统。它在处理大规模数据时展现出了卓越的性能和效率。
ClickHouse 具有诸多显著特性。首先,其采用了列式存储结构,这意味着相同类型的数据被紧密存储在一起,大大提高了数据压缩比和查询效率。对于大规模数据的复杂查询,ClickHouse 能够迅速响应,这得益于其优化的索引机制和高效的计算引擎。此外,它支持丰富的函数和操作符,能够灵活地处理各种数据处理需求。而且,ClickHouse 具有良好的可扩展性,可以轻松应对不断增长的数据量和查询负载。
为了更好地了解用户行为和优化服务,我们还将增加用户访问记录的统计与分析功能。通过记录用户的访问时间、访问页面、操作行为等信息,深入挖掘用户的兴趣和行为模式。
例如,假设我们运营一个电商网站。通过对用户访问记录的分析,我们发现每天下午 3 点到 5 点之间,商品详情页的访问量明显高于其他时间段。进一步分析发现,这段时间内访问的用户主要来自特定的地区,并且对某一类特定商品表现出了浓厚兴趣。基于这些数据,我们可以在这个时间段针对该地区的用户推出相关商品的促销活动,从而提高销售额。
项目运行效果页:
user_access_records
表的 DDL 定义:
CREATE TABLE user_access_records (
id UInt64,
access_time DateTime,
region String,
page String,
product String
) ENGINE = MergeTree()
ORDER BY (access_time);
向 user_access_records
表添加 20 条测试数据的示例:
INSERT INTO user_access_records (id, access_time, region, page, product)
VALUES
(1, '2024-08-13 10:00:00', '北京', '商品详情页', '手机'),
(2, '2024-08-13 11:00:00', '上海', '首页', '笔记本电脑'),
(3, '2024-08-13 12:00:00', '广州', '商品列表页', '耳机'),
(4, '2024-08-13 13:00:00', '深圳', '商品详情页', '平板'),
(5, '2024-08-13 14:00:00', '成都', '购物车页', '笔记本电脑'),
(6, '2024-08-13 15:00:00', '杭州', '商品详情页', '相机'),
(7, '2024-08-13 16:00:00', '武汉', '订单页', '音箱'),
(8, '2024-08-13 17:00:00', '重庆', '商品列表页', '手机'),
(9, '2024-08-13 18:00:00', '南京', '商品详情页', '笔记本电脑'),
(10, '2024-08-13 19:00:00', '天津', '首页', '笔记本电脑'),
(11, '2024-08-13 20:00:00', '西安', '商品详情页', '显示器'),
(12, '2024-08-13 21:00:00', '长沙', '商品列表页', '平板'),
(13, '2024-08-13 22:00:00', '青岛', '商品详情页', '路由器'),
(14, '2024-08-13 23:00:00', '大连', '订单页', '手机'),
(15, '2024-08-14 00:00:00', '厦门', '商品详情页', '笔记本电脑'),
(16, '2024-08-14 01:00:00', '沈阳', '商品列表页', '移动硬盘'),
(17, '2024-08-14 02:00:00', '济南', '商品详情页', '平板电脑'),
(18, '2024-08-14 03:00:00', '福州', '购物车页', '平板电脑'),
(19, '2024-08-14 04:00:00', '合肥', '商品详情页', '电视'),
(20, '2024-08-14 05:00:00', '郑州', '首页', '平板');
项目文件结构:
project-root/
src/
main/
java/
com/
icoderoad/
UserAccessAnalysisApplication.java // Spring Boot 启动类
controller/
UserAccessController.java // 控制器类
service/
UserAccessAnalysisService.java // 服务类
resources/
application.yaml // 配置文件
static/
js/
index.js
index.html // 静态资源,如 CSS、JS、图片等
templates/
index.html
pom.xml
项目依赖配置(pom.xml)
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.3.2</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>om.icoderoad</groupId>
<artifactId>clickhouse-integration</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>clickhouse-integration</name>
<description>clickhouse-integration project for Spring Boot</description>
<properties>
<java.version>17</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<!-- Spring Boot 核心依赖,提供基础的 Web 支持 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- ClickHouse 的 JDBC 驱动,用于与数据库建立连接和执行操作 -->
<dependency>
<groupId>ru.yandex.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.3.1</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<!-- Thymeleaf 模板引擎,用于构建动态的前端页面 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
配置文件(application.yaml)
spring:
datasource:
url: jdbc:clickhouse://localhost:8123/default
username: default
password:
driver-class-name: ru.yandex.clickhouse.ClickHouseDriver
# 配置日志级别,以便在开发过程中更好地调试
logging:
level:
root: INFO
aop:
auto: true
后端代码示例
应用启动类:ClickhouseIntegrationApplication :
package om.icoderoad.clickhouse_integration;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class ClickhouseIntegrationApplication {
public static void main(String[] args) {
SpringApplication.run(ClickhouseIntegrationApplication.class, args);
}
}
UserAccessAnalysisService 类:
package om.icoderoad.clickhouse_integration.service;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import java.util.HashMap;
import java.util.Map;
@Service
public class UserAccessAnalysisService {
private JdbcTemplate jdbcTemplate;
public UserAccessAnalysisService(JdbcTemplate jdbcTemplate) {
this.jdbcTemplate = jdbcTemplate;
}
public Map<String, Object> analyzeUserAccessWithParams(
String startTime, String endTime, String region, String productType) {
Map<String, Object> result = new HashMap<>();
// 统计下午 3 点到 5 点之间商品详情页的访问量
String sql = "SELECT COUNT(*) as detailPageVisits FROM user_access_records " +
"WHERE access_time >= '" + startTime + "' AND access_time <= '" + endTime + "' " +
"AND page = '商品详情页'";
int detailPageVisits = jdbcTemplate.queryForObject(sql, Integer.class);
result.put("detailPageVisits", detailPageVisits);
// 统计这段时间内来自特定地区的访问量
sql = "SELECT COUNT(*) as regionVisits FROM user_access_records " +
"WHERE access_time >= '" + startTime + "' AND access_time <= '" + endTime + "' " +
"AND region = '" + region + "'";
int regionVisits = jdbcTemplate.queryForObject(sql, Integer.class);
result.put("regionVisits", regionVisits);
// 统计对特定商品的访问量
sql = "SELECT COUNT(*) as specificProductVisits FROM user_access_records " +
"WHERE access_time >= '" + startTime + "' AND access_time <= '" + endTime + "' " +
"AND product = '" + productType + "'";
int specificProductVisits = jdbcTemplate.queryForObject(sql, Integer.class);
result.put("specificProductVisits", specificProductVisits);
return result;
}
}
视图显示类 UserAccessController
package om.icoderoad.clickhouse_integration.controller;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
@Controller
@RequestMapping("/")
public class UserAccessController {
@GetMapping("/statistics")
public String index(Model model) {
return "index";
}
}
接口类 UserAccessRestController
package om.icoderoad.clickhouse_integration.controller;
import java.util.Map;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import om.icoderoad.clickhouse_integration.service.UserAccessAnalysisService;
@RestController
public class UserAccessRestController {
@Autowired
private UserAccessAnalysisService userAccessAnalysis;
@GetMapping("/accessStatistics")
public Map<String, Object> accessStatistics(
@RequestParam("startTime") String startTime,
@RequestParam("endTime") String endTime,
@RequestParam("region") String region,
@RequestParam("productType") String productType) {
return userAccessAnalysis.analyzeUserAccessWithParams(startTime, endTime, region, productType);
}
}
前端页面 index.html:
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
<meta charset="UTF-8">
<title>访问量统计</title>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css">
<script src="https://cdn.jsdelivr.net/npm/echarts@5.5.1/dist/echarts.min.js"></script>
<!-- 引入 jQuery -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script src="/js/index.js"></script>
<style>
body {
font-family: Arial, sans-serif;
background-color: #f8f9fa;
}
.container {
padding: 20px;
}
h1 {
color: #343a40;
}
form label {
font-weight: bold;
}
input[type="datetime-local"],
input[type="text"] {
border-radius: 5px;
}
button {
background-color: #007bff;
color: white;
border: none;
border-radius: 5px;
padding: 10px 20px;
}
button:hover {
background-color: #0056b3;
cursor: pointer;
}
</style>
</head>
<body>
<div class="container">
<h1 class="mt-5">访问量统计</h1>
<form class="mt-3" action="/accessStatistics" method="get">
<div class="row">
<div class="col-md-3">
<label for="startTime">开始时间</label>
<input type="datetime-local" class="form-control" id="startTime" name="startTime">
</div>
<div class="col-md-3">
<label for="endTime">结束时间</label>
<input type="datetime-local" class="form-control" id="endTime" name="endTime">
</div>
<div class="col-md-3">
<label for="region">地区</label>
<input type="text" class="form-control" id="region" name="region">
</div>
<div class="col-md-3">
<label for="productType">商品类型</label>
<input type="text" class="form-control" id="productType" name="productType">
</div>
</div>
<button type="submit" class="btn btn-primary mt-3">统计</button>
</form>
<div class="mt-5" id="chartContainer" style="width: 800px; height: 600px;">
<div id="chart1" style="width: 800px; height: 240px;"></div>
<div id="chart2" style="width: 800px; height: 240px;"></div>
<div id="chart3" style="width: 800px; height: 240px;"></div>
</div>
</div>
</body>
</html>
JavaScript index.js 代码:
$(document).ready(function() {
$('#chartContainer').hide();
$('form').submit(function(event) {
event.preventDefault();
var startTime = $('#startTime').val();
startTime = startTime.replace('T', ' ') + ":00";
var endTime = $('#endTime').val();
endTime = endTime.replace('T', ' ') + ":00";
var region = $('#region').val();
var productType = $('#productType').val();
$.ajax({
url: '/accessStatistics',
type: 'GET',
data: {
startTime: startTime,
endTime: endTime,
region: region,
productType: productType
},
success: function(data) {
var specificProductVisits = data.specificProductVisits;
var accessCount = data.detailPageVisits;
var regionVisits = data.regionVisits;
console.log("accessCount:", accessCount)
var total = specificProductVisits + accessCount + regionVisits;
$('#chartContainer').show();
console.log("chart1:", document.getElementById('chart1'));
var chart1 = echarts.init(document.getElementById('chart1'));
// 指定图表的配置项和数据
var option1 = {
title: {
text: '访问量统计'
},
tooltip: {},
legend: {
data: ['访问量']
},
xAxis: {
data: ['特定商品', '详细页', '区域']
},
yAxis: {},
series: [
{
name: '访问',
type: 'bar',
data: [specificProductVisits, accessCount, regionVisits]
}
]
};
chart1.setOption(option1);
var chart2 = echarts.init(document.getElementById('chart2'));
var option2 = {
title: {
text: '访问量统计'
},
tooltip: {},
legend: {
data: ['访问量', '占比']
},
xAxis: {
data: ['访问量']
},
yAxis: [
{
name: '访问量',
type: 'value'
},
{
name: '占比',
type: 'value',
axisLabel: {
formatter: '{value}%'
}
}
],
series: [
{
name: '访问量',
type: 'bar',
data: [accessCount]
},
{
name: '占比',
type: 'line',
data: [Math.round(accessCount / total * 100)]
}
]
};
chart2.setOption(option2);
var chart3 = echarts.init(document.getElementById('chart3'));
var option3 = {
title: {
text: '访问量统计'
},
tooltip: {},
legend: {
data: ['访问量']
},
series: [
{
name: '访问量',
type: 'pie',
radius: '50%',
data: [
{
value: specificProductVisits,
name: '商品'
},
{
value: accessCount,
name: '详细页'
},
{
value: regionVisits,
name: '区域量'
}
]
}
]
};
chart3.setOption(option3);
}
});
});
});
总结:
本文详细介绍了使用 Spring Boot 应用集成 ClickHouse 进行数据分析的项目。包括表结构定义、数据插入示例、项目文件结构、依赖配置、配置文件、后端和前端代码等。通过提供全面的技术实现和代码示例,展示了如何利用相关技术实现用户访问记录的统计与分析功能。并且,在前端代码中,增加了折柱混合、圆角环形图两种图表样式,使数据展示更加直观和丰富。