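Loki is an open-source project from the Grafana Labs team: a horizontally scalable, highly available, multi-tenant log aggregation system. It is designed to be cost-effective and easy to operate because it does not index the contents of the logs. Instead, it indexes a set of labels for each log stream, and it is tuned for Prometheus and Kubernetes users. The project is inspired by Prometheus, and its official tagline is "Like Prometheus, but for logs."
Project repository: https://github.com/grafana/loki/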
    wget https://github.com/grafana/loki/releases/download/v2.2.1/loki-linux-amd64.zip
    wget https://github.com/grafana/loki/releases/download/v2.2.1/promtail-linux-amd64.zip
    mkdir -pv /opt/app/{promtail,loki}

    # promtail config file
    cat <<EOF > /opt/app/promtail/promtail.yaml
    server:
      http_listen_port: 9080
      grpc_listen_port: 0

    positions:
      filename: /var/log/positions.yaml # This location needs to be writeable by promtail.

    client:
      url: http://localhost:3100/loki/api/v1/push

    scrape_configs:
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: varlogs
              host: yourhost
              __path__: /var/log/*.log
    EOF

    # unpack the archive
    unzip promtail-linux-amd64.zip
    mv promtail-linux-amd64 /opt/app/promtail/promtail

    # systemd service file
    cat <<EOF > /etc/systemd/system/promtail.service
    [Unit]
    Description=promtail server
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStart=/opt/app/promtail/promtail -config.file=/opt/app/promtail/promtail.yaml
    StandardOutput=syslog
    StandardError=syslog
    SyslogIdentifier=promtail

    [Install]
    WantedBy=default.target
    EOF

    systemctl daemon-reload
    systemctl restart promtail
    systemctl status promtail
    mkdir -pv /opt/app/{promtail,loki}

    # loki config file
    cat <<EOF > /opt/app/loki/loki.yaml
    auth_enabled: false

    server:
      http_listen_port: 3100
      grpc_listen_port: 9096

    ingester:
      wal:
        enabled: true
        dir: /opt/app/loki/wal
      lifecycler:
        address: 127.0.0.1
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
        final_sleep: 0s
      chunk_idle_period: 1h       # Any chunk not receiving new logs in this time will be flushed
      max_chunk_age: 1h           # All chunks will be flushed when they hit this age, default is 1h
      chunk_target_size: 1048576  # Loki will attempt to build chunks up to 1.5MB, flushing first if chunk_idle_period or max_chunk_age is reached first
      chunk_retain_period: 30s    # Must be greater than index read cache TTL if using an index cache (Default index read cache TTL is 5m)
      max_transfer_retries: 0     # Chunk transfers disabled

    schema_config:
      configs:
        - from: 2020-10-24
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h

    storage_config:
      boltdb_shipper:
        active_index_directory: /opt/app/loki/boltdb-shipper-active
        cache_location: /opt/app/loki/boltdb-shipper-cache
        cache_ttl: 24h         # Can be increased for faster performance over longer query periods, uses more disk space
        shared_store: filesystem
      filesystem:
        directory: /opt/app/loki/chunks

    compactor:
      working_directory: /opt/app/loki/boltdb-shipper-compactor
      shared_store: filesystem

    limits_config:
      reject_old_samples: true
      reject_old_samples_max_age: 168h

    chunk_store_config:
      max_look_back_period: 0s

    table_manager:
      retention_deletes_enabled: false
      retention_period: 0s

    ruler:
      storage:
        type: local
        local:
          directory: /opt/app/loki/rules
      rule_path: /opt/app/loki/rules-temp
      alertmanager_url: http://localhost:9093
      ring:
        kvstore:
          store: inmemory
      enable_api: true
    EOF

    # unpack the archive
    unzip loki-linux-amd64.zip
    mv loki-linux-amd64 /opt/app/loki/loki

    # systemd service file
    cat <<EOF > /etc/systemd/system/loki.service
    [Unit]
    Description=loki server
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStart=/opt/app/loki/loki -config.file=/opt/app/loki/loki.yaml
    StandardOutput=syslog
    StandardError=syslog
    SyslogIdentifier=loki

    [Install]
    WantedBy=default.target
    EOF

    systemctl daemon-reload
    systemctl restart loki
    systemctl status loki
[Figure: grafana-loki-dashsource]
In the Grafana data source list, select Loki and enter the address of the Loki server:
    server:
      http_listen_port: 9080
      grpc_listen_port: 0

    positions:
      filename: /tmp/positions.yaml

    clients:
      - url: http://loki:3100/loki/api/v1/push

    scrape_configs:
      - job_name: system
        static_configs:
          - targets:
              - localhost
            labels:
              job: varlogs
              __path__: /var/log/*log
Here the job label is varlogs and the file path is /var/log/*log.
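To make the role of the labels concrete, here is a minimal Python sketch of the JSON body that a client sends to the /loki/api/v1/push endpoint configured above. The helper name build_push_payload and the sample log lines are hypothetical, and the sketch only builds the payload without contacting a Loki server. It also assumes that `__path__` only tells Promtail which files to tail and is not sent as a stream label.

```python
import json
import time


def build_push_payload(labels, lines, start_ns=None):
    """Build a request body for POST /loki/api/v1/push (hypothetical helper).

    Each entry is a [timestamp, line] pair; the timestamp is a string of
    nanoseconds since the Unix epoch.
    """
    if start_ns is None:
        start_ns = time.time_ns()
    # Give every line its own nanosecond timestamp so the order is preserved.
    values = [[str(start_ns + i), line] for i, line in enumerate(lines)]
    # One stream is identified by its label set; only the labels are indexed.
    return {"streams": [{"stream": dict(labels), "values": values}]}


if __name__ == "__main__":
    # Assumed sample data: the job label from the config above, two made-up lines.
    payload = build_push_payload({"job": "varlogs"}, ["sample line 1", "sample line 2"])
    print(json.dumps(payload, indent=2))
```

The log lines travel as opaque strings under `values`, while the label set under `stream` is the only part that Loki indexes.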
View logs: {job="message"} |= "kubelet"
Compute QPS: rate({job="message"} |= "kubelet" [1m])
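As a rough illustration of what the rate query computes, the Python sketch below counts the entries that contain "kubelet" inside a one-minute window and divides by the window length in seconds. It is a local emulation under stated assumptions, not Loki's implementation: the function name rate_per_second and the generated sample entries are hypothetical.

```python
def rate_per_second(entries, needle, now, window_seconds=60.0):
    """Emulate rate({...} |= needle [window]) on an in-memory list.

    entries is an iterable of (timestamp_seconds, line) pairs.
    """
    start = now - window_seconds
    matched = sum(
        1
        for ts, line in entries
        if start < ts <= now and needle in line  # inside the window and passes the filter
    )
    return matched / window_seconds


if __name__ == "__main__":
    now = 1_000_000.0
    # Assumed sample data: one kubelet line every 0.5 s, plus unrelated lines.
    entries = [(now - i * 0.5, "kubelet: sync pod") for i in range(120)]
    entries += [(now - i, "sshd: session opened") for i in range(30)]
    print(rate_per_second(entries, "kubelet", now))
```

Lines that do not pass the filter do not contribute to the result, which is why the same stream selector can produce different rates for different filter strings.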
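As noted earlier, the biggest difference between Loki and Elasticsearch (ES) is that Loki indexes only the labels, not the log content. The examples below illustrate this.
Take a simple promtail configuration as an example.
Configuration walkthrough: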
    scrape_configs:
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: message
              __path__: /var/log/messages
    scrape_configs:
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: syslog
              __path__: /var/log/syslog
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: apache
              __path__: /var/log/apache.log
11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"
In Promtail, use a regex stage to extract two labels from each line, action and status_code:
    scrape_configs:
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: syslog
              __path__: /var/log/syslog
      - job_name: system
        pipeline_stages:
        static_configs:
          - targets:
              - localhost
            labels:
              job: apache
              __path__: /var/log/apache.log
      - job_name: system
        pipeline_stages:
          - regex:
              expression: "^(?P<ip>\\S+) (?P<identd>\\S+) (?P<user>\\S+) \\[(?P<timestamp>[\\w:/]+\\s[+\\-]\\d{4})\\] \"(?P<action>\\S+)\\s?(?P<path>\\S+)?\\s?(?P<protocol>\\S+)?\" (?P<status_code>\\d{3}|-) (?P<size>\\d+|-)\\s?\"?(?P<referer>[^\"]*)\"?\\s?\"?(?P<useragent>[^\"]*)?\"?$"
          - labels:
              action:
              status_code:
        static_configs:
          - targets:
              - localhost
            labels:
              job: apache
              env: dev
              __path__: /var/log/apache.log
    11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"
    11.11.11.12 - frank [25/Jan/2000:14:00:02 -0500] "POST /1986.js HTTP/1.1" 200 932 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"
    11.11.11.13 - frank [25/Jan/2000:14:00:03 -0500] "GET /1986.js HTTP/1.1" 400 932 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"
    11.11.11.14 - frank [25/Jan/2000:14:00:04 -0500] "POST /1986.js HTTP/1.1" 400 932 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 GTB6"
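The effect of the regex and labels stages on these four lines can be tried outside Promtail. The Python sketch below applies the same expression (with the YAML escaping removed) and collects the distinct label sets, each of which would correspond to one log stream. It assumes that Python's re module treats this particular pattern the same way as Promtail's Go regexp engine; the helper names extract_labels and distinct_streams are hypothetical.

```python
import re

# The expression from the promtail config above, YAML escapes removed.
APACHE_RE = re.compile(
    r'^(?P<ip>\S+) (?P<identd>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[\w:/]+\s[+\-]\d{4})\] '
    r'"(?P<action>\S+)\s?(?P<path>\S+)?\s?(?P<protocol>\S+)?" '
    r'(?P<status_code>\d{3}|-) (?P<size>\d+|-)\s?'
    r'"?(?P<referer>[^"]*)"?\s?"?(?P<useragent>[^"]*)?"?$'
)

UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.7) "
      "Gecko/20091221 Firefox/3.5.7 GTB6")

LOG_LINES = [
    f'11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932 "-" "{UA}"',
    f'11.11.11.12 - frank [25/Jan/2000:14:00:02 -0500] "POST /1986.js HTTP/1.1" 200 932 "-" "{UA}"',
    f'11.11.11.13 - frank [25/Jan/2000:14:00:03 -0500] "GET /1986.js HTTP/1.1" 400 932 "-" "{UA}"',
    f'11.11.11.14 - frank [25/Jan/2000:14:00:04 -0500] "POST /1986.js HTTP/1.1" 400 932 "-" "{UA}"',
]


def extract_labels(line):
    """Return the label set for one line, or None if the line does not match."""
    m = APACHE_RE.match(line)
    if m is None:
        return None
    # Static labels from the config plus the two promoted capture groups.
    return {
        "job": "apache",
        "env": "dev",
        "action": m.group("action"),
        "status_code": m.group("status_code"),
    }


def distinct_streams(lines):
    """Collect the distinct label sets; each one stands for one stream."""
    streams = set()
    for line in lines:
        labels = extract_labels(line)
        if labels is not None:
            streams.add(tuple(sorted(labels.items())))
    return streams


if __name__ == "__main__":
    for stream in sorted(distinct_streams(LOG_LINES)):
        print(dict(stream))
```

Two actions combined with two status codes give four distinct label sets. The other capture groups, such as ip, are extracted but not promoted to labels, so they do not create additional streams.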
Take the ip field mentioned above as an example. Instead of turning it into a label, query for it with a filter expression:
{job="apache"} |= "11.11.11.11"
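The `|=` operator keeps only the lines that contain the given string. As a minimal sketch of that behaviour, the Python snippet below applies a substring test to shortened copies of the sample log lines. The function name line_filter_contains is hypothetical, and the snippet stands in for the filtering step only, not for how Loki selects and scans chunks.

```python
def line_filter_contains(lines, needle):
    """Emulate the LogQL |= filter: keep lines that contain needle."""
    return [line for line in lines if needle in line]


# Shortened copies of the sample lines (referer and user agent omitted).
APACHE_LINES = [
    '11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932',
    '11.11.11.12 - frank [25/Jan/2000:14:00:02 -0500] "POST /1986.js HTTP/1.1" 200 932',
    '11.11.11.13 - frank [25/Jan/2000:14:00:03 -0500] "GET /1986.js HTTP/1.1" 400 932',
    '11.11.11.14 - frank [25/Jan/2000:14:00:04 -0500] "POST /1986.js HTTP/1.1" 400 932',
]

if __name__ == "__main__":
    for line in line_filter_contains(APACHE_LINES, "11.11.11.11"):
        print(line)
```

Because the match runs against the raw line, the ip never has to be a label, which keeps the number of streams small.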