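1. terms aggregation (group by + count)
Use case
Get the list of values of a field in your data, for example the list of clientip values in an nginx access log. With the built-in ordering, the top 10 buckets are the site's TOP 10 visitors.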
query DSL
{
"size": 0,
"aggs": { [1]
"topIP": { [2]
"terms": { [3]
"field": "clientip",
"size": 10,
"order": {
"_count": "desc"
}
}
}
}
}
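-1- Aggregations are placed under the top-level aggs parameter (the full form, aggregations, works just as well)
-2- Next, give the aggregation a name of your choice, in this example: topIP
-3- Finally, define the type of the single bucket: terms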
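The three numbered pieces above can also be assembled programmatically. The following is a minimal Python sketch rather than part of the original query: the helper name build_terms_query is hypothetical, and the resulting body would still have to be sent to a _search endpoint with whatever client you use.

```python
import json


def build_terms_query(agg_name, field, size=10):
    """Build a search body holding a single terms aggregation."""
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {  # -1- top-level aggregations container
            agg_name: {  # -2- name chosen by the caller
                "terms": {  # -3- bucket type
                    "field": field,
                    "size": size,
                    "order": {"_count": "desc"},
                }
            }
        },
    }


body = build_terms_query("topIP", "clientip")
print(json.dumps(body, indent=2))
```

Serializing the dict with json.dumps yields the same structure as the query above, without the [1] [2] [3] callout markers.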
response
"aggregations": {
"topIP": {
"buckets": [
{
"key": "123.59.215.53",
"doc_count": 961868
},
{
"key": "115.231.24.219",
"doc_count": 804120
},
{
"key": "58.215.139.89",
"doc_count": 439965
},
……
]
}
}
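To turn the buckets into a ranked visitor list, the response can be reduced to (key, doc_count) pairs. This is a small Python sketch under the assumption that the response has already been parsed into a dict; the function name top_buckets is hypothetical, and the sample holds only the three buckets shown above.

```python
def top_buckets(response, agg_name):
    """Return (key, doc_count) pairs from a terms aggregation response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]


# Sample copied from the first three buckets of the response above.
sample = {
    "aggregations": {
        "topIP": {
            "buckets": [
                {"key": "123.59.215.53", "doc_count": 961868},
                {"key": "115.231.24.219", "doc_count": 804120},
                {"key": "58.215.139.89", "doc_count": 439965},
            ]
        }
    }
}

top = top_buckets(sample, "topIP")
for ip, count in top:
    print(f"{ip}\t{count}")
```

Because the query orders by _count descending, the list is already sorted and needs no further sorting on the client side.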
Visualization

Pie chart

Bar chart
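2. date_histogram aggregation
Use case
Build metric analyses along the time dimension, for example:
- How many requests did the site receive in each hour today?
- What was the site's average response time in each hour today?
query DSL
1 How many requests did the site receive in each hour today?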
{
"size": 0,
"aggs": {
"page_view": {
"date_histogram": {
"field": "@timestamp",
"interval": "1h",
"time_zone": "Asia/Shanghai",
"format": "yyyy-MM-dd HH:mm",
"min_doc_count": 1,
"extended_bounds": {
"min": 1493827200000,
"max": 1493913599999
}
}
}
}
}
2 What were the site's request count and average response time in each hour today?
{
"size": 0,
"aggs": {
"page_view": {
"date_histogram": {
"field": "@timestamp",
"interval": "1h",
"time_zone": "Asia/Shanghai",
"format": "yyyy-MM-dd HH:mm",
"min_doc_count": 1,
"extended_bounds": {
"min": 1493827200000,
"max": 1493913599999
}
},
"aggs": {
"avg_resp_time": {
"avg": {
"field": "upstream_response_time"
}
}
}
}
}
}
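The extended_bounds values in both queries are epoch milliseconds for the first and last millisecond of the day in the Asia/Shanghai time zone. As a hedged sketch of how such bounds might be computed, the Python below models Asia/Shanghai as a fixed UTC+8 offset (an assumption that holds today because the zone has no daylight saving time); the function name day_bounds_ms is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Shanghai is modelled as a fixed UTC+8 offset.
SHANGHAI = timezone(timedelta(hours=8))


def day_bounds_ms(year, month, day, tz=SHANGHAI):
    """Return (min, max) epoch milliseconds covering one local calendar day."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)
    start_ms = int(start.timestamp()) * 1000
    end_ms = int(end.timestamp()) * 1000 - 1  # last millisecond of the day
    return start_ms, end_ms


bounds = day_bounds_ms(2017, 5, 4)
print(bounds)  # (1493827200000, 1493913599999)
```

For 2017-05-04 this reproduces the min and max used in the queries above.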
response
"aggregations": {
"page_view": {
"buckets": [
{
"avg_resp_time": {
"value": 0.008664544491125232
},
"key_as_string": "2017-05-04 09:00",
"key": 1493859600000,
"doc_count": 321402
},
{
"avg_resp_time": {
"value": 0.015245752238360864
},
"key_as_string": "2017-05-04 10:00",
"key": 1493863200000,
"doc_count": 456973
},
{
"avg_resp_time": {
"value": 0.01839558196852533
},
"key_as_string": "2017-05-04 11:00",
"key": 1493866800000,
"doc_count": 754249
},
{
"avg_resp_time": {
"value": 0.11747828058831987
},
"key_as_string": "2017-05-04 12:00",
"key": 1493870400000,
"doc_count": 530589
},
{
"key_as_string": "2017-05-04 13:00",
"key": 1493874000000,
"doc_count": 64519
}
]
}
}
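For charting, each bucket can be flattened into one row per hour. The sketch below is illustrative Python, assuming the response has been parsed into a dict; the function name hourly_rows is hypothetical. It tolerates buckets that carry no avg_resp_time entry, as in the last bucket of the response above.

```python
def hourly_rows(response, agg_name="page_view", metric="avg_resp_time"):
    """Flatten date_histogram buckets into (hour, doc_count, metric) rows."""
    rows = []
    for b in response["aggregations"][agg_name]["buckets"]:
        # The metric is None when the bucket has no sub-aggregation entry.
        value = b.get(metric, {}).get("value")
        rows.append((b["key_as_string"], b["doc_count"], value))
    return rows


# Sample copied from the last two buckets of the response above.
sample = {
    "aggregations": {
        "page_view": {
            "buckets": [
                {
                    "avg_resp_time": {"value": 0.11747828058831987},
                    "key_as_string": "2017-05-04 12:00",
                    "key": 1493870400000,
                    "doc_count": 530589,
                },
                {
                    "key_as_string": "2017-05-04 13:00",
                    "key": 1493874000000,
                    "doc_count": 64519,
                },
            ]
        }
    }
}

rows = hourly_rows(sample)
for hour, views, avg in rows:
    print(hour, views, avg)
```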
Visualization


3. histogram aggregation
Use case
For numeric metrics, quickly draw a bar chart by choosing an interval size.
query DSL
{
"size": 0,
"aggs": {
"page_view": {
"histogram": {
"field": "bytes",
"interval": 1024
}
}
}
}
response
"aggregations": {
"page_view": {
"buckets": [
{
"key": 0,
"doc_count": 199002
},
{
"key": 1024,
"doc_count": 296
},
{
"key": 2048,
"doc_count": 1754
},
{
"key": 3072,
"doc_count": 13
},
{
"key": 4096,
"doc_count": 15
},
{
"key": 5120,
"doc_count": 131
},
……
]
}
}
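Each bucket key is the lower edge of its interval: a value is rounded down to the nearest multiple of the interval, so with an interval of 1024 every bytes value from 0 to 1023 lands in bucket 0. The Python below is a client-side sketch of that rounding, not how Elasticsearch is implemented; the function names and the byte sizes are made up for illustration.

```python
import math
from collections import Counter


def bucket_key(value, interval, offset=0):
    """Round a value down to the lower edge of its histogram bucket."""
    return math.floor((value - offset) / interval) * interval + offset


def histogram(values, interval):
    """Count values per bucket and return (key, doc_count) pairs sorted by key."""
    counts = Counter(bucket_key(v, interval) for v in values)
    return sorted(counts.items())


# Made-up byte sizes, for illustration only.
sizes = [0, 512, 1023, 1024, 2500, 5119, 5120]
result = histogram(sizes, 1024)
print(result)  # [(0, 3), (1024, 1), (2048, 1), (4096, 1), (5120, 1)]
```

This sketch omits empty buckets (3072 here), whereas Elasticsearch may also return them depending on min_doc_count.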
Visualization

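4. cardinality aggregation (unique count)
Use case
Count values after deduplication. It returns the cardinality of a field, that is, the number of distinct or unique values of that field. The SQL equivalent is:
SELECT COUNT(DISTINCT color) FROM cars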
query DSL
{
"size": 0,
"aggs": {
"2": {
"date_histogram": {
"field": "@timestamp",
"interval": "30s",
"time_zone": "Asia/Shanghai",
"min_doc_count": 1,
"extended_bounds": {
"min": 1493881370817,
"max": 1493882270817
}
},
"aggs": {
"1": {
"cardinality": {
"field": "uri_path"
}
}
}
}
}
}
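The query above nests a cardinality aggregation inside 30-second date_histogram buckets, so it yields the number of distinct uri_path values per 30-second window. The Python below is a rough client-side equivalent that counts exactly with sets, whereas the cardinality aggregation in Elasticsearch returns an approximate count. The function name unique_per_window and the events are hypothetical.

```python
from collections import defaultdict


def unique_per_window(events, window_ms=30_000):
    """Map each window start (epoch ms) to its number of distinct uri_path values.

    events is an iterable of (epoch_ms, uri_path) pairs.
    """
    seen = defaultdict(set)
    for ts_ms, uri_path in events:
        window_start = ts_ms - ts_ms % window_ms  # round down to the window edge
        seen[window_start].add(uri_path)
    return {start: len(paths) for start, paths in sorted(seen.items())}


# Made-up events, for illustration only.
events = [
    (1_000, "/index.html"),
    (2_000, "/index.html"),
    (29_999, "/api/list"),
    (45_000, "/index.html"),
]

counts = unique_per_window(events)
print(counts)  # {0: 2, 30000: 1}
```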
Visualization
