[Elastic stack]

Akagi201

2/17/2016 - 10:17 AM

elk.md

ELK Books

ELKstack 中文指南 (ELK Stack Chinese Guide)

Kibana devops

Cleanup: DELETE /filebeat-*

Elastic

Elasticsearch cluster -> indices -> types -> documents -> fields

Index

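Index (noun): comparable to a database. An index is really just a logical namespace that points to one or more physical shards.
Index (verb): to store a document in an index (add it to the database).
Inverted index: comparable to an index in a traditional database.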
refs: https://www.elastic.co/guide/en/elasticsearch/guide/current/_indexing_employee_documents.html
In Elasticsearch, all data in every field is indexed by default. That is, every field has a dedicated inverted index for fast retrieval.

Search

query string search: simple but limited; handy for ad hoc queries from the command line.
query DSL: a rich, flexible query language, with the query sent as a JSON request body.
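To make the difference concrete, here is a minimal sketch that builds the same search both ways without sending anything. The host `localhost:9200`, the index name `megacorp`, and the field `last_name` are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode

BASE = "http://localhost:9200"  # assumed local node
INDEX = "megacorp"              # hypothetical index name


def query_string_search(field, value):
    """Build the URL for a lite (query-string) search."""
    return f"{BASE}/{INDEX}/_search?" + urlencode({"q": f"{field}:{value}"})


def dsl_search(field, value):
    """Build the URL and JSON body for the equivalent query DSL search."""
    body = {"query": {"match": {field: value}}}
    return f"{BASE}/{INDEX}/_search", json.dumps(body)


qs_url = query_string_search("last_name", "Smith")
dsl_url, dsl_body = dsl_search("last_name", "Smith")
print(qs_url)
print(dsl_url, dsl_body)
```

The query-string form packs everything into the URL, which is why it suits one-off command-line use; the DSL form keeps the query in a structured body that can grow to hold filters and nested clauses.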

Cluster

Scale

vertical scale/scaling up: bigger servers.
horizontal scale/scaling out: more servers.

Shards

A shard is a low-level worker unit that holds just a slice of all the data in the index.
A shard is a single instance of Lucene, and is a complete search engine in its own right.
Our documents are stored and indexed in shards, but our applications don’t talk to them directly. Instead, they talk to an index.
A shard can be either a primary shard or a replica shard. Each document in your index belongs to a single primary shard, so the number of primary shards that you have determines the maximum amount of data that your index can hold.
A primary shard can technically contain up to Integer.MAX_VALUE - 128 documents.
A replica shard is just a copy of a primary shard.
The number of primary shards in an index is fixed at the time that an index is created, but the number of replica shards can be changed at any time.
By default, indices are assigned five primary shards (this was the default before Elasticsearch 7.0; later versions default to one).
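The fixed primary shard count follows from how a document is routed to a shard: as described in the Elasticsearch guide, the shard is `hash(routing) % number_of_primary_shards`, where the routing value defaults to the document ID. The sketch below illustrates that rule. `zlib.crc32` is only a stand-in for the hash Elasticsearch really uses, and the document IDs are made up.

```python
import zlib

# Java's Integer.MAX_VALUE is 2**31 - 1, so the per-shard document limit is:
MAX_DOCS_PER_SHARD = (2**31 - 1) - 128


def pick_shard(routing, number_of_primary_shards):
    """Map a routing value (by default the document ID) to a shard number."""
    # crc32 is a stand-in hash; Elasticsearch uses its own hash function.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["1", "2", "3", "4", "5", "6"]  # hypothetical document IDs
with_five = {d: pick_shard(d, 5) for d in doc_ids}
with_six = {d: pick_shard(d, 6) for d in doc_ids}

# Documents whose shard would change if the shard count changed.
moved = [d for d in doc_ids if with_five[d] != with_six[d]]
print(MAX_DOCS_PER_SHARD, with_five, with_six, moved)
```

Any document listed in `moved` would be looked up on the wrong shard after a resize, which is why the primary shard count cannot simply be changed on an existing index. Replicas are copies, so they do not affect the formula.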

Document

document == object
In Elasticsearch, the term document has a specific meaning. It refers to the top-level, or root object that is serialized into JSON and stored in Elasticsearch under a unique ID.
Documents live in an index.

Document Metadata

_index: Where the document lives.
_type: The class of object that the document represents.
_id: The unique identifier for the document.
A document’s _index, _type, and _id uniquely identify the document.
Autogenerated IDs are 20 characters long, URL-safe, Base64-encoded GUID strings.
These GUIDs are generated from a modified FlakeID scheme which allows multiple nodes to be generating unique IDs in parallel with essentially zero chance of collision.
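As a rough illustration of the metadata above, the following sketch keeps documents in a plain dictionary keyed by `(_index, _type, _id)` and generates a 20-character URL-safe Base64 ID when none is given. The random bytes are an assumption standing in for the FlakeID-style scheme, and the `website`/`blog` names are hypothetical.

```python
import base64
import os

store = {}  # in-memory stand-in for a cluster


def auto_id():
    """Return a 20-character, URL-safe, Base64-encoded identifier."""
    # 15 bytes = 120 bits = exactly 20 Base64 characters, so no '=' padding.
    # Random bytes are a stand-in for Elasticsearch's Flake-style IDs.
    return base64.urlsafe_b64encode(os.urandom(15)).decode("ascii")


def index_document(index, doc_type, source, doc_id=None):
    """Store a document and return the metadata that identifies it."""
    doc_id = doc_id or auto_id()
    store[(index, doc_type, doc_id)] = source
    return {"_index": index, "_type": doc_type, "_id": doc_id}


meta = index_document("website", "blog", {"title": "My first blog entry"})
explicit = index_document("website", "blog", {"title": "Second"}, doc_id="123")
print(meta, explicit)
```

Passing `doc_id` mirrors supplying your own ID; omitting it mirrors letting Elasticsearch generate one.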

Searching

Every field in a document is indexed and can be queried.

Search API

lite query-string
query DSL

Inverted index

Elasticsearch uses a structure called an inverted index, which is designed to allow very fast full-text searches.
An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears.
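To make a field full-text searchable, split its content into separate words, called terms or tokens. Build a sorted list of the unique terms, then record which documents each term appears in.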
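The construction described above can be sketched in a few lines. This is a toy version that splits on whitespace only and does no normalization; the two sample sentences are illustrative.

```python
from collections import defaultdict

docs = [
    "The quick brown fox jumped over the lazy dog",
    "Quick brown foxes leap over lazy dogs in summer",
]


def build_inverted_index(documents):
    """Map each unique term to the sorted list of document numbers containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for term in text.split():
            index[term].add(doc_id)
    # Sort the terms and the document lists for a stable, readable result.
    return {term: sorted(ids) for term, ids in sorted(index.items())}


inverted = build_inverted_index(docs)
for term, doc_ids in inverted.items():
    print(f"{term:8} {doc_ids}")
```

Note that `quick` and `Quick` end up as separate terms here, as do `fox` and `foxes`. Closing that gap is the job of analysis.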

Analysis and Analyzer

Analysis is the process of tokenizing a block of text into terms and then normalizing those terms.
An analyzer is just a wrapper around three stages, run in order: character filters, a tokenizer, and token filters.
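A minimal sketch of that three-stage pipeline follows. The specific filters (HTML stripping, lowercasing, a small stopword list) are assumptions chosen for illustration and are not Elasticsearch's built-in implementations.

```python
import re

STOPWORDS = {"a", "an", "and", "the", "of", "in"}  # illustrative list


def strip_html(text):
    """Character filter: tidy the raw string before tokenization."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Tokenizer: split the string into individual tokens."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: normalize case."""
    return [t.lower() for t in tokens]


def remove_stopwords(tokens):
    """Token filter: drop very common words."""
    return [t for t in tokens if t not in STOPWORDS]


def analyze(text):
    """Run character filters, then the tokenizer, then token filters."""
    for char_filter in (strip_html,):
        text = char_filter(text)
    tokens = tokenize(text)
    for token_filter in (lowercase, remove_stopwords):
        tokens = token_filter(tokens)
    return tokens


terms = analyze("<p>The QUICK brown foxes</p>")
print(terms)
```

The order matters: character filters see the whole string, the tokenizer produces the token stream, and token filters can only change, add, or remove tokens in that stream.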

full-text field vs exact-value field

Fields of type string are, by default, considered to contain full text.

Mapping

Each document in an index has a type. Every type has its own mapping, or schema definition.
A mapping defines the fields within a type, the datatype for each field, and how the field should be handled by Elasticsearch.
A mapping is also used to configure metadata associated with the type.
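For illustration, a mapping can be written out as the JSON body sent when creating an index. The sketch below uses the `string` / `not_analyzed` syntax from the Elasticsearch versions these notes date from (later versions replaced it with `text` and `keyword`). The type name `tweet` and its fields are hypothetical.

```python
import json

mapping = {
    "mappings": {
        "tweet": {  # hypothetical type name
            "properties": {
                # Full-text field, analyzed with the english analyzer.
                "tweet": {"type": "string", "analyzer": "english"},
                "date": {"type": "date"},
                "user_id": {"type": "long"},
                # Exact-value field: indexed as a single unanalyzed term.
                "tag": {"type": "string", "index": "not_analyzed"},
            }
        }
    }
}


def full_text_fields(mapping, doc_type):
    """List string fields that are analyzed, i.e. treated as full text."""
    props = mapping["mappings"][doc_type]["properties"]
    return sorted(
        name
        for name, spec in props.items()
        if spec["type"] == "string" and spec.get("index") != "not_analyzed"
    )


body = json.dumps(mapping, indent=2)  # request body for index creation
print(body)
print(full_text_fields(mapping, "tweet"))
```

The helper shows the full-text versus exact-value split in practice: a `string` field is full text unless the mapping marks it `not_analyzed`.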
