Install Elasticsearch

[vagrant@localhost ~]$ sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[vagrant@localhost ~]$ cd /etc/yum.repos.d/
[vagrant@localhost yum.repos.d]$ sudo touch elasticsearch.repo
[vagrant@localhost yum.repos.d]$ sudo vi elasticsearch.repo

[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

[vagrant@localhost yum.repos.d]$ sudo yum install elasticsearch

### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using chkconfig
sudo chkconfig --add elasticsearch
### You can start elasticsearch service by executing
sudo service elasticsearch start
Created elasticsearch keystore in /etc/elasticsearch
Verifying : elasticsearch-7.0.0-1.x86_64 1/1

Installed:
elasticsearch.x86_64 0:7.0.0-1

Complete!

[vagrant@localhost yum.repos.d]$ sudo chkconfig --add elasticsearch
[vagrant@localhost yum.repos.d]$ sudo service elasticsearch start
Starting elasticsearch: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5330000, 986513408, 0) failed; error='Not enough space' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 986513408 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /var/log/elasticsearch/hs_err_pid5271.log
[FAILED]
Whaaaaaat.
I've been hitting these out-of-memory errors way too often lately...

[vagrant@localhost ~]$ sudo cat /etc/elasticsearch/jvm.options
## JVM configuration

################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################

## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly

## G1GC Configuration
# NOTE: G1GC is only supported on JDK version 10 or later.
# To use G1GC uncomment the lines below.
# 10-:-XX:-UseConcMarkSweepGC
# 10-:-XX:-UseCMSInitiatingOccupancyOnly
# 10-:-XX:+UseG1GC
# 10-:-XX:InitiatingHeapOccupancyPercent=75

## DNS cache policy
# cache ttl in seconds for positive DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.ttl; set to -1 to cache forever
-Des.networkaddress.cache.ttl=60
# cache ttl in seconds for negative DNS lookups noting that this overrides the
# JDK security property networkaddress.cache.negative.ttl; set to -1 to cache
# forever
-Des.networkaddress.cache.negative.ttl=10

## optimizations

# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# explicitly set the stack size
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow

# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true

-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=/var/lib/elasticsearch

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch needs to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locales
9-:-Djava.locale.providers=COMPAT

The -Xms1g and -Xmx1g lines, that's the spot.
That's what I want to change for this VM's memory.

What do you mean the default is 1 GB...

For now, change it to 500m.
[vagrant@localhost ~]$ sudo vi /etc/elasticsearch/jvm.options
[vagrant@localhost ~]$ sudo service elasticsearch start
Starting elasticsearch: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[ OK ]
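By the way, instead of editing in vi, the same heap change can be scripted with sed. This is a sketch using a stand-in file so it is self-contained; on the real VM the target would be /etc/elasticsearch/jvm.options with sudo.

```shell
# Stand-in for /etc/elasticsearch/jvm.options (only the heap flags matter here).
printf -- '-Xms1g\n-Xmx1g\n' > jvm.options
# Set min and max heap to the same, smaller value, as the file's own comments advise.
sed -i -e 's/^-Xms1g/-Xms500m/' -e 's/^-Xmx1g/-Xmx500m/' jvm.options
grep '^-Xm' jvm.options
# -> -Xms500m
#    -Xmx500m
```

On the real box this would be `sudo sed -i ... /etc/elasticsearch/jvm.options` followed by `sudo service elasticsearch restart`.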

What is Elasticsearch?

A highly scalable full-text search engine developed by Elastic
Enables various kinds of analysis, such as real-time data analysis, log analysis, and full-text search
Often used with the log collectors Logstash and Fluentd, and the visualization tool Kibana
Searching across multiple databases is a common use case

Elasticsearch
– Strong search performance and scalability
– Indexes are often separated by time intervals
– Throw away indexes that are no longer needed
– Store data for analysis and search

RDB : Elasticsearch
DB -> index
table -> mapping type
column -> field
record -> document

Mapping: settings for field types and how fields are analyzed
Analysis: how field values are processed (language processing, normalization, etc.)
Query DSL: search conditions assembled in JSON format
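As a quick taste of the Query DSL, here is a minimal full-text match query. A sketch only: the index name "blog" and a node listening on localhost:9200 are assumptions, not something set up in this article.

```shell
# Assemble a Query DSL body: a "match" query on the message field.
cat > query.json <<'EOF'
{
  "query": {
    "match": { "message": "logstash" }
  }
}
EOF
# Against a running node it would be sent like this (hypothetical "blog" index):
# curl -s -H 'Content-Type: application/json' \
#      'http://localhost:9200/blog/_search' -d @query.json
cat query.json
```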

Sample

{code}
echo -e 'logstash\nfluentd\nflume' | bin/logstash -e 'input { stdin {} } output { stdout {codec => rubydebug}}'
{
"message" => "logstash",
"@version" => "1",
"@timestamp" => "2015-01-17T16:18:46.175Z",
"host" => "hope"
}
{
"message" => "fluentd",
"@version" => "1",
"@timestamp" => "2015-01-17T16:18:46.175Z",
"host" => "hope"
}
{
"message" => "flume",
"@version" => "1",
"@timestamp" => "2015-01-17T16:18:46.175Z",
"host" => "hope"
}
{/code}

{code}
input {
stdin{}
}
filter {
mutate {
replace => {message => "%{message} こんにちは!"}
}
}
output {
stdout {
codec => rubydebug
}
}
{/code}

Edit the /etc/yum.repos.d directory

yum is an integrated package-management system.

It manages RPM packages and is more convenient and easier to use than the raw rpm command: yum consolidates RPM package information and resolves dependencies automatically. It occupies the same position as APT does on Debian. With yum you can update the distribution's packages, search for packages, remove packages, display package information, and so on.

Each repository is described in its own file placed under "/etc/yum.repos.d".

[vagrant@localhost ~]$ cd /etc/yum.repos.d
[vagrant@localhost yum.repos.d]$ ls
CentOS-Base.repo mariadb.repo remi-php54.repo
CentOS-Debuginfo.repo mysql-community-source.repo remi-php70.repo
CentOS-Media.repo mysql-community.repo remi-php71.repo
CentOS-Vault.repo nginx.repo remi-php72.repo
CentOS-fasttrack.repo nodesource-el.repo remi-php73.repo
epel-testing.repo remi-glpi91.repo remi-safe.repo
epel.repo remi-glpi92.repo remi.repo
jenkins.repo remi-glpi93.repo
[vagrant@localhost yum.repos.d]$ sudo touch logstash.repo

Write it the same way as the official instructions.
[vagrant@localhost yum.repos.d]$ sudo vi logstash.repo
[vagrant@localhost yum.repos.d]$ cat logstash.repo
[logstash-5.x]
name=Elastic repository for 5.x packages
baseurl=https://artifacts.elastic.co/packages/5.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

[vagrant@localhost ~]$ sudo yum install logstash
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
* base: ftp.nara.wide.ad.jp
* extras: ftp.nara.wide.ad.jp
* remi-safe: ftp.riken.jp
* updates: ftp.nara.wide.ad.jp
https://artifacts.elastic.co/packages/5.x/yum/repodata/repomd.xml: [Errno 14] PYCURL ERROR 6 - "Couldn't resolve host 'artifacts.elastic.co'"
Trying other mirror.
Error: Cannot retrieve repository metadata (repomd.xml) for repository: logstash-5.x. Please verify its path and try again

Whaaaaat.
It says artifacts.elastic.co can't be resolved...

Check the official Red Hat documentation:
>Make sure that the Satellite or proxy server has a fully qualified domain name (FQDN) configured, and that the CommonName (CN) of the SSL certificate used by Apache is set to the FQDN.

https://access.redhat.com/ja/solutions/1307833

[vagrant@localhost yum.repos.d]$ grep CN /etc/httpd/conf/ssl.crt/server.crt
grep: /etc/httpd/conf/ssl.crt/server.crt: No such file or directory
[vagrant@localhost yum.repos.d]$ grep ^SSLCert /etc/httpd/conf.d/ssl.conf
SSLCertificateFile /etc/pki/tls/certs/localhost.crt
SSLCertificateKeyFile /etc/pki/tls/private/localhost.key

Whaaat, I don't get it.
No good. For now, let's move on to Elasticsearch.

After a bowl of Jangara ramen...

Oh, it's written right there in the docs.
installing-logstash.html
Add the following in your /etc/yum.repos.d/ directory in a file with a .repo suffix, for example logstash.repo

[logstash-5.x]
name=Elastic repository for 5.x packages
baseurl=https://artifacts.elastic.co/packages/5.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md

Wait, what exactly is /etc/yum.repos.d/ again?

The extension is .repo. There are files for CentOS, epel, mariadb, mysql-community, nginx, and so on.

As a test, let's take a look at jenkins.repo.

[jenkins]
name=Jenkins
baseurl=http://pkg.jenkins.io/redhat
gpgcheck=1

I see: it specifies a baseurl, and packages get installed from there.
My understanding of yum install and RPM has deepened a bit ^^

Set up Logstash

Apparently Logstash requires Java 8 to be installed.

[vagrant@localhost ~]$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
OK

[vagrant@localhost ~]$ sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
[vagrant@localhost ~]$ sudo yum install logstash
Loaded plugins: fastestmirror
Setting up Install Process
Determining fastest mirrors
* base: ftp.nara.wide.ad.jp
* extras: ftp.nara.wide.ad.jp
* remi-safe: ftp.riken.jp
* updates: ftp.nara.wide.ad.jp
base | 3.7 kB 00:00
extras | 3.4 kB 00:00
jenkins | 2.9 kB 00:00
jenkins/primary_db | 127 kB 00:01
mariadb | 2.9 kB 00:00
mysql-connectors-community | 2.5 kB 00:00
mysql-connectors-community/primary_db | 36 kB 00:00
mysql-tools-community | 2.5 kB 00:00
mysql-tools-community/primary_db | 49 kB 00:00
mysql56-community | 2.5 kB 00:00
mysql56-community/primary_db | 261 kB 00:00
nginx | 2.9 kB 00:00
nginx/primary_db | 49 kB 00:00
nodesource | 2.5 kB 00:00
remi-safe | 3.0 kB 00:00
remi-safe/primary_db | 1.2 MB 00:00
updates | 3.4 kB 00:00
updates/primary_db | 3.7 MB 00:04
No package logstash available.
Error: Nothing to do

Whaaaaat.
Whyyyyyy.
I'm done. For a change of pace, off to the antenna shops in Nihonbashi.

Learning Logstash

Input
While data is scattered across many systems in many formats, Logstash offers a variety of input plugins that capture events from different sources simultaneously. You can easily, continuously, and smoothly ingest data from logs, metrics, web applications, data stores, and various cloud services.

Filter: this seems like the important part?
Data analysis and transformation
As data travels from source to store, Logstash filters parse each event, identifying and structuring the fields. They further transform the data into a common format best suited to analyses that deliver business value.

– Derive structure from unstructured data with grok filters
– Obtain geographic information from IP addresses
– Anonymize personal information and completely exclude confidential fields
– Easily process data from any source, format, or schema
Oh, this looks pretty powerful...
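For example, the grok filter mentioned above can pull structured fields out of a plain syslog-style line. A sketch, not from the article: SYSLOGTIMESTAMP, HOSTNAME, DATA, and GREEDYDATA are stock grok patterns, and the field names are made up.

```
filter {
  grok {
    # "Dec 23 14:30:01 myhost sshd: Failed password for root" becomes
    # separate timestamp / host / program / msg fields.
    match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{DATA:program}: %{GREEDYDATA:msg}" }
  }
}
```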

Output
Choose a storage location, and transfer data.
The ideal destination is Elasticsearch, but other destinations are of course available as well, without limiting the possibilities for search and analysis.

Logstash lets you specify many different output destinations and transfer data freely. This high degree of flexibility lets it feed many downstream systems.

Yeah, this makes me want to play with the samples. OK OK!

NetFlow

NetFlow is a technology developed by Cisco Systems for monitoring and analyzing network traffic information. Implemented primarily in Cisco routers and switches, it has become the industry standard for flow measurement and is now supported by many vendors' network devices. Flow information such as NetFlow is analyzed to identify operational or security issues and to strengthen external and internal network security.

A flow, in network traffic analysis, is a group of packets flowing over the network that share common attributes: for example, source/destination IP address, source/destination port number, and protocol number.
Packets that share these attributes are considered part of the same flow. As an easy-to-understand example, when a user uploads a file to a server, that operation is treated as one flow (in packet terms, a cluster of multiple packets with common attributes). By analyzing this flow information, traffic can be monitored and analyzed on a per-user or per-application basis.
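The idea of grouping packets by a common 5-tuple can be sketched in a few lines of shell. The packet records below are hypothetical, invented for illustration:

```shell
# Hypothetical packet records: src_ip dst_ip src_port dst_port proto bytes
cat > packets.txt <<'EOF'
10.0.0.1 10.0.0.9 51000 80 tcp 1200
10.0.0.1 10.0.0.9 51000 80 tcp 800
10.0.0.2 10.0.0.9 51001 443 tcp 600
EOF
# Packets sharing the same 5-tuple belong to the same flow; sum bytes per flow.
awk '{ key = $1" "$2" "$3" "$4" "$5; bytes[key] += $6 }
     END { for (k in bytes) print k, bytes[k] }' packets.txt | sort
# -> 10.0.0.1 10.0.0.9 51000 80 tcp 2000
#    10.0.0.2 10.0.0.9 51001 443 tcp 600
```

The first two packets collapse into a single flow; that is exactly the aggregation a NetFlow exporter performs on the wire.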

JDBC

What is JDBC? I remember it being the driver for connecting to MySQL...

JDBC is, in a word, a standard Java API for accessing relational databases (and almost any tabular data). JDBC is said to be short for "Java Database Connectivity", although that expansion does not actually appear in the JDBC specification.

Critical data in an enterprise is often stored in a relational database. As such, JDBC is one of the key APIs underlying Java-based enterprise applications.

Portable database applications can be built by combining JDBC drivers, which absorb the differences between databases, with the JDBC API, a standard API that does not depend on any specific vendor. Being independent not only of the execution platform but also of the connected database realizes WORA (Write Once, Run Anywhere), one of Java's hallmark features, at an even higher level.

JDBC can be used from various Java components such as:
– Regular Java classes and JavaBeans
– Java applications that run on the client
– Java applets that run in a web client (web browser)
– Servlets and JSPs that run in a web container (J2EE server)
– Session Beans and Entity Beans that run in an EJB container (J2EE server)
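In the Logstash world, JDBC shows up through the jdbc input plugin, which polls a database and turns each row into an event. A sketch only: the driver path, connection string, credentials, and SQL statement are all placeholder assumptions.

```
input {
  jdbc {
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    statement => "SELECT * FROM articles"
  }
}
output {
  stdout { codec => rubydebug }
}
```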

Aaaaah, this has exposed the problem that I would have to build an application in Java. No good.

ELK (Elasticsearch, Logstash, Kibana)

What is ELK? Kaori Mochida? No, that's ELT 😭
ELK is the initials of Elasticsearch, Logstash, and Kibana.

Huh, Elasticsearch is in the (AWS) console, but Logstash isn't. What's going on??

Ah, Logstash isn't an AWS service; it's one of the products from the company Elastic. Hmm, this feels like an area close to machine learning...

This diagram shows the relationships well.
https://www.elastic.co/jp/products/logstash

I get it now, but wait a second. Apache I understand, but what are JDBC and NetFlow?