Compiler – Page 4 – ソフトウェアエンジニアの技術ブログ：Software engineer tech blog

ダイナミックリンクとスタティックリンク

ビルド: コンパイルとリンクを実行して、ソースコードから実行可能な形式のファイルを作る
コンパイルにはコンパイラ、リンクにはリンカというツールが使われる
リンカへの入力は、オブジェクトファイルとライブラリ

C言語の標準ライブラリはlibc
ライブラリファイルは、/lib, /usr/lib, /lib64, /usr/lib64にある

vagrant@vagrant-ubuntu-trusty-64:~/bitcoin$ ls -lp /lib
total 416
drwxr-xr-x 2 root root 4096 May 14 21:28 apparmor/
lrwxrwxrwx 1 root root 21 May 14 22:21 cpp -> /etc/alternatives/cpp
drwxr-xr-x 4 root root 4096 May 14 21:28 cryptsetup/
drwxr-xr-x 3 root root 4096 May 14 21:28 firmware/
drwxr-xr-x 2 root root 4096 May 14 21:28 hdparm/
drwxr-xr-x 2 root root 4096 May 14 21:27 ifupdown/
drwxr-xr-x 2 root root 4096 May 14 21:27 init/
-rwxr-xr-x 1 root root 71528 Jun 13 2017 klibc-gLiulUM5C1Zpwc25rCxX8UZ6S-s.so
lrwxrwxrwx 1 root root 22 Nov 1 2013 libcryptsetup.so.4 -> libcryptsetup.so.4.5.0
-rw-r–r– 1 root root 145552 Nov 1 2013 libcryptsetup.so.4.5.0
lrwxrwxrwx 1 root root 17 Jan 8 2014 libip4tc.so.0 -> libip4tc.so.0.1.0
-rw-r–r– 1 root root 27392 Jan 8 2014 libip4tc.so.0.1.0
lrwxrwxrwx 1 root root 17 Jan 8 2014 libip6tc.so.0 -> libip6tc.so.0.1.0
-rw-r–r– 1 root root 31520 Jan 8 2014 libip6tc.so.0.1.0
lrwxrwxrwx 1 root root 16 Jan 8 2014 libiptc.so.0 -> libiptc.so.0.0.0
-rw-r–r– 1 root root 5816 Jan 8 2014 libiptc.so.0.0.0
lrwxrwxrwx 1 root root 20 Jan 8 2014 libxtables.so.10 -> libxtables.so.10.0.0
-rw-r–r– 1 root root 47712 Jan 8 2014 libxtables.so.10.0.0
drwxr-xr-x 3 root root 4096 May 14 21:27 lsb/
drwxr-xr-x 2 root root 4096 May 14 21:28 modprobe.d/
drwxr-xr-x 3 root root 4096 May 14 21:28 modules/
drwxr-xr-x 2 root root 4096 May 14 21:28 modules-load.d/
drwxr-xr-x 3 root root 4096 May 14 21:27 plymouth/
drwxr-xr-x 3 root root 4096 May 14 21:29 recovery-mode/
drwxr-xr-x 2 root root 4096 May 14 21:28 resolvconf/
drwxr-xr-x 3 root root 4096 May 14 21:28 systemd/
drwxr-xr-x 15 root root 4096 Mar 22 2014 terminfo/
drwxr-xr-x 4 root root 4096 May 14 22:21 udev/
drwxr-xr-x 2 root root 4096 May 14 21:29 ufw/
drwxr-xr-x 5 root root 12288 May 14 22:21 x86_64-linux-gnu/
drwxr-xr-x 2 root root 4096 May 14 21:28 xtables/

agrant@vagrant-ubuntu-trusty-64:~/bitcoin$ ls -lp /lib64
total 0
lrwxrwxrwx 1 root root 32 Mar 27 2019 ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-2.19.so

ライブラリにはシェアードライブラリ(ダイナミックライブラリ)とスタティックライブラリがある
例えば、拡張子が”.so”はShared Objectの略
ライブラリはオブジェクトファイルをアーカイブした集合体で拡張子は.a

シェアードライブラリは動的に決定される
vagrant@vagrant-ubuntu-trusty-64:~/bitcoin$ ldd /bin/bash
linux-vdso.so.1 => (0x00007ffd3e1ac000)
libtinfo.so.5 => /lib/x86_64-linux-gnu/libtinfo.so.5 (0x00007fc634203000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc633fff000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc633c36000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc63442c000)

ビルドの実態は、コンパイラやリンカのコマンドを順次実行している
makeは主にc言語で記述したプログラムのビルドに使う　MySQLなどではCMake

build.gradle(module:app)

android {
    compileSdkVersion 26
    defaultConfig {
        applicationId "jp.hoge.anew"
        minSdkVersion 15
        targetSdkVersion 26
        resConfigs "ja", "xxhdpi"
        versionCode 1
        versionName "1.0"
        testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
    }
    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
        }
        debug {
            ext.enableCrashlytics = false
        }
    }
    aaptOptions {
        cruncherEnabled false
    }
}

少し早くなった印象。

First Set Algorithm

for each nonterminal X do First(X) :={}
while there are changes to any First(X) do
	for each production rule X ->X1X2...Xn do
		k:=1;
		while k<=n do
			First(X) = First(X) U (First(Xk)-{ε})
			if ε is not in First(Xk) then break;
			k := k+1;
		if (k>n) then First(X) = First(X) U {ε}

follow sets

stmt -> if-stmt|other
if-stmt -> if(exp)stmt else-part
else-part -> else stmt | ε
exp -> 0 | 1

Regular Expressions

Regular Expressions are used in grep, sed, awk, perl, vi, shells, lex, yacc
– Each may use slightly different convention

State Machines

Parser Classification
LR: Bottom Up Parser
-L: scan left to right
-R: traces rightmost derivation of input string

LL: Top Down Parser
-L: Scan left to right
-L: traces leftmost derivation of input string

Typical notation
– LL(0), LL(1), LR(1), LR(k)
– number(k) refers to maximum look ahead

Recursive Descent Parsing
Non-terminal variable
convert
correct select rule for expansion matchToken
consumes token or Error

EBNF to BNF
 ::=  '*'  | 
EBNF
 ::=  {'*'  }
BNF
 ::= 
 ::= '*'   | ε

RegEx: Finite Automata

Deterministic Finite Automata (DFA)

Lexical Analysis(Scanner)
Read input: one character at a time
Group characters into tokens
Remove white spaces and comments
Encode token types and form tuples and return to parser

type value
FloatConst 123.45
VAR String=DaysOfWeek
Operator Value =+

Regular expression r (a pattern of characters) and its language L(r)
-> used in many Unix programs (e.g., grep, vi., etc.)
State machine (finite automata)

Representations of String
Basics of Regular Expression
Σ= AaBbCcDdEeFf…. {}|[]”;’<>?,./

Symbols & alphabet
-> A symbol is a valid character in a language
-> alphabet is set of legal symbols

Metacharacters/metasymbols that have special meanings:
Defining reg-ex operations(e.g. |,(,), *, + etc.)
Escape character(\) to turn off special meanings
Empty string e and empty set

a | b* = {a, ε, b, bb, bbb, …}
(a | b)* = {ε, a, b, ab, ba, aa, bb, aaa, aab, …} = Any number (including zero) of a’s and b’s in any order

unix-style regular expression
a-b same as abcd
^1-3 denotes characters other than 1, 2, or 3
-, [], +, ?, ^ and $
[a-zA-Z_][a-zA-Z_0-9]*
[0-9][0-9]*
(+|-)?[0-9][0-9]*

semantic symbols

-> #put-type;#do-decl
-> int | float
-> #add-decl -> #add-decl -> ID#proc-decl #put-type puts given type on semantic stack #proc-decl builds decl record for var on stack #add-decl builds decl-chain p#do-decl traverses chain on semantic stack using backwards pointers entering each var into symbol table

float a, b, c

translation: generate temporary values, propagate them to keep semantic context

Compiler Structure
scanner -> parser -> semantic analysis -> intermdeiate representation -> code generation -> assembly code

semantic actions

Symbol Table

int a, b;
declares a and b, within current scope, type integer

Basic symbol table
name, type, scope
a, int, “main”

semantic action
1. enter variable declaration into symbol table
2. look up variables in symbol table
3. do binding of looked-up variables (scoping rules, etc.)
4. Do type checking for compatibility
5. keep the semantic context of processing
e.g. a + b + c -> t1 = a + b
t2 = t1 + c

How Compilers Work

Start
-parse
-get next token
-semantic analysis
-look up variable name
-do semantic checks -> generate code(passed) or semantic error(failed)
-put in symbol table

Front End: Analysis
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis

Back End: Systhesis
4. Code Generation
5. Optimization

Lexical rules of language dictate how legal word is formed by concatenating alphabet.

If x==1 then if y==2 print 1 else print 2
stmt -> if expr then stmt else stmt

why compilers?

High-level Language

C/Java Compiler

temp = v[k];
v[k] = v[k+1];
v[k+1] = temp;

Fortran Compiler

TEMP = V(K)
V(K) = V(K+1)
V(K+1) = TEMP

Assembly Language

lw $to, 0($2)
lw $t1, 4($2)
sw $t1, 0($2)
sw $to, 4($2)

#include

int main()
{
	char A = '2';
	int sum = A + 10;
	std::cout<<"sum ="<
		Author  blogPosted on 2017年12月24日2017年12月24日Categories Compiler