Regular expressions that match the start and end of a string

I want to show an alert when a line contains only a dot.
That requires a regular expression anchored at both the start and the end of the string.

Does the string start with xx?

preg_match("/^xx/", $char);

Does the string end with xx?

preg_match("/xx$/", $char);
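
As a minimal sketch, the two anchors above can be wrapped in small helpers. The function names startsWith and endsWith are hypothetical and not part of the original post; preg_quote is used so that characters such as "." in the search string are matched literally.

```php
<?php
// Hypothetical helpers for illustration only.
function startsWith(string $text, string $prefix): bool
{
    // "^" anchors the pattern at the start of the string.
    return preg_match('/^' . preg_quote($prefix, '/') . '/', $text) === 1;
}

function endsWith(string $text, string $suffix): bool
{
    // "$" anchors at the end; the D modifier stops "$" from also
    // matching just before a trailing newline.
    return preg_match('/' . preg_quote($suffix, '/') . '$/D', $text) === 1;
}

var_dump(startsWith('xxabc', 'xx')); // bool(true)
var_dump(endsWith('abcxx', 'xx'));   // bool(true)
var_dump(startsWith('abc', '.'));    // bool(false), the dot is literal here
```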

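Now for the main topic. To check whether a string consists only of dots, combine the two anchors above. The dot must be escaped as \. because an unescaped . matches any character.

$char = '.';

// \.+ matches one or more literal dots between the start and the end
if(preg_match('/^\.+$/', $char)){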
	echo "please write again.";
} else {
	echo "OK";
}

-> please write again.

Yes, that is as expected.
Next, let's apply this: read a file and report an error if any line consists only of dots.

The test file is a well-known passage from Hamlet.
test.txt

To be, or not to be: that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And by opposing end them? To die: to sleep;
No more; and by a sleep to say we end
The heart-ache and the thousand natural shocks
That flesh is heir to, 'tis a consummation
Devoutly to be wish'd. To die, to sleep;

app.php

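// The sample file above is named test.txt
$file = fopen("test.txt", "r");

if($file){
	// Compare against false so that a line containing only "0" does not end the loop
	while(($line = fgets($file)) !== false){
		// Remove the trailing line break before matching
		$line = rtrim($line, "\r\n");

		// Check the line that was just read; the dot is escaped to match literally
		if(preg_match('/^\.+$/', $line)){
			echo "please write again.<br>";
		} else {
			echo "OK<br>";
		}

	}
	fclose($file);
}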


It looks like it is going well😂
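
As a sketch under a few assumptions, the same check can be written as a function that takes the lines as an array and returns the numbers of the offending lines. The name findDotOnlyLines and the sample lines are hypothetical; "only dots" is interpreted here as one or more dots.

```php
<?php
// Hypothetical helper: returns the 1-based numbers of lines made up only of dots.
function findDotOnlyLines(array $lines): array
{
    $hits = [];
    foreach ($lines as $index => $line) {
        // Strip line endings so "\r\n" files behave like "\n" files.
        $line = rtrim($line, "\r\n");
        if (preg_match('/^\.+$/D', $line) === 1) {
            $hits[] = $index + 1;
        }
    }
    return $hits;
}

// Made-up sample input for illustration.
$lines = ["To be, or not to be:", ".", "that is the question:", "...", "a.b"];
print_r(findDotOnlyLines($lines)); // reports lines 2 and 4
```

With a real file, the array could come from file('test.txt', FILE_IGNORE_NEW_LINES), which returns one element per line.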

Count the number of characters, bytes, width in PHP

How do you count the number of characters, the number of bytes, and the display width (apparent length) of a string in PHP?

1. strlen
function strlen
To get the number of bytes, use strlen.

mb_internal_encoding('UTF-8');
$char = 'To die: to sleep No more; and by a sleep to say we end. The heart-ache and the thousand natural shocks. That flesh is heir to, tis a consummation Devoutly to be wishd. ';
echo strlen($char);

168

2. mb_strlen
function mb_strlen
Use mb_strlen to count the number of characters. Each character counts as 1, whether it is full-width or half-width and however many bytes it takes.

mb_internal_encoding('UTF-8');
$char = '名前(カタカナ)';

echo mb_strlen($char);

3. mb_strwidth
function mb_strwidth
Use mb_strwidth to count the character width (apparent length). Half-width characters count as 1 and full-width characters count as 2.

mb_internal_encoding('UTF-8');
$char = 'おはようございます。';

echo mb_strwidth($char);
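
To compare the three functions side by side, here is a small sketch. The helper name measure is hypothetical, and it assumes the mbstring extension is enabled and the source file is saved as UTF-8.

```php
<?php
mb_internal_encoding('UTF-8');

// Hypothetical helper that gathers the three measurements for one string.
function measure(string $text): array
{
    return [
        'bytes' => strlen($text),               // raw byte count
        'chars' => mb_strlen($text, 'UTF-8'),   // number of characters
        'width' => mb_strwidth($text, 'UTF-8'), // half-width = 1, full-width = 2
    ];
}

print_r(measure('abc'));       // bytes 3, chars 3, width 3
print_r(measure('abcあいう')); // bytes 12, chars 6, width 9
```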

Let’s add an emoji and count the bytes.

mb_internal_encoding('UTF-8');
$char = 'こんにちは😀';

echo strlen($char);

-> 19
That is as expected: five hiragana at 3 bytes each plus one 4-byte emoji.

Next, let’s display an alert if the string is longer than 10 bytes.

mb_internal_encoding('UTF-8');
$char = 'こんにちは😀';

// echo strlen($char);
if(strlen($char) > 10){
	echo "this is more than 10 bytes, please write again";
} else {
	echo "confirmed.";
}

-> this is more than 10 bytes, please write again

It works as intended.

Checking for 4-byte characters in PHP

Some emoji and kanji are encoded as 4 bytes in UTF-8, and depending on the MySQL version and character set they cannot be saved (MySQL's utf8 character set stores at most 3 bytes per character; utf8mb4 is needed for 4-byte characters). So I want to check whether a string contains any 4-byte UTF-8 characters.

Let’s look at a sample. The cherry blossom emoji is a 4-byte character.

mb_internal_encoding('UTF-8');
$char = 'abcdefgあいうえお🌸';

echo preg_replace('/[\xF0-\xF7][\x80-\xBF][\x80-\xBF][\x80-\xBF]/', '', $char);

abcdefgあいうえお
Wow, it’s impressive.

OK then, I want to display an alert if the string includes 4-byte characters.
Let’s use the regular expression below.

mb_internal_encoding('UTF-8');
$char = 'abcdefgあいうえお';
// $char = 'abcdefgあいうえお🌸';

// echo preg_replace('/[\xF0-\xF7][\x80-\xBF][\x80-\xBF][\x80-\xBF]/', '', $char);

if(preg_match('/[\xF0-\xF7][\x80-\xBF][\x80-\xBF][\x80-\xBF]/', $char)){
	echo "alert: the string includes a 4-byte character";
} else {
	echo "there is no 4-byte character";
}
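
An alternative sketch, assuming the input is valid UTF-8: with the u modifier, PCRE works on code points instead of bytes, and the characters that take 4 bytes in UTF-8 are exactly those from U+10000 to U+10FFFF. The function name containsFourByteChar is hypothetical.

```php
<?php
// Hypothetical helper: true when the string contains a character that
// needs 4 bytes in UTF-8 (code points U+10000 to U+10FFFF).
function containsFourByteChar(string $text): bool
{
    // preg_match returns false for invalid UTF-8 under the u modifier,
    // so comparing with 1 treats invalid input as "no match".
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

var_dump(containsFourByteChar('abcdefgあいうえお'));   // bool(false)
var_dump(containsFourByteChar('abcdefgあいうえお🌸')); // bool(true)
```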

Oh my goodness.
“just trust yourself. surely, the way to live is coming. Goethe”

String Byte conversion for PHP

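Let’s look up the hex2bin function in the PHP manual.
hex2bin

It says “Decodes hexadecimally encoded binary string”. The examples below use its inverse, bin2hex, which converts binary data into its hexadecimal representation.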

Example 1

mb_internal_encoding('UTF-8');
$char = 'あ';

echo bin2hex($char);

result: e38182

The reference for Unicode is the Unicode Standard.
Unicode.org
UTF-8 encodes each character in 1 to 4 bytes.

What happens in the case of emoji?
🌸 Cherry blossom: U+1F338

mb_internal_encoding('UTF-8');
$char = '🌸';

echo bin2hex($char);

f09f8cb8
The output is different from the Unicode code point: f09f8cb8 is the 4-byte UTF-8 encoding of U+1F338, not the code point itself.
The conversion was done without problems.
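
To see how a code point turns into those bytes, here is a minimal sketch that encodes a code point as UTF-8 by hand. The function name utf8Encode is hypothetical, and the sketch does not validate its input (it assumes a valid code point up to 0x10FFFF that is not a surrogate).

```php
<?php
// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// No validation is done; surrogates and values above 0x10FFFF are not rejected.
function utf8Encode(int $cp): string
{
    if ($cp < 0x80) {
        // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(utf8Encode(0x3042)), "\n";  // e38182 (あ)
echo bin2hex(utf8Encode(0x1F338)), "\n"; // f09f8cb8 (🌸)
```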

Character codes mainly used in Japan

Character Code
A character code is the correspondence between the characters used on a computer and the byte values assigned to each character. Because computers are used in many language regions, many kinds of character codes have been created. It is said that there are more than 100 typical character codes.

Mainly used in Japan
JIS Code
The official name is “ISO-2022-JP”. It has been widely used, especially for e-mail.

SJIS(Shift-JIS) Code
It is essentially ASCII plus Japanese characters, and it has been used on Japanese domestic mobile phones.

EUC
EUC-JP, the Japanese variant, has been widely used on UNIX.
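
As a small sketch, the same character can be converted into some of these encodings with mb_convert_encoding and inspected with bin2hex. This assumes the mbstring extension is enabled and that the source file is saved as UTF-8.

```php
<?php
// The literal below is UTF-8 because the source file is assumed to be UTF-8.
$utf8 = 'あ';

foreach (['UTF-8', 'SJIS', 'EUC-JP'] as $encoding) {
    // Convert from UTF-8 to the target encoding and show the raw bytes.
    $bytes = mb_convert_encoding($utf8, $encoding, 'UTF-8');
    printf("%-6s %s\n", $encoding, bin2hex($bytes));
}
// UTF-8  e38182
// SJIS   82a0
// EUC-JP a4a2
```

The same character needs 3 bytes in UTF-8 but 2 bytes in Shift-JIS and EUC-JP, which is why byte counts from strlen depend on the encoding.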


Unicode consists of a “coded character set” and “character encoding methods (encodings)”.

A “character set” is a collection of characters grouped by some rule, for example “all hiragana” or “all letters of the alphabet”. A rule that assigns a unique number to each character in the set is called a “coded character set”. The assigned number is called a code point and is written in the form “U+xxxx”.

A “character encoding method” is a way of converting the code points of a coded character set into byte sequences so that a computer can handle them. The encoding methods include UTF-8 and UTF-16.

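Unicode code point ranges
0x0000 – 0x007f : ASCII (1 byte in UTF-8)
0x0080 – 0x07ff : alphabets of various languages, such as Latin with diacritics, Greek, and Cyrillic (2 bytes in UTF-8)
0x0800 – 0xffff : Indic scripts, punctuation marks, technical symbols, pictograms, East Asian characters, and full-width and half-width forms (3 bytes in UTF-8)
0x10000 – 0x10ffff : supplementary characters such as emoji (4 bytes in UTF-8)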
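
The ranges above decide how many bytes UTF-8 needs for a code point. The sketch below computes that from the range boundaries and compares it with the real byte length. The function name utf8ByteLength is hypothetical, and mb_chr requires PHP 7.2 or later with mbstring.

```php
<?php
// Hypothetical helper: number of bytes UTF-8 needs for a code point,
// decided only by which range the code point falls in.
function utf8ByteLength(int $cp): int
{
    if ($cp <= 0x7F) {
        return 1;
    }
    if ($cp <= 0x7FF) {
        return 2;
    }
    if ($cp <= 0xFFFF) {
        return 3;
    }
    return 4; // U+10000 to U+10FFFF
}

foreach ([0x41, 0xE9, 0x3042, 0x1F338] as $cp) {
    // mb_chr builds the actual character so strlen can confirm the prediction.
    $actual = strlen(mb_chr($cp, 'UTF-8'));
    printf("U+%04X expected %d, actual %d\n", $cp, utf8ByteLength($cp), $actual);
}
```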

OK!

Linux grep option -e

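What is the Linux grep option -e?

The -e option specifies a search pattern. When -e is given more than once, grep prints the lines that match any of the patterns, so it works as an “or” search.
Let’s look at the example below.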

command

sudo cat cron | grep 'Jan 12 11:0[0-9]' | grep -e run-parts -e anacron
[vagrant@localhost log]$ sudo cat cron | grep 'Jan 12 11:0[0-9]' | grep -e run-parts -e anacron
Jan 12 11:01:01 localhost CROND[8441]: (root) CMD (run-parts /etc/cron.hourly)
Jan 12 11:01:01 localhost run-parts(/etc/cron.hourly)[8441]: starting 0anacron
Jan 12 11:01:01 localhost anacron[8450]: Anacron started on 2019-01-12
Jan 12 11:01:01 localhost run-parts(/etc/cron.hourly)[8452]: finished 0anacron
Jan 12 11:01:01 localhost anacron[8450]: Will run job `cron.daily' in 7 min.
Jan 12 11:01:01 localhost anacron[8450]: Jobs will be executed sequentially
Jan 12 11:08:01 localhost anacron[8450]: Job `cron.daily' started
Jan 12 11:08:01 localhost run-parts(/etc/cron.daily)[8453]: starting logrotate
Jan 12 11:08:01 localhost run-parts(/etc/cron.daily)[8461]: finished logrotate
Jan 12 11:08:01 localhost run-parts(/etc/cron.daily)[8453]: starting makewhatis.cron
Jan 12 11:08:03 localhost run-parts(/etc/cron.daily)[8585]: finished makewhatis.cron
Jan 12 11:08:03 localhost anacron[8450]: Job `cron.daily' terminated
Jan 12 11:08:03 localhost anacron[8450]: Normal exit (1 job run)
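
For comparison, the same “or” filter can be sketched in PHP with preg_grep and an alternation. The helper name grepAny is hypothetical, the search terms are treated as literal strings (grep itself treats them as regular expressions), and the third sample line is made up to show a non-matching line.

```php
<?php
// Hypothetical helper: keeps the lines that contain any of the given literal strings.
function grepAny(array $lines, array $needles): array
{
    // Escape each term, then join them with "|" to form an OR pattern.
    $quoted = array_map(fn(string $n): string => preg_quote($n, '/'), $needles);
    $pattern = '/' . implode('|', $quoted) . '/';

    // preg_grep keeps the original keys, so reindex the result.
    return array_values(preg_grep($pattern, $lines));
}

$lines = [
    'Jan 12 11:01:01 localhost CROND[8441]: (root) CMD (run-parts /etc/cron.hourly)',
    'Jan 12 11:01:01 localhost anacron[8450]: Anacron started on 2019-01-12',
    'Jan 12 11:02:00 localhost example[1]: made-up line that matches nothing',
];

print_r(grepAny($lines, ['run-parts', 'anacron'])); // keeps the first two lines
```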

PHP google translate API

How do you use the Google Translation API from PHP?

Make a directory and clone the sample repository.

[vagrant@localhost translation]$ git clone https://github.com/GoogleCloudPlatform/php-docs-samples
Initialized empty Git repository in /home/vagrant/app/translation/php-docs-samples/.git/
remote: Enumerating objects: 79, done.
remote: Counting objects: 100% (79/79), done.
remote: Compressing objects: 100% (65/65), done.
remote: Total 10046 (delta 27), reused 35 (delta 9), pack-reused 9967
Receiving objects: 100% (10046/10046), 10.07 MiB | 530 KiB/s, done.
Resolving deltas: 100% (6246/6246), done.

[vagrant@localhost php-docs-samples]$ ls
CONTRIBUTING.md bigtable error_reporting logging trace
LICENSE cloud_sql favicon.ico monitoring translate
README.md compute firestore pubsub video
appengine datastore iap spanner vision
asset debugger iot speech
auth dialogflow jobs storage
bigquery dlp kms testing
bigquerydatatransfer endpoints language texttospeech

translate.php

/**
 * Copyright 2016 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Console\Application;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputArgument;
use Symfony\Component\Console\Input\InputOption;

$application = new Application('Google Cloud Translate API');

// Add Detect Language command
$application->add(new Command('detect-language'))
    ->setDescription('Detect which language text was written in using Google Cloud Translate API')
    ->addArgument('text', InputArgument::REQUIRED, 'The text to examine.')
    ->setHelp(<<<EOF
The <info>%command.name%</info> command detects which language text was written in using the Google Cloud Translate API.

    <info>php %command.full_name% "Your text here"</info>

EOF
    )
    ->setCode(function ($input, $output) {
        $text = $input->getArgument('text');
        require __DIR__ . '/src/detect_language.php';
    });

// Add List Codes command
$application->add(new Command('list-codes'))
    ->setDescription('List all the language codes in the Google Cloud Translate API')
    ->setHelp(<<<EOF
The <info>%command.name%</info> command lists all the language codes in the Google Cloud Translate API.

    <info>php %command.full_name%</info>

EOF
    )
    ->setCode(function ($input, $output) {
        require __DIR__ . '/src/list_codes.php';
    });

// Add List Languages command
$application->add(new Command('list-langs'))
    ->setDescription('List language codes and names in the Google Cloud Translate API')
    ->addOption('target-language', 't', InputOption::VALUE_REQUIRED,
        'The ISO 639-1 code of language to use when printing names, eg. \'en\'.')
    ->setHelp(<<<EOF
The <info>%command.name%</info> lists language codes and names in the Google Cloud Translate API.

    <info>php %command.full_name% -t en</info>

EOF
    )
    ->setCode(function ($input, $output) {
        $targetLanguage = $input->getOption('target-language');
        require __DIR__ . '/src/list_languages.php';
    });

// Add Translate command
$application->add(new Command('translate'))
    ->setDescription('Translate text using Google Cloud Translate API')
    ->addArgument('text', InputArgument::REQUIRED, 'The text to translate.')
    ->addOption('model', null, InputOption::VALUE_REQUIRED, 'The model to use, "base" for standard and "nmt" for premium.')
    ->addOption('target-language', 't', InputOption::VALUE_REQUIRED,
        'The ISO 639-1 code of language to use when printing names, eg. \'en\'.')
    ->setHelp(<<<EOF
The <info>%command.name%</info> command translates text using the Google Cloud Translate API.

    <info>php %command.full_name% -t ja "Hello World."</info>

EOF
    )
    ->setCode(function ($input, $output) {
        $text = $input->getArgument('text');
        $targetLanguage = $input->getOption('target-language');
        $model = $input->getOption('model');
        if ($model) {
            require __DIR__ . '/src/translate_with_model.php';
        } else {
            require __DIR__ . '/src/translate.php';
        }
    });

// for testing
if (getenv('PHPUNIT_TESTS') === '1') {
    return $application;
}

$application->run();

Google Translate API Pricing

This is the Google Cloud Translation API page.
Cloud Translation API document

I would like to use the Translation API for Slack.

The function I want to make
– when a user writes Japanese in Slack, a Slack bot translates the sentence into English.
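
As a rough sketch of the first step such a bot would need, the function below decides whether a message contains Japanese characters before anything is sent for translation. The name containsJapanese is hypothetical, and the check is only a heuristic: the Han script also matches Chinese text.

```php
<?php
// Hypothetical helper: true when the text contains hiragana, katakana, or kanji.
// This is a heuristic; \p{Han} also matches Chinese text.
function containsJapanese(string $text): bool
{
    // The u modifier makes PCRE treat the pattern and subject as UTF-8.
    return preg_match('/[\p{Hiragana}\p{Katakana}\p{Han}]/u', $text) === 1;
}

var_dump(containsJapanese('Hello World.'));     // bool(false)
var_dump(containsJapanese('こんにちは'));        // bool(true)
var_dump(containsJapanese('Deploy は完了です')); // bool(true)
```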

Pricing according to the official document
Usage: 0 to 1 billion (1,000,000,000) characters
Translation $20 per 1,000,000 characters
Language Detection $20 per 1,000,000 characters
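
To get a feel for the cost, here is a small sketch of the arithmetic. The function name is hypothetical, and the default rate of $20 per 1,000,000 characters is simply the figure quoted above, so it may not match current pricing.

```php
<?php
// Hypothetical helper: estimated cost in US dollars for a number of characters.
// The default rate is the $20 per 1,000,000 characters quoted in this post.
function estimateTranslationCostUsd(int $characters, float $usdPerMillion = 20.0): float
{
    return $characters * $usdPerMillion / 1000000;
}

// Example: 50,000 characters of Slack messages.
printf("%.2f\n", estimateTranslationCostUsd(50000)); // 1.00
```

If both language detection and translation are billed for the same text, the two amounts add up, so the same 50,000 characters would cost about twice as much.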

What is language detection?
-> It determines which language a text is written in.

What should I do?
-> Check the PHP sample for the Google Translation API.

Here is the GitHub repository to get the code from.
https://github.com/GoogleCloudPlatform/php-docs-samples/tree/master/translate

The Translation API is convenient, but it feels somewhat expensive.

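Engineers and expressive ability in English

For engineers, English ability is essential.
There is no doubt about that.

English involves listening, reading, and writing. I used to think reading was the most important, but apparently I was far too naive.
– listening
– reading
– writing

What really matters is the ability to express yourself in English.
To work at a high level, you need English good enough to give persuasive explanations and to make work go more smoothly.
That said, this cannot be acquired quickly, so what is the best thing to do?

Plan 1: Write a blog in English
Plan 2: Build a service in English
Plan 3: Move abroad

What should I do?
There is absolutely no doubt that English is important.
Maybe I will try Plan 1, aiming for about one to two years.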