Azure: translate with PHP and speak the result on click

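The PHP script decodes the JSON returned by the Translator Text API and writes the translation into a div. On click, JavaScript reads that text from document.getElementById("text1") and speaks it with speechSynthesis.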

<?php

$key = 'hoge';

$host = "https://api.cognitive.microsofttranslator.com";
$path = "/translate?api-version=3.0";
$params = "&to=en";
$text = "当社は中長期的かつ持続的な企業価値の向上を目指しており、そのためには、将来の成長を見据えた
サービスへの先行投資や設備投資、資本業務提携を積極的に行うことが重要だと認識しています。同時
に、利益還元を通じて株主の皆さまに報いることが上場会社としての責務と捉えています。
上記方針のもと、当期の期末配当金につきましては、1 株当たり8.86 円(配当金総額は504 億円)と
いたしました。
当社はこれからも、将来の成長のための投資を継続しながら、株主の皆さまへの適切な利益還元を行
うことにより、企業価値の向上を目指していきます。";

if(!function_exists('com_create_guid')){
	function com_create_guid(){
		return sprintf('%04x%04x-%04x-%04x-%04x-%04x%04x%04x',
		mt_rand(0, 0xffff), mt_rand(0, 0xffff),
		mt_rand(0, 0xffff),
		mt_rand(0, 0x0fff) | 0x4000,
		mt_rand(0, 0x3fff) | 0x8000,
		mt_rand(0, 0xffff), mt_rand(0, 0xffff), mt_rand(0, 0xffff));
	}
}

function Translate ($host, $path, $key, $params, $content) {
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n" .
        "X-ClientTraceId: " . com_create_guid() . "\r\n";
    $options = array (
        'http' => array (
            'header' => $headers,
            'method' => 'POST',
            'content' => $content
        )
    );
    $context  = stream_context_create($options);
    $result = file_get_contents ($host . $path . $params, false, $context);
    return $result;
}

$requestBody = array (
    array (
        'Text' => $text,
    ),
);
$content = json_encode($requestBody);
$result = Translate($host, $path, $key, $params, $content);

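// file_get_contents() returns false on failure; decode to arrays so isset() can guard every level.
$json = ($result === false) ? null : json_decode($result, true);
$newtext = isset($json[0]['translations'][0]['text']) ? $json[0]['translations'][0]['text'] : '';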
?>
<div id="text1">
    <?php echo htmlspecialchars($newtext, ENT_QUOTES, 'UTF-8'); ?>
</div>
<br>
<button id="btn">speak</button>
<script>
    document.querySelector('#btn').onclick = function(){
        var msg = new SpeechSynthesisUtterance();
        msg.volume = 1;
        msg.rate = 1;
        msg.pitch = 2;
        msg.lang = "en-US";
        msg.text = document.getElementById("text1").innerHTML;

        speechSynthesis.speak(msg);
    }
</script>
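
A possible refinement, sketched under assumptions: the text read from the div carries the template's leading and trailing whitespace, and a single long utterance gives no natural pauses. The helper names toSentences and speakSentences are made up for illustration; only SpeechSynthesisUtterance and speechSynthesis.speak come from the Web Speech API used above.

```javascript
// Collapse whitespace and split into sentences (pure function, no DOM needed).
function toSentences(text) {
  const normalized = text.replace(/\s+/g, ' ').trim();
  if (normalized === '') return [];
  // Break after ., ! or ? when followed by whitespace, so "8.86 yen" stays intact.
  return normalized.split(/(?<=[.!?])\s+/);
}

// Queue one utterance per sentence. Does nothing outside a browser
// that provides the Web Speech API. Returns the number of queued sentences.
function speakSentences(text, lang) {
  if (typeof speechSynthesis === 'undefined') return 0;
  const sentences = toSentences(text);
  for (const sentence of sentences) {
    const msg = new SpeechSynthesisUtterance(sentence);
    msg.lang = lang;
    speechSynthesis.speak(msg);
  }
  return sentences.length;
}
```

In the page above this could be called from the click handler as speakSentences(document.getElementById("text1").textContent, "en-US"). textContent returns plain text, whereas innerHTML returns markup in which characters such as & are still escaped.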

This is amazing.

Setting $params = "&to=zh"; gives:
-----
我们的目标是在中长期内提高公司的价值, 并 我们认识到, 积极参与服务、资本投资和商业联盟的前期投资是重要的。 相同 , 作为上市公司, 我们有责任通过利润回报回报股东。 在上述政策下, 本期年终股息为每股8.86 日元 (总股息为504亿日元) 我们。 我们将继续投资于未来的增长, 并将适当的利润返还给我们的股东。 我们的目标是提高企业价值。
-----
At a quick glance, it looks good.
Next I want to do this with the Bank of Japan's Statements on Monetary Policy.

Translating earnings IR information with PHP

Yahoo, "Notice Regarding the Issuance of New Shares as Restricted Stock Compensation" (259 characters)
当社は中長期的かつ持続的な企業価値の向上を目指しており、そのためには、将来の成長を見据えた
サービスへの先行投資や設備投資、資本業務提携を積極的に行うことが重要だと認識しています。同時
に、利益還元を通じて株主の皆さまに報いることが上場会社としての責務と捉えています。
上記方針のもと、当期の期末配当金につきましては、1 株当たり8.86 円(配当金総額は504 億円)と
いたしました。
当社はこれからも、将来の成長のための投資を継続しながら、株主の皆さまへの適切な利益還元を行
うことにより、企業価値の向上を目指していきます。

$key = 'hoge';

$host = "https://api.cognitive.microsofttranslator.com";
$path = "/translate?api-version=3.0";
$params = "&to=en";
$text = "当社は中長期的かつ持続的な企業価値の向上を目指しており、そのためには、将来の成長を見据えた
サービスへの先行投資や設備投資、資本業務提携を積極的に行うことが重要だと認識しています。同時
に、利益還元を通じて株主の皆さまに報いることが上場会社としての責務と捉えています。
上記方針のもと、当期の期末配当金につきましては、1 株当たり8.86 円(配当金総額は504 億円)と
いたしました。
当社はこれからも、将来の成長のための投資を継続しながら、株主の皆さまへの適切な利益還元を行
うことにより、企業価値の向上を目指していきます。";

if(!function_exists('com_create_guid')){
	function com_create_guid(){
		return sprintf('%04x%04x-%04x-%04x-%04x-%04x%04x%04x',
		mt_rand(0, 0xffff), mt_rand(0, 0xffff),
		mt_rand(0, 0xffff),
		mt_rand(0, 0x0fff) | 0x4000,
		mt_rand(0, 0x3fff) | 0x8000,
		mt_rand(0, 0xffff), mt_rand(0, 0xffff), mt_rand(0, 0xffff));
	}
}

function Translate ($host, $path, $key, $params, $content) {
    $headers = "Content-type: application/json\r\n" .
        "Content-length: " . strlen($content) . "\r\n" .
        "Ocp-Apim-Subscription-Key: $key\r\n" .
        "X-ClientTraceId: " . com_create_guid() . "\r\n";
    $options = array (
        'http' => array (
            'header' => $headers,
            'method' => 'POST',
            'content' => $content
        )
    );
    $context  = stream_context_create($options);
    $result = file_get_contents ($host . $path . $params, false, $context);
    return $result;
}

$requestBody = array (
    array (
        'Text' => $text,
    ),
);
$content = json_encode($requestBody);
$result = Translate($host, $path, $key, $params, $content);

$json = json_decode($result);
echo $json[0]->translations[0]->text;

Output of echo -> 565 characters
-----
We aim to improve our corporate value in the medium to long term and We recognize that it is important to actively engage in upfront investment in services, capital investment, and business alliances. Same , it is our duty as a listed company to repay our shareholders through the return of profits. Under the above policy, the year-end dividend for the current period is 8.86 yen per share (total dividend of 50.4 billion yen) We. We will continue to invest in future growth and to return appropriate profits to our shareholders. We aim to improve corporate value.

Not bad! In fact, it is higher quality, faster, and cheaper than asking a returnee with a finance background (lol).
I want to automate this.
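
One observation on the output above: it contains stray fragments ("and We", "Same ,", "We.") at what look like the positions of the hard line breaks inside $text. That reading is an assumption, not something verified against the API. If it holds, joining the wrapped lines before building the request body may help. A minimal sketch in JavaScript; the helper names joinWrappedLines and buildRequestBody are hypothetical, and in the PHP script the same replacement would go before json_encode.

```javascript
// Japanese has no inter-word spaces, so hard-wrapped lines can be
// joined by simply deleting the newlines.
function joinWrappedLines(text) {
  return text.replace(/\r?\n/g, '');
}

// Build the request body in the same shape the PHP script sends:
// a JSON array with one object holding a "Text" field.
function buildRequestBody(text) {
  return JSON.stringify([{ Text: joinWrappedLines(text) }]);
}
```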

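The free tier covers up to 2 million characters per month, so at 1,000 characters per translation the ceiling is 2,000 requests.
That is a bit tight for automatically translating the EDINET IR filings of every listed stock.
Hmm, what to do?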
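
The arithmetic above as a small sketch. The 2,000,000-character quota and the 1,000-character request size are the figures from this post; the function names and the 3,700-document figure in the usage note are purely hypothetical.

```javascript
// Monthly free quota in characters (figure taken from the post).
const FREE_CHARS_PER_MONTH = 2000000;

// How many requests of a given size fit into the quota.
function maxRequests(charsPerRequest, quota = FREE_CHARS_PER_MONTH) {
  return Math.floor(quota / charsPerRequest);
}

// Whether a batch of equally sized documents stays within the quota.
function fitsFreeTier(documents, charsPerDocument, quota = FREE_CHARS_PER_MONTH) {
  return documents * charsPerDocument <= quota;
}
```

For example, maxRequests(1000) gives 2000, and a hypothetical batch of 3,700 documents at 1,000 characters each (fitsFreeTier(3700, 1000)) would not fit.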

MySQL: is "varchar" OK for Chinese too?

mysql> create table **.**(
    ->     id int,
    ->     word1 varchar(255),
    ->     word2 varchar(255),
    ->     word3 varchar(255),
    ->     word4 varchar(255),
    ->     word5 varchar(255),
    ->     word6 varchar(255),
    ->     word7 varchar(255),
    ->     word8 varchar(255),
    ->     word9 varchar(255),
    ->     word10 varchar(255)
    ->     );
Query OK, 0 rows affected (0.22 sec)

mysql> insert into **.** values
    -> (1, '关键词01','关键词02','关键词03','关键词04','关键词05','关键词06','关键词07','关键词08','关键词09','关键词10');
Query OK, 1 row affected (0.05 sec)

mysql> select * from **;
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
| id   | word1       | word2       | word3       | word4       | word5       | word6       | word7       | word8       | word9       | word10      |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
|    1 | 关键词01    | 关键词02    | 关键词03    | 关键词04    | 关键词05    | 关键词06    | 关键词07    | 关键词08    | 关键词09    | 关键词10    |
+------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+
1 row in set (0.00 sec)

Looks fine. CHAR is fixed length and VARCHAR is variable length.
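
A related point the session above does not show: the 255 in VARCHAR(255) is a limit in characters, not bytes, so a Chinese string occupies more bytes than its character count suggests while still fitting the column. The sketch below compares the two counts in JavaScript for one of the inserted values. charCount and utf8Bytes are illustrative names, and it assumes the column uses a UTF-8 character set, which the session does not confirm.

```javascript
// Number of characters (Unicode code points) in a string.
function charCount(s) {
  return Array.from(s).length;
}

// Number of bytes the string occupies when encoded as UTF-8.
function utf8Bytes(s) {
  return new TextEncoder().encode(s).length;
}

const sample = '关键词01';
console.log(charCount(sample), utf8Bytes(sample));
```

Each of the three Chinese characters takes 3 bytes in UTF-8 and each digit takes 1.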

Character encodings for Chinese (language code zh in ISO 639-1; CN is the ISO 3166-1 alpha-2 country code for China)

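Chinese is written in Traditional and Simplified characters. Traditional is used mainly in Taiwan, Simplified mainly in mainland China.
character set
Traditional: big5
Simplified: GB2312 (EUC_CN)

So how are character encodings actually used on real sites?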

中国工商银行 (Industrial and Commercial Bank of China)

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>欢迎光临中国工商银行东京网站</title>
<META name="ICBCChannel" content="海外分行">

Agricultural Bank of China

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
    <title>中国农业银行</title>

中国银行
Bank of China

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>中国银行全球门户网站</title>
<meta content="中国银行,中行,银行,保险,基金管理,直接投资,投资管理,飞机租赁,外汇,理财,金融,网上银行,网银,电子银行,手机银行,公司金融,个人金融,银行卡" name="keywords" />

Every one of them declares charset utf-8, so UTF-8 looks like the safe choice.
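
The manual check above could be scripted roughly like this. It is a sketch only: declaredCharset is a hypothetical helper, and a regular expression over raw HTML is a shortcut that assumes the charset is declared in a meta tag, as in the three pages shown.

```javascript
// Return the charset declared in the first <meta> tag that has one,
// lowercased, or null if no meta tag declares a charset.
function declaredCharset(html) {
  const match = html.match(/<meta[^>]*charset\s*=\s*["']?([\w-]+)/i);
  return match ? match[1].toLowerCase() : null;
}

const head = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />';
console.log(declaredCharset(head));
```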

Localization Tools

project preparation tools
project execution tools
quality assurance tools

translation management, terminology management, translation memory management

In-Context Exact (ICE) Match: 100%, with matching context
Exact Match: 100%
Fuzzy Match: 99% or lower
No Match: all new words
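
The four categories above can be expressed as a small classifier. This is a sketch, not how any particular translation tool works: classifyMatch is a made-up name, and the fuzzyFloor cutoff of 75 (below which a segment counts as new text) is a hypothetical value that is not in these notes.

```javascript
// score: similarity percentage against the translation memory (0-100).
// contextMatches: whether the surrounding segments also match.
// fuzzyFloor: hypothetical cutoff below which the segment is treated as new.
function classifyMatch(score, contextMatches, fuzzyFloor = 75) {
  if (score === 100) return contextMatches ? 'ICE' : 'Exact';
  if (score >= fuzzyFloor) return 'Fuzzy';
  return 'No Match';
}
```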

Localization process

1. Product Preparation
2. Project Preparation
3. Project Execution
4. Quality Assessment

Language tiering
Chinese, Spanish, English, Hindi, Arabic, Russian

Decide languages -> understand issues and solution -> start designing the app

internationalization:
Process of generalizing a product to handle multiple languages and cultural conventions. It happens during software development.
Design & Engineering, Testing
Density and Fonts, Layout, Spacing, Message Description, Dates, currencies, units, addresses, phone numbers, Plurals and Genders
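
As one example of the plurals item above, a minimal sketch using Intl.PluralRules from the standard ECMAScript Internationalization API. The messages table and the formatShares name are invented for illustration; a real product would load one such table per language.

```javascript
// Hypothetical English message table keyed by plural category.
const messages = { one: '{n} share', other: '{n} shares' };

// Pick the template that matches the plural category of n in the locale.
function formatShares(n, locale = 'en') {
  const category = new Intl.PluralRules(locale).select(n);
  const template = messages[category] || messages.other;
  return template.replace('{n}', String(n));
}
```

The point of going through plural categories instead of an "n === 1" check is that other languages have more categories (or fewer) than English.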

Pseudolocalization:
Simulation of localized text by replacing source text with fake characters.

Issues it exposes: improperly mirrored interface, untranslated text, not enough space
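
A minimal sketch of the idea, assuming a simple vowel-swapping scheme. pseudoLocalize, the ACCENTED table, and the 30% expansion factor are all hypothetical choices. The brackets make truncated strings easy to spot, the padding simulates longer translations, and any text that shows up unaccented in the UI was never passed through localization.

```javascript
// Accented stand-ins for vowels; other characters are left unchanged.
const ACCENTED = {
  a: 'á', e: 'é', i: 'í', o: 'ó', u: 'ú',
  A: 'Á', E: 'É', I: 'Í', O: 'Ó', U: 'Ú',
};

// Swap vowels for accented ones, pad by the expansion factor, wrap in brackets.
function pseudoLocalize(source, expansion = 0.3) {
  const swapped = source.replace(/[aeiouAEIOU]/g, (ch) => ACCENTED[ch]);
  const padding = '~'.repeat(Math.ceil(source.length * expansion));
  return `[${swapped}${padding}]`;
}

console.log(pseudoLocalize('Deposit'));
```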

Project preparation: Project Evaluation: quotation, schedule
overview of project, estimated number of words, deliverables, deadlines, costs

Localization kit
-content to be translated, terminology, translation memories, style guide, reference material
Glossary: Contain the terms that are commonly used in the project and that need to be consistent throughout.
Translation Memory: database that contains all the previously translated segments from a product
Style guide: document outlining a set of standards and best practices in terms of how to handle a specific project
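
The translation memory described above, together with the "estimated number of words" from the project evaluation, can be sketched as an exact-match lookup. analyzeSegments and countWords are hypothetical names, the memory is modeled as a plain Map, and the two entries reuse Turkish terms that appear later in these notes.

```javascript
// Count whitespace-separated words in a segment.
function countWords(segment) {
  const trimmed = segment.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Reuse exact matches from the memory; count the words that remain
// to be translated, which is what a quotation would be based on.
function analyzeSegments(segments, memory) {
  const reused = [];
  let newWords = 0;
  for (const segment of segments) {
    if (memory.has(segment)) {
      reused.push(memory.get(segment));
    } else {
      newWords += countWords(segment);
    }
  }
  return { reused, newWords };
}

const memory = new Map([
  ['hotel chain', 'otel zinciri'],
  ['istanbul hotels', 'istanbul otelleri'],
]);
console.log(analyzeSegments(['hotel chain', 'book a room now'], memory));
```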

Project execution: Translation
Quality Evaluation Tools

Localization Project

e.g. Google product team
– develop
– introduce new features
– introduce new versions

-> Google localization team
-> localization production, language services, vendor management, localization operations
Localization Project Manager(LPM)
-> external localization company: language service provider(LSP)
Language managers(Spanish, Hindi, Traditional Chinese)
Lastly, the product team launches the service globally

Localization operations
– technology, business
Vendor Management finds LSPs, builds relationships

-Requesters, localization project managers, language managers, localization operations, vendor management, external language service providers

User Interface

User Interface:
The space where interactions between humans and machines occur.

e.g. ATM
Withdraw, Deposit, Manage Accounts, Account Balance

Desktop software: Applications you download and install onto laptop or desktop computers
Web apps: Similar to traditional software, but you don't need to install them to use them
Mobile apps: Similar to desktop applications, but they run on mobile phones and have different considerations

Where?
UI is found on desktop, web and on the phone.
Who?
UI is used by new users as well as users familiar with the product.
Why?
UI enables users to accomplish goals.
-> lack of product knowledge
Providing message descriptions, Providing Reference material and guidelines

Research what people actually search for in the local market.
Telefoni Cellulari, Cellulari

English, Turkish, Turkish back translation
hotel chain, otel zinciri, hotel chain
holidays, tatil, holiday / vacation
hospitality, misafirperverlik, generosity
istanbul hotels, istanbul otelleri, istanbul hotels
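
A back-translation check like the one in the list above can be sketched as a filter that flags terms whose back translation no longer equals the source. driftedTerms is a hypothetical name, and strict string equality is a deliberately crude test: a flagged row only means a human should look at it.

```javascript
// Rows taken from the list above: source term, Turkish, back translation.
const rows = [
  { source: 'hotel chain', turkish: 'otel zinciri', back: 'hotel chain' },
  { source: 'holidays', turkish: 'tatil', back: 'holiday / vacation' },
  { source: 'hospitality', turkish: 'misafirperverlik', back: 'generosity' },
  { source: 'istanbul hotels', turkish: 'istanbul otelleri', back: 'istanbul hotels' },
];

// Return the source terms whose back translation differs from the source.
function driftedTerms(entries) {
  return entries
    .filter((e) => e.back.toLowerCase() !== e.source.toLowerCase())
    .map((e) => e.source);
}

console.log(driftedTerms(rows));
```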

localization

In house: engineers, translation team, project manager

Marketing Content: Engaging, Persuasive, Well Written
– First contact with a product
– Used by anyone interested in the product
– Designed to attract potential users

Online help
– FAQs
– Software Documentation
– Troubleshooting Manuals

Product knowledge, Consistent Terminology

Where?
Videos are found in online help pages, marketing pages, training platforms, etc.

Revoicing
voice-over, dubbing, narration, audio description, free commentary

Dubbing: Actors' voices are recorded over the original audio track
Subtitling: Written translation of spoken words and on-screen text