Python – Page 10 – Software engineer tech blog

Creating QR codes with Python

$ pip3 install qrcode
$ pip3 install pillow
Installing qrcode also makes the qr command available.

$ qr "text for qrcode" > qrcode.png

Creating a QR code from Python

import qrcode

img = qrcode.make('test text')
img.save('qrcode_test.png')

Specifying the QR code's fill color and background color

qr = qrcode.QRCode(
    version=12,
    error_correction=qrcode.constants.ERROR_CORRECT_H,
    box_size=2,
    border=8
)
qr.add_data('test text')
qr.make()
img = qr.make_image(fill_color="red", back_color="blue")
img.save("qrcode_test2.png")

I haven't figured out how to set an image as the background yet…

[Python] Scraping with scrapy

scrapy is a framework for web scraping.

$ pip3 install scrapy
$ scrapy version
Scrapy 2.5.1
$ scrapy startproject test1

$ cd test1
$ scrapy genspider test2 https://paypaymall.yahoo.co.jp/store/*/item/y-hp600-3/
Created spider 'test2' using template 'basic' in module:
test1.spiders.test2

test1/items.py

import scrapy
class Test1Item(scrapy.Item):
    title = scrapy.Field()

test1/spiders/test2.py

import scrapy
from test1.items import Test1Item

class Test2Spider(scrapy.Spider):
    name = 'test2'
    # allowed_domains takes bare domain names, not URL paths
    allowed_domains = ['paypaymall.yahoo.co.jp']
    start_urls = ['https://paypaymall.yahoo.co.jp/store/*/item/y-hp600-3/']

    def parse(self, response):
        # Yield one item holding the page's <title> element
        yield Test1Item(
            title=response.css('title').extract_first(),
        )

$ scrapy crawl test2

I can't help thinking BS4 would be enough for this, but I'm not sure.

[Python] Scraping with BeautifulSoup

$ sudo pip3 install bs4

import requests
from bs4 import BeautifulSoup

response = requests.get('https://paypaymall.yahoo.co.jp/store/*/item/y-hp600-3/')
soup = BeautifulSoup(response.text, 'html.parser')
title = soup.find('title').get_text()
print(title)

$ python3 main.py
年内配送OK カニかに蟹ズワイガニカニ刺身かにしゃぶ殻Wカット生本ズワイ3箱セットカニ8規格選べる最大3kg かに脚フルポーション剥身姿カニ爪かに鍋越前かに問屋ますよね公式ストア – 通販 – PayPayモール

This looks easier to work with than urllib.

[Python] Scraping with urllib.request

from urllib import request

response = request.urlopen('https://paypaymall.yahoo.co.jp/store/*/item/y-hp600-3/')
content = response.read()
response.close()
html = content.decode()

title = html.split('<title>')[1].split('</title')[0]
print(title)

I see.
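Splitting on the literal '<title>' string works for this page, but it would break if the tag carried attributes or was written in uppercase. As a rough sketch of a slightly more tolerant approach that still uses only the standard library, html.parser can collect the title text. The names TitleParser and extract_title and the sample markup are made up for illustration; they are not part of the script above.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <TITLE> also matches.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped title text, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


# Hypothetical sample markup; in the script above, the input would be
# the `html` string produced by content.decode().
sample = '<html><head><TITLE lang="ja"> Example page </TITLE></head><body></body></html>'
print(extract_title(sample))
```

In the urllib script, calling extract_title(html) would take the place of the two split() calls.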

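[Python] Converting a file's lowercase text to uppercase in bulk

Goal: convert the entire contents of a text file of about 4,000 lines from lowercase to uppercase.

First, convert a string to uppercase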

text = "train/dr1/fcjf0/si1027.wav"

print(text.upper())

$ python3 main.py
TRAIN/DR1/FCJF0/SI1027.WAV

Read the file

with open("file.txt") as f:
	for line in f:
		line = line.rstrip()
		print(line)

Write to a file in uppercase

# Open the output with "with" as well so it is flushed and closed
with open("file.txt") as f, open('myfile.txt', 'a') as file:
    for line in f:
        line = line.rstrip()
        file.write(line.upper() + "\n")
        print(line.upper())

TRAIN/DR1/FCJF0/SI1027.WAV
TRAIN/DR1/FCJF0/SI1657.WAV
TRAIN/DR1/FCJF0/SI648.WAV

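I had git-cloned SincNet, an open-source speech recognition project, and was trying to run it. It needs TIMIT, and when I unzipped TIMIT the folder and file names were in uppercase, while the SincNet program handles them in lowercase. So I needed to convert about 4,000 lines on the SincNet side to uppercase in one go.

Hmm... once I tried it, it turned out to be surprisingly quick.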
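The steps above can be folded into one small function. This is only a sketch: the function name uppercase_file and the temporary file names are invented for the demo. It opens the output with "w" instead of "a", on the assumption that running the script twice should not append a second copy of the converted lines.

```python
import tempfile
from pathlib import Path


def uppercase_file(src, dst, encoding="utf-8"):
    """Copy src to dst with every line upper-cased; return the number of lines written."""
    count = 0
    with open(src, encoding=encoding) as fin, open(dst, "w", encoding=encoding) as fout:
        for line in fin:
            # Strip only the newline so other trailing whitespace is preserved.
            fout.write(line.rstrip("\n").upper() + "\n")
            count += 1
    return count


# Demo on a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "file.txt"
    dst = Path(tmp) / "myfile.txt"
    src.write_text("train/dr1/fcjf0/si1027.wav\ntrain/dr1/fcjf0/si1657.wav\n", encoding="utf-8")
    written = uppercase_file(src, dst)
    print(written)
    print(dst.read_text(encoding="utf-8"), end="")
```

Because the file is processed line by line, memory use stays flat whether the input has 4,000 lines or far more.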

Audio source separation with Python

import warnings
warnings.simplefilter('ignore')
import nussl
import numpy as np
# from common import viz

mix = nussl.AudioSignal('pop.mp3')
# stft = signal1.stft()
repet = nussl.separation.primitive.Repet(mix)
repet.run()
repet_bg, repet_fg = repet.make_audio_signals()

# Will run the algorithm and return AudioSignals in one step
repet = nussl.separation.primitive.Repet(mix)
repet_bg, repet_fg = repet()

repet_bg.istft()
repet_bg.embed_audio()
repet_fg.istft()
repet_fg.embed_audio()

repet_bg.write_audio_to_file('bg2.wav')
repet_fg.write_audio_to_file('fg2.wav')

Reading the text on a passport with tesseract

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract


FILENAME = 'test.jpg'

text = pytesseract.image_to_string(Image.open(FILENAME))
lines = text.split("\n")
for line in lines:
	if '<<<' in line:
		t = line.replace(' ', '')
		print(t)

test1.png
P<GBRUK<SPECIMEN<<ANGELA<ZOE<<<<<<<<<<<<<<<<
5334755143GBR8812049F2509286<<<<<<<<<<c<<<04

test2.png
PDCYPPOLITIS<<ZINONAS<<<<<<<<<<<<<<KKKKKKKKK
FOOOOOD005CYP8012148M3006151<<<<<<<<<<<<<<02

test3.png
PTNOROESTENBYEN<<AASAMUND<SPECIMEN<<<<<<<<<<
FHCO023539NOR5604230M2506126<<<<<<<<<<<<<<00

For some reason, stray characters get mixed in.

def convert(filename):
    """Whiten every pixel that has any channel brighter than the threshold."""
    img = Image.open(filename).convert('RGB')
    size = img.size
    img2 = Image.new('RGB', size)

    border = 110  # brightness threshold (0-255)

    for x in range(size[0]):
        for y in range(size[1]):
            r, g, b = img.getpixel((x, y))
            # Only pixels that are dark in all three channels keep their color
            if r > border or g > border or b > border:
                r, g, b = 255, 255, 255
            img2.putpixel((x, y), (r, g, b))

    return img2

text = pytesseract.image_to_string(convert('test.jpg'))
lines = text.split("\n")
for line in lines:
    if '<<<' in line:
        print(line)

Hmm, that's not quite right.
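One way to at least detect the stray characters is to check each line against the character set the machine-readable zone is supposed to use. The sketch below assumes the passport (TD3) layout: two lines of 44 characters, made up only of A-Z, 0-9 and '<'. The function names are hypothetical, and the sample line is copied from the test1.png output above.

```python
import re

# Assumption: TD3 passport layout, 44 characters per line, A-Z / 0-9 / '<' only.
MRZ_LINE = re.compile(r"[A-Z0-9<]{44}")
MRZ_CHAR = re.compile(r"[A-Z0-9<]")


def is_plausible_mrz_line(line):
    """True if the line has exactly 44 characters, all from the MRZ character set."""
    return MRZ_LINE.fullmatch(line) is not None


def find_suspect_chars(line):
    """Return (index, character) pairs for characters outside the MRZ character set."""
    return [(i, ch) for i, ch in enumerate(line) if not MRZ_CHAR.fullmatch(ch)]


# Second line of the test1.png result above, which contains a lowercase 'c'.
ocr_line = "5334755143GBR8812049F2509286<<<<<<<<<<c<<<04"
print(is_plausible_mrz_line(ocr_line))
print(find_suspect_chars(ocr_line))
```

This only flags characters that cannot appear in the zone at all. It does not catch misreads that stay inside the valid set, such as the run of K characters in the test2.png result.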

[Speech recognition] Plotting an audio waveform with librosa

First, prepare an mp3 audio file.

Install librosa on Ubuntu
$ pip3 install librosa
$ sudo apt-get install libsndfile1
$ sudo apt install ffmpeg

import librosa
import numpy as np
import matplotlib.pyplot as plt

file_name = "./test.mp3"
y, sr = librosa.load(str(file_name))
time = np.arange(0, len(y)) / sr

plt.plot(time, y)
plt.xlabel("Time(s)")
plt.ylabel("Sound Amplitude")

plt.savefig('image.jpg',dpi=100)

Whoa, I see!

Unit testing in Python

import unittest

def add(a, b):
	return a + b

class TestAdd(unittest.TestCase):

	def test_add(self):
		value1 = 3
		value2 = 5
		expected = 8

		actual = add(value1, value2)
		self.assertEqual(expected, actual)

if __name__ == "__main__":
	unittest.main()

When unittest.main() runs, it finds every class in the target script that inherits from unittest.TestCase, and runs every method of those classes whose name starts with test as a test case.

I see.
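The naming rule can be checked directly by asking the loader which methods it would collect. This is a small sketch using unittest.TestLoader and unittest.TextTestRunner; the extra test method and the helper method are invented here to show what does and does not get picked up.

```python
import unittest


def add(a, b):
    return a + b


class TestAdd(unittest.TestCase):

    def test_add_integers(self):
        self.assertEqual(8, add(3, 5))

    def test_add_strings(self):
        self.assertEqual("ab", add("a", "b"))

    def helper_not_collected(self):
        # The name does not start with "test", so the loader ignores it.
        raise AssertionError("never called by the test runner")


loader = unittest.TestLoader()

# Names of the methods the loader treats as test cases.
names = loader.getTestCaseNames(TestAdd)
print(names)

# Run only this class and inspect the result object.
suite = loader.loadTestsFromTestCase(TestAdd)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Running the suite through a runner object instead of unittest.main() keeps the interpreter alive afterwards, which is handy when trying this in an interactive session.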

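[Python3] Uploading only the last sentence of a saved text file to S3

f.readlines() returns a list with one element per line
f.read() returns the whole file as a single string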
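The difference is easy to see on a small in-memory example. Here io.StringIO stands in for an opened text file, and the helper last_line is a hypothetical name; it uses a one-element deque as one possible way to get the final line without holding every line in memory.

```python
import io
from collections import deque

text = "first line\nsecond line\nlast line\n"

# read() gives one string; readlines() gives a list that keeps each "\n".
whole = io.StringIO(text).read()
lines = io.StringIO(text).readlines()


def last_line(f):
    """Return the last line of an open text file, without its trailing newline."""
    tail = deque(f, maxlen=1)  # iterating a file yields lines; only the final one is kept
    return tail[0].rstrip("\n") if tail else ""


print(type(whole).__name__, type(lines).__name__)
print(lines[-1] == "last line\n")
print(last_line(io.StringIO(text)))
```

For a short daily log, readlines()[-1] is perfectly fine; the deque version only matters when the file grows large.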

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import boto3
import json

# Credentials are left blank on purpose; fill in your own
accesskey = ""
secretkey = ""
region = "ap-northeast-1"

# readlines() returns every line; the last element is the last sentence
with open('2021-11-04.txt', 'r', encoding='UTF-8') as f:
    datalist = f.readlines()
result = datalist[-1]

# Named payload rather than str, which would shadow the built-in type
payload = {
    "text": result,
}

with open("speech.json", "w", encoding="UTF-8") as f:
    json.dump(payload, f, ensure_ascii=False)

s3 = boto3.client('s3', aws_access_key_id=accesskey, aws_secret_access_key=secretkey, region_name=region)

filename = "speech.json"
bucket_name = "speech-dnn"

s3.upload_file(filename, bucket_name, filename, ExtraArgs={'ACL': 'public-read'})
print("upload {0}".format(filename))

{"text": "今日イチの雄叫びとガッツポーズが此処まで聞こえてきます\n"}

Nice. All that's left is setting up cron on the Raspberry Pi.