python3 datetime

import datetime
print(datetime.date.today())

[vagrant@localhost python]$ python3 time.py
2018-07-28

I want to get rid of the hyphens.

today = datetime.date.today()
print(type(today))

[vagrant@localhost python]$ python time.py
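
One way to drop the hyphens, as a minimal sketch: format the date with strftime instead of printing it directly. The variable names day and compact are only illustrative, and a fixed date is used so the result is reproducible; datetime.date.today() behaves the same way.

```python
import datetime

# Fixed date (the one printed earlier) so the result is reproducible.
day = datetime.date(2018, 7, 28)

# strftime builds the string without any separators.
compact = day.strftime("%Y%m%d")
print(compact)

# Alternative: take the ISO string and strip the hyphens.
stripped = day.isoformat().replace("-", "")
print(stripped)
```

Both lines print 20180728 for this date.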

from datetime import datetime
import re

now = datetime.now()
print(str(now))

[vagrant@localhost python]$ python3 time.py
2018-07-28 16:14:29.145644

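import datetime imports the datetime module itself.
from datetime import hoge imports only the name hoge from the datetime module.
from datetime import datetime is confusing, because the module and the class share the same name.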
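
The difference between the two import forms can be sketched like this. The alias dt is an assumption added here only to keep the module and the class apart in one file; it does not appear in the original code.

```python
import datetime                      # binds the module
from datetime import datetime as dt  # binds the class, under an illustrative alias

# Through the module: module.class.method
today = datetime.date.today()
print(today)

# Through the imported class: class.method
now = dt.now()
print(now)

# Both names lead to the same class object.
print(dt is datetime.datetime)
```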

AMeDAS with python3

import urllib.request

url = "https://www.jma.go.jp/jp/amedas/imgs/temp/000/201807281500-00.png"
savename = "image/amedas.png"

urllib.request.urlretrieve(url, savename)
print("保存しました")

I want to generate the 201807281500 part automatically.
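
A possible way to build that timestamp, as a sketch only. The function name amedas_url and the constant URL_TEMPLATE are hypothetical, and the code assumes the images are published on the hour (as in 201807281500); the image for the current hour may not exist yet on the server.

```python
from datetime import datetime

# URL pattern copied from the example above; only the timestamp varies.
URL_TEMPLATE = "https://www.jma.go.jp/jp/amedas/imgs/temp/000/{stamp}-00.png"


def amedas_url(when):
    """Build the image URL for a datetime, rounded down to the hour."""
    # Assumption: minutes are always 00, as in 201807281500.
    stamp = when.strftime("%Y%m%d%H00")
    return URL_TEMPLATE.format(stamp=stamp)


example = amedas_url(datetime(2018, 7, 28, 15, 42))
print(example)
```

Passing datetime.now() instead of the fixed value would give the URL for the current hour, which could then be handed to urllib.request.urlretrieve as before.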

Making both python2 and python3 available

[vagrant@localhost ~]$ python -V
Python 3.5.2
[vagrant@localhost centos6]$ python
Python 3.5.2 (default, Jul 28 2018, 11:25:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
[vagrant@localhost centos6]$ python3
Python 3.5.2 (default, Jul 28 2018, 11:25:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

That took a while, about two days.

import urllib.request

url = "http://hogehoge/img/hoge.png"
savename = "test.png"

urllib.request.urlretrieve(url, savename)
print("保存しました")

[vagrant@localhost python]$ python3 app.py
保存しました

Ohhh, it worked.
Next, let's fetch a MacBook image from Yahoo! Auctions.

import urllib.request

url = "https://wing-auctions.c.yimg.jp/sim?furl=auctions.c.yimg.jp/images.auctions.yahoo.co.jp/image/dr000/auc0407/users/6ba85c7e48fb6a8607eca71d0b7b7d6113140ce8/i-img1100x983-1532495826yzd18p69701.jpg"
savename = "mac.jpg"  # the source image is a JPEG

urllib.request.urlretrieve(url, savename)
print("mac book")

[vagrant@localhost python]$ python3 app.py
mac book

OK

Hmm, with Yahoo! Auctions I can't really work out how the product image URLs are generated.
img1100x983 is presumably the image size.
6ba85c7e48fb6a8607eca71d0b7b7d6113140ce8
i-img1100x983-1532495826yzd18p69701.jpg

no module named requests

[vagrant@localhost python]$ pip list
Package Version
-------------- ---------
beautifulsoup4 4.6.0
certifi 2018.4.16
chardet 3.0.4
idna 2.7
libxml2-python 2.9.7
pip 18.0
requests 2.19.1
setuptools 38.5.2
urllib3 1.23
wheel 0.30.0
[vagrant@localhost python]$ python app.py
Traceback (most recent call last):
  File "app.py", line 3, in <module>
    import urllib3.requests
ImportError: No module named requests

# -*- coding: utf-8 -*-

import urllib3.requests

url = "http://uta.pw/shodou/img/28/214.png"
savename="test.png"

urllib3.requests.urlretrieve(url, savename)
print("保存しました。")

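Huh? It turns out urllib3 has no requests submodule. urlretrieve lives in urllib.request on Python 3 and in urllib on Python 2.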
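
A sketch of an import that works on either interpreter, assuming the goal is only to call urlretrieve. The helper name save_image is hypothetical, and nothing is downloaded until it is called.

```python
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve          # Python 2


def save_image(url, savename):
    """Download url to the local file savename and return the file name."""
    filename, _headers = urlretrieve(url, savename)
    return filename


# Usage (performs a real download, so it is left commented out here):
# save_image("http://uta.pw/shodou/img/28/214.png", "test.png")
```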

The server starts with SimpleHTTPServer, but

app.py

import SimpleHTTPServer
SimpleHTTPServer.test()

Start the server.
[vagrant@localhost python]$ python app.py
Serving HTTP on 0.0.0.0 port 8000 ...
192.168.33.1 - - [22/Jul/2018 15:47:18] "GET /test.py HTTP/1.1" 200 -

HTML files are displayed.

test.py

# -*- coding: utf-8 -*-

print 'Content-type: text/html\n'
print """
<!DOCTYPE html>
<html>
<head><meta charset="utf8"><title>CGIスクリプト</title></head>
<body>
これはサーバの実行結果として生成されたHTMLです<br>
今日はです
</body></html>
"""

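Why? Is it mojibake? (SimpleHTTPServer serves test.py as a static file and does not execute it, so the browser receives the source text as-is.)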

192.168.33.1 - - [22/Jul/2018 16:58:46] "GET /cgi-bin/ HTTP/1.1" 403 -
192.168.33.1 - - [22/Jul/2018 17:00:00] "GET /cgi-bin/sample.py HTTP/1.1" 200 -
: No such file or directory
192.168.33.1 - - [22/Jul/2018 17:00:00] CGI script exit status 0x7f00
Why? I can't figure this one out.
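
Exit status 0x7f00 corresponds to exit code 127, which usually means the program to run could not be found. One common cause, which the log above does not confirm, is a missing shebang line or Windows (CRLF) line endings on the first line of the script. A small check along those lines, with the hypothetical helper name shebang_problems:

```python
def shebang_problems(script_bytes):
    """Return a list of likely problems with the first line of a CGI script."""
    first_line = script_bytes.split(b"\n", 1)[0]
    problems = []
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line")
    if first_line.endswith(b"\r"):
        problems.append("CRLF line ending on the first line")
    return problems


# Example with in-memory bytes; for a real file, read it in binary mode:
# with open("cgi-bin/sample.py", "rb") as f: print(shebang_problems(f.read()))
print(shebang_problems(b"#!/usr/bin/env python\r\nprint('hi')\r\n"))
```

If the check reports nothing, the cause is something else, such as a missing execute permission.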

Scraping the Nikkei Stock Average

import urllib2
from bs4 import BeautifulSoup

html = urllib2.urlopen("hogehoge")
soup = BeautifulSoup(html, "html.parser")
tag =soup.find("td", class_="hogehoge").string
print(tag.encode("utf-8"))

[vagrant@localhost python]$ python app.py
22,697.88

I also want the US dollar, the New York Dow, and Shanghai.
When I write it with find_all, I get a traceback. Why?

tag =soup.find_all("td", class_="hogehoge").string
print(len(tag))

[vagrant@localhost python]$ python app.py
Traceback (most recent call last):
  File "app.py", line 8, in <module>
    tag =soup.find_all("td", class_="header_shisuu_atai1").string
  File "/home/linuxbrew/.linuxbrew/Cellar/python@2/2.7.14_4/lib/python2.7/site-packages/bs4/element.py", line 1807, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'string'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

soup.find_all("a") works, so the problem seems to be in what I pass to find_all().

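So find and find_all have to be used differently... find_all returns a list (a ResultSet), so .string has to be read from each element, not from the list itself. Getting the list works, though: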
[ 22,697.88 , 111.38 , 25,058.12 , 2,829.27 ]
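
A minimal sketch of reading .string from each element of the find_all result. The inline HTML is made up for illustration and stands in for the real page; the class name hogehoge is the same placeholder used above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<table><tr>
<td class="hogehoge">22,697.88</td>
<td class="hogehoge">111.38</td>
</tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet (a list of tags), so take .string per element.
values = [td.string for td in soup.find_all("td", class_="hogehoge")]

print(len(values))
for value in values:
    print(value)
```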

When "'ascii' codec can't encode characters" is displayed

import urllib2
from bs4 import BeautifulSoup

html = urllib2.urlopen("https://www.monex.co.jp/")
soup = BeautifulSoup(html, "html.parser")
tag =soup.title.string
print(tag)

It said the characters could not be encoded.
[vagrant@localhost python]$ python app.py
Traceback (most recent call last):
  File "app.py", line 9, in <module>
    print(tag)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

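At that point I tried stripping the tags with a regular-expression replace, but that didn't work either, so with my frustration maxed out I slept for a few hours.
When I tried again: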

import urllib2
from bs4 import BeautifulSoup

html = urllib2.urlopen("https://www.monex.co.jp/")
soup = BeautifulSoup(html, "html.parser")
tag =soup.title.string
print(tag.encode("utf-8"))

Oh, you just have to specify the encoding.
[vagrant@localhost python]$ python app.py
マネックス証券 | ネット証券(株・アメリカ株・投資信託)

Taking a break usually does the trick.
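
Why encode("utf-8") helps, as a small sketch that runs without any network access. On Python 2, printing a unicode string encodes it with the output encoding, and when that falls back to ASCII (probably the case in this environment) the Japanese characters cannot be represented. The variable names are illustrative.

```python
# -*- coding: utf-8 -*-

title = u"マネックス証券"

# ASCII has no representation for these characters, so encoding fails.
try:
    title.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    print("ascii failed: " + err.reason)

# UTF-8 can represent them; each of these characters takes 3 bytes.
encoded = title.encode("utf-8")
print(len(title))
print(len(encoded))
```

The title has 7 characters, which matches "position 0-6" in the error message above.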

Installing Beautiful Soup 4

Install beautifulsoup4 with pip.

[vagrant@localhost python]$ pip install beautifulsoup4
Collecting beautifulsoup4
Downloading https://files.pythonhosted.org/packages/a6/29/bcbd41a916ad3faf517780a0af7d0254e8d6722ff6414723eedba4334531/beautifulsoup4-4.6.0-py2-none-any.whl (86kB)
100% |################################| 92kB 186kB/s
Installing collected packages: beautifulsoup4
Successfully installed beautifulsoup4-4.6.0
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Ohhh, looks like it installed.

Hm, apparently I wrote Python 3 style code.
python app.py
Traceback (most recent call last):
  File "app.py", line 3, in <module>
    import requests, bs4
ImportError: No module named requests

Let me redo it.

import urllib2
from bs4 import BeautifulSoup

html = urllib2.urlopen("https://www.monex.co.jp/")
soup = BeautifulSoup(html)

[vagrant@localhost python]$ python app.py
/home/linuxbrew/.linuxbrew/Cellar/python@2/2.7.14_4/lib/python2.7/site-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 7 of the file app.py. To get rid of this warning, change code that looks like this:

BeautifulSoup(YOUR_MARKUP})

to this:

BeautifulSoup(YOUR_MARKUP, "html.parser")

markup_type=markup_type))

Looks like "html.parser" has to be specified.

soup = BeautifulSoup(html, "html.parser")

That works.

Now let's look at Monex (https://www.monex.co.jp/), a favorite of our company president.

import urllib2
from bs4 import BeautifulSoup

html = urllib2.urlopen("https://www.monex.co.jp/")
soup = BeautifulSoup(html, "html.parser")

tag = soup.find("title")
print(tag)

Ohhhhh,
the scraping actually works!

[vagrant@localhost python]$ python app.py
<title>マネックス証券 | ネット証券(株・アメリカ株・投資信託)</title>

import SimpleHTTPServer

Import SimpleHTTPServer.

[vagrant@localhost python]$ python
Python 2.7.14 (default, Mar 12 2018, 22:03:33)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import SimpleHTTPServer
>>> SimpleHTTPServer.test()
Serving HTTP on 0.0.0.0 port 8000 ...
192.168.33.1 - - [21/Jul/2018 22:28:14] "GET / HTTP/1.1" 200 -
192.168.33.1 - - [21/Jul/2018 22:28:14] code 404, message File not found
192.168.33.1 - - [21/Jul/2018 22:28:14] "GET /favicon.ico HTTP/1.1" 404 -
192.168.33.1 - - [21/Jul/2018 22:28:18] "GET /app.py HTTP/1.1" 200 -

Hm? What's going on here?

Getting used to python 9: classes

class Yamanote:
	pass

shinagawa = Yamanote()
shinagawa.city = "Minato-ku"
shinagawa.user = 370000
shinagawa.line = 24

shibuya = Yamanote()
shibuya.city = "Shibuya-ku"
shibuya.user = 3310000
shibuya.spot = "hachiko"

print(shinagawa.city)
print(shibuya.spot)

[vagrant@localhost python]$ python app.py
Minato-ku
hachiko

Use a constructor.

class Yamanote:
	def __init__(self, city):
		self.city = city

shinagawa = Yamanote("Shinagawa-ku")
shibuya = Yamanote("Shibuya-ku")

print(shinagawa.city)
print(shibuya.city)

[vagrant@localhost python]$ python app.py
Shinagawa-ku
Shibuya-ku

Access a class variable.

class Yamanote:
	count = 0
	def __init__(self, city):
		Yamanote.count += 1
		self.city = city

shinagawa = Yamanote("Shinagawa-ku")
shibuya = Yamanote("Shibuya-ku")
print(Yamanote.count)

I see.
[vagrant@localhost python]$ python app.py
2

Methods

class Yamanote:
	count = 0
	def __init__(self, city):
		Yamanote.count += 1
		self.city = city
	def announce(self):
		print("This is " + self.city)

shinagawa = Yamanote("Shinagawa-ku")
shibuya = Yamanote("Shibuya-ku")

shinagawa.announce()
shibuya.announce()

I get it, but I won't get used to it unless I actually keep using it.
[vagrant@localhost python]$ python app.py
This is Shinagawa-ku
This is Shibuya-ku

@classmethod

class Yamanote:
	count = 0
	def __init__(self, city):
		Yamanote.count += 1
		self.city = city
	def announce(self):
		print("This is " + self.city)
	@classmethod
	def show_info(cls):
		print(str(cls.count) + "instances")

shinagawa = Yamanote("Shinagawa-ku")
shibuya = Yamanote("Shibuya-ku")

Yamanote.show_info()

Ahhh, I see.
[vagrant@localhost python]$ python app.py
2instances

Private and public in classes

class Yamanote:
	def __init__(self, city):
		self.__city = city
	def announce(self):
		print("This is " + self.__city)

shinagawa = Yamanote("Shinagawa-ku")
shibuya = Yamanote("Shibuya-ku")

print(shinagawa.__city)

[vagrant@localhost python]$ python app.py
Traceback (most recent call last):
  File "app.py", line 12, in <module>
    print(shinagawa.__city)
AttributeError: Yamanote instance has no attribute '__city'
Whoa.
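
What happens behind the scenes, as a short sketch: an attribute whose name starts with two underscores is stored under a mangled name, _ClassName__attr. So it is hidden rather than truly private. The variable names below are illustrative.

```python
class Yamanote(object):
    def __init__(self, city):
        self.__city = city  # actually stored as _Yamanote__city

    def announce(self):
        return "This is " + self.__city  # inside the class the short name works


shinagawa = Yamanote("Shinagawa-ku")

# Outside the class the plain name does not exist...
has_plain_name = hasattr(shinagawa, "__city")
print(has_plain_name)

# ...but the mangled name is still reachable.
mangled_value = shinagawa._Yamanote__city
print(mangled_value)
```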

super() in inheritance doesn't work.

class Yamanote:
	def __init__(self, spot):
		self.spot = spot
	def announce(self):
		print("Enjoy " + self.spot)

class Startup(Yamanote):
	def __init__(self, spot, company):
		super().__init__(spot)
		self.company = company
	def hello(self):
		print("What's up " + self.company)

harajyuku = Startup("takeshita","sm")
print(harajyuku.spot)
harajyuku.hello()

[vagrant@localhost python]$ python app.py
Traceback (most recent call last):
  File "app.py", line 16, in <module>
    harajyuku = Startup("takeshita","sm")
  File "app.py", line 11, in __init__
    super().__init__(spot)
TypeError: super() takes at least 1 argument (0 given)
Ah, so this is an error on Python 2. You could have told me sooner!
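
A form that should run on both Python 2 and Python 3, as a sketch: pass the class and the instance to super() explicitly, and make the base class inherit from object, since super() on Python 2 only works with new-style classes. The methods return strings here instead of printing, purely to keep the example easy to check.

```python
class Yamanote(object):  # inheriting from object makes this a new-style class on Python 2
    def __init__(self, spot):
        self.spot = spot

    def announce(self):
        return "Enjoy " + self.spot


class Startup(Yamanote):
    def __init__(self, spot, company):
        # Explicit two-argument form, accepted by both Python 2 and Python 3.
        super(Startup, self).__init__(spot)
        self.company = company

    def hello(self):
        return "What's up " + self.company


harajyuku = Startup("takeshita", "sm")
print(harajyuku.spot)
print(harajyuku.hello())
```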

import math, random
print(math.pi)