While we're at it, let's take a look at Apple Japan.
https://www.apple.com/jp/
from bs4 import BeautifulSoup

html = """
<html><body>
<div class="ac-gf-directory-column"><input class="ac-gf-directory-column-section-state" type="checkbox" id="ac-gf-directory-column-section-state-products" />
<div class="ac-gf-directory-column-section">
<label class="ac-gf-directory-column-section-label" for="ac-gf-directory-column-section-state-products">
<h3 class="ac-gf-directory-column-section-title">製品情報と購入</h3>
</label>
<a href="#ac-gf-directory-column-section-state-products" class="ac-gf-directory-column-section-anchor ac-gf-directory-column-section-anchor-open">
<span class="ac-gf-directory-column-section-anchor-label">メニューを開く</span>
</a>
<a href="#" class="ac-gf-directory-column-section-anchor ac-gf-directory-column-section-anchor-close">
<span class="ac-gf-directory-column-section-anchor-label">メニューを閉じる</span>
</a>
<ul class="ac-gf-directory-column-section-list">
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/mac/">Mac</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/ipad/">iPad</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/iphone/">iPhone</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/watch/">Watch</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/tv/">TV</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/music/">Music</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/itunes/">iTunes</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/ipod-touch/">iPod touch</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/shop/goto/buy_accessories">アクセサリ</a></li>
<li class="ac-gf-directory-column-section-item"><a class="ac-gf-directory-column-section-link" href="/jp/shop/goto/giftcards">ギフトカード</a></li>
</ul>
</div>
</div>
</body></html>
"""

soup = BeautifulSoup(html, 'html.parser')
h3 = soup.select_one("div.ac-gf-directory-column-section > label.ac-gf-directory-column-section-label > h3")
print("h3 =", h3.string)
li_list = soup.select("div.ac-gf-directory-column-section > ul.ac-gf-directory-column-section-list > li")
for li in li_list:
    print("li =", li.string)
[vagrant@localhost python]$ python3 app.py
h3 = 製品情報と購入
li = Mac
li = iPad
li = iPhone
li = Watch
li = TV
li = Music
li = iTunes
li = iPod touch
li = アクセサリ
li = ギフトカード
That worked nicely. I'm not quite sure where the iPod is supposed to fit in this lineup, though.
For the h3, I wrote
soup.select_one("div.ac-gf-directory-column-section > label.ac-gf-directory-column-section-label > h3")
but if you write
soup.select_one("div.ac-gf-directory-column-section > h3")
instead, you get an error.
Traceback (most recent call last):
  File "app.py", line 35, in <module>
    print("h3 =", h3.string)
AttributeError: 'NoneType' object has no attribute 'string'
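So apparently the selector has to follow the hierarchy. The > combinator matches only direct children, and the h3 here is a child of the label, not of the div, so select_one finds nothing and returns None.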
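
As a rough sketch of the difference, the snippet below runs both kinds of selector against a trimmed-down copy of the markup. The shortened HTML and the variable names (child, descendant, title) are my own assumptions for illustration and are not part of the original app.py. It suggests that replacing > with a space (the descendant combinator) should let you skip the intermediate label, and that checking for None avoids the AttributeError.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the markup (assumption: same nesting, list items omitted).
html = """
<div class="ac-gf-directory-column-section">
<label class="ac-gf-directory-column-section-label">
<h3 class="ac-gf-directory-column-section-title">製品情報と購入</h3>
</label>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# ">" is the child combinator: the h3 is not a direct child of the div,
# so this finds nothing and select_one returns None.
child = soup.select_one("div.ac-gf-directory-column-section > h3")

# A space is the descendant combinator: it matches an h3 at any depth
# below the div, so the label level can be skipped.
descendant = soup.select_one("div.ac-gf-directory-column-section h3")

print("child =", child)
print("descendant =", descendant.string)

# Guarding against None avoids the AttributeError seen in the traceback.
title = child.string if child is not None else "(not found)"
print("title =", title)
```

Whether to spell out every level with > or to use the looser descendant form is a judgment call: the strict form breaks as soon as the page nests things differently, while the loose form may match more elements than intended.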