algorithm – Page 3 – Software engineer tech blog

Implementing "Customers who bought this item also bought", part 2

We have order data for each customer.

$orders = [
	'yamada' => ['a','b','c'],
	'tanaka' => ['b','f'],
	'sato' => ['a','g'],
	'goto' => ['e','f'],
];

foreach($orders as $order => $products){
	foreach($products as $product){
		echo $product;
	}
	echo "<br>";
}

abc
bf
ag
ef

With the following, each product gets an array of the products that were bought together with it.

$orders = [
	'yamada' => ['a','b','c'],
	'tanaka' => ['b','f'],
	'sato' => ['a','g','b'],
	'goto' => ['e','f'],
];

$data = [];
foreach($orders as $order => $products){
	foreach($products as $product){
		// every other product in the same order
		$togethers = array_diff($products, array($product));
		foreach($togethers as $together){
			// isset() avoids an undefined-index notice on the first visit
			if(isset($data[$product])){
				$data[$product] = array_merge($data[$product], array($together => 1));
			} else {
				$data[$product] = array($together => 1);
			}
		}
	}
}
var_dump($data);

array(6) {
  ["a"]=>
  array(3) {
    ["b"]=>
    int(1)
    ["c"]=>
    int(1)
    ["g"]=>
    int(1)
  }
  ["b"]=>
  array(4) {
    ["a"]=>
    int(1)
    ["c"]=>
    int(1)
    ["f"]=>
    int(1)
    ["g"]=>
    int(1)
  }
  ["c"]=>
  array(2) {
    ["a"]=>
    int(1)
    ["b"]=>
    int(1)
  }
  ["f"]=>
  array(2) {
    ["b"]=>
    int(1)
    ["e"]=>
    int(1)
  }
  ["g"]=>
  array(2) {
    ["a"]=>
    int(1)
    ["b"]=>
    int(1)
  }
  ["e"]=>
  array(1) {
    ["f"]=>
    int(1)
  }
}

Next I want to add up the values per key so that I also get how many times each pair was bought together.
Almost there.
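The counting step could be sketched as below. This is only one possible approach, written in Python rather than PHP, and the function name `count_together` is made up for illustration. It increments a counter for every ordered pair of products that appear in the same order, instead of overwriting the value with 1.

```python
from collections import defaultdict
from itertools import permutations

orders = {
    "yamada": ["a", "b", "c"],
    "tanaka": ["b", "f"],
    "sato": ["a", "g", "b"],
    "goto": ["e", "f"],
}

def count_together(orders):
    """Hypothetical helper: counts[x][y] = number of orders containing both x and y."""
    counts = defaultdict(lambda: defaultdict(int))
    for products in orders.values():
        # sorted(set(...)) so a product repeated inside one order is counted once
        for product, together in permutations(sorted(set(products)), 2):
            counts[product][together] += 1
    return counts

counts = count_together(orders)
print(dict(counts["a"]))  # {'b': 2, 'c': 1, 'g': 1}
```

Here "a" and "b" appear together in both yamada's and sato's orders, so their count is 2. Sorting each inner dictionary by value would then give the "also bought" ranking.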

Building a contiguous array of YYYYMM values covering a range of date records

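Suppose MySQL holds records such as '2019-10-05', '2020-01-05' and '2020-02-13', and you want a contiguous array of the YYYYMM values within that range (2019-10, 2019-11, 2019-12, 2020-01, 2020-02).
-> Write a for loop that advances one month at a time between the Unix times of the first and last records.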

        // earliest record -> first day of its month
        $last = Hoge::where('user_id', $id)->orderBy('time','ASC')->first();
        $last_month = substr($last['time'], 0, 7);
        $last_unix = strtotime('first day of ' . $last_month);
        // most recent record -> last day of its month
        $latest = Hoge::where('user_id', $id)->orderBy('time','DESC')->first();
        $latest_month = substr($latest['time'], 0, 7);
        $latest_unix = strtotime('last day of ' . $latest_month);

        // step one month at a time and collect YYYYMM strings
        $Ym = [];
        for($time = $last_unix; $time <= $latest_unix; $time = strtotime('+1 month', $time)){
            $Ym[] = date('Ym', $time);
        }

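If you convert the fetched dates to Unix time as they are and add one month at a time, the stepped value can pass the latest record before that record's month has been added. For example, stepping from 2019-10-05 reaches 2020-02-05, which is already later than a latest record of 2020-02-03, so February would be dropped. That is why $last_month has to be converted to the first day of its month and $latest_month to the last day of its month before running the for loop.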
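As a cross-check of the same idea, here is a minimal Python sketch. It is not the PHP above; the function name `month_range` and the sample dates are assumptions for illustration. It compares (year, month) pairs directly, so the day of the month never enters the comparison.

```python
from datetime import date

def month_range(first, last):
    """Hypothetical helper: every 'YYYYMM' from first's month to last's month, inclusive."""
    months = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        months.append(f"{year:04d}{month:02d}")
        month += 1
        if month > 12:          # roll over into the next year
            year, month = year + 1, 1
    return months

records = [date(2019, 10, 5), date(2020, 1, 5), date(2020, 2, 13)]
print(month_range(min(records), max(records)))
# ['201910', '201911', '201912', '202001', '202002']
```

Because only the year and month are compared, a start date late in the month and an end date early in the month still produce the full range.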

Hmm, I need to get to the point where, when first choosing DB column types, I pick them with the calculation logic in mind.

Exclusive control

Exclusive control is used when a running program has resources shared by multiple processes and simultaneous access to them would cause a conflict. One process is given exclusive use of the resource while the others are kept from using it, so that consistency is maintained. It is also called mutual exclusion. The case where up to k processes may access the shared resource at the same time is called k-mutual exclusion.
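A small Python sketch can make both terms concrete. It uses threads rather than separate processes, and every name in it (`increment`, `use_limited_resource`, `run`) is made up for illustration. A `threading.Lock` gives plain mutual exclusion, and a `threading.BoundedSemaphore(2)` gives k-mutual exclusion with k = 2.

```python
import threading
import time

counter = 0
counter_lock = threading.Lock()         # mutual exclusion: one thread at a time
slots = threading.BoundedSemaphore(2)   # k-mutual exclusion with k = 2
active = 0
max_active = 0

def increment(times):
    global counter
    for _ in range(times):
        with counter_lock:               # other threads wait here
            counter += 1

def use_limited_resource():
    global active, max_active
    with slots:                          # at most 2 threads get past this line
        with counter_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)                 # pretend to use the resource
        with counter_lock:
            active -= 1

def run(target, args, n):
    threads = [threading.Thread(target=target, args=args) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run(increment, (10000,), 4)
run(use_limited_resource, (), 6)
print(counter)      # 40000
print(max_active)   # never more than 2
```

Without the lock, the read-modify-write in `counter += 1` could interleave between threads and lose updates, which is the kind of inconsistency exclusive control is meant to prevent.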

Thinking through a martingale algorithm

What is martingale?

It is a technique in which you increase the position size (lot) each time you add an entry.

Example
1 contract (1 lot)
↓
2 contracts (2 lots)
↓
4 contracts (4 lots)
↓
8 contracts (8 lots)
↓

In this way, the size is doubled with every entry.

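Suppose you take the first position at 100 yen and the price then falls.
When you take a position at 98 yen, you enter with double the size, 2 contracts.

That gives
(100 yen + 98 yen + 98 yen) / 3 = 98.67 yen on average,
so the break-even line above which you are in profit comes down to that level.
With plain averaging down ("nanpin"), one contract at each price, the average would be 99 yen, so martingale lets you aim to take profit sooner.

So it can be described as a technique where, once you hold a position, you do not give up until it turns a profit.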

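In other words, (1) you enter when the price moves against your position.
So you need to set how far the price must move against you before entering.
With nanpin you simply enter again, but with martingale the entry lot grows. So (2) each additional entry has to use double the lot of the previous entry.

(1) and (2) need to be reflected in the code.
Hmm... this takes more thought than I expected.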
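One way (1) and (2) might be sketched in Python is shown below. This is a rough illustration for a long position only. The function names, the `step` parameter (how far the price must fall from the last entry) and the choice to double the previous lot are assumptions, not a tested trading rule.

```python
def martingale_entries(prices, step, base_lot=1):
    """Hypothetical sketch: return the (price, lot) entries for a long position."""
    entries = []
    lot = base_lot
    for price in prices:
        if not entries:
            entries.append((price, lot))          # first position
        elif price <= entries[-1][0] - step:      # (1) moved against us by `step`
            lot *= 2                              # (2) double the previous lot
            entries.append((price, lot))
    return entries

def average_price(entries):
    """Lot-weighted average entry price, i.e. the break-even line."""
    total_lots = sum(lot for _, lot in entries)
    return sum(price * lot for price, lot in entries) / total_lots

entries = martingale_entries([100, 99, 98], step=2)
print(entries)                           # [(100, 1), (98, 2)]
print(round(average_price(entries), 2))  # 98.67
```

With the prices from the example above, the second entry at 98 yen uses 2 lots and the break-even line comes out at 98.67 yen.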

PHP Paging algorithm

Make the number of items displayed on one page variable.
First, prepare an array with a number of elements.

$coin = ["Bitcoin", "Ethereum", "Ripple", "Litecoin", "Ethereum Classic", "DASH", "Bitshares", "Monero", "NEM", "Zcash", "Bitcoin Cash", "Bitcoin Gold", "Monacoin", "Lisk", "Factom"];

$array_num = count($coin);
$page = 2;
$number = 4;

$i = 0;
foreach($coin as $value){
	if(($number * ($page - 1) - 1 < $i) && ($i < $number * $page)){
		echo $value . "<br>";
	}
	$i++;
}

-> Ethereum Classic
DASH
Bitshares
Monero

After that, the variables $page and $number just need to be changed with buttons.
It was easier than I thought♪
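For comparison, the same paging rule can be written with list slicing. This is a small Python sketch rather than the PHP above, and the names `paginate` and `page_count` are made up for illustration.

```python
import math

coins = ["Bitcoin", "Ethereum", "Ripple", "Litecoin", "Ethereum Classic",
         "DASH", "Bitshares", "Monero", "NEM", "Zcash", "Bitcoin Cash",
         "Bitcoin Gold", "Monacoin", "Lisk", "Factom"]

def paginate(items, page, per_page):
    """Hypothetical helper: items shown on the 1-based page number `page`."""
    start = per_page * (page - 1)
    return items[start:start + per_page]

def page_count(items, per_page):
    """Hypothetical helper: number of pages needed to show every item."""
    return math.ceil(len(items) / per_page)

print(paginate(coins, page=2, per_page=4))
# ['Ethereum Classic', 'DASH', 'Bitshares', 'Monero']
print(page_count(coins, 4))  # 4
```

Slicing past the end of the list simply returns fewer items, so the last page needs no special case.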

Population density

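import csv
import math

def is_number(v):
	# Assumed helper (not shown in the original): True if v parses as a float.
	try:
		float(v)
		return True
	except (TypeError, ValueError):
		return False

def ensure_float(v):
	# Returns None for non-numeric values such as "NULL".
	if is_number(v):
		return float(v)

def skip_lines(input_file, skip):
	# Assumed helper (not shown in the original): discard the first `skip` data rows.
	for _ in range(skip):
		next(input_file)

def audit_population_density(input_file):
	for row in input_file:
		population = ensure_float(row['populationTotal'])
		area = ensure_float(row['areaLand'])
		population_density = ensure_float(row['populationDensity'])
		if population and area and population_density:
			calculated_density = population / area
			if math.fabs(calculated_density - population_density) > 10:
				print("Possibly bad population density for", row['name'])

if __name__ == '__main__':
	input_file = csv.DictReader(open("cities.csv"))
	skip_lines(input_file, 3)
	audit_population_density(input_file)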

Using blue print

import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
from collections import defaultdict
import re

osm_file = open("chicago_abbrev.osm", "r")

street_type_re = re.compile(r'\S+\.?$', re.IGNORECASE)
street_types = defaultdict(int)

def audit_street_type(street_types, street_name):
	m = street_type_re.search(street_name)
	if m:
		street_type = m.group()
		street_types[street_type] += 1

def print_sorted_dict(d):
	# print each street type with its count, sorted case-insensitively
	keys = sorted(d.keys(), key=lambda s: s.lower())
	for k in keys:
		print("%s: %d" % (k, d[k]))

def is_street_name(elem):
	return (elem.tag == "tag") and (elem.attrib['k'] == "addr:street")

def audit():
	for event, elem in ET.iterparse(osm_file):
		if is_street_name(elem):
			audit_street_type(street_types, elem.attrib['v'])
	print_sorted_dict(street_types)

if __name__ == '__main__':
	audit()

Scraping solution

import requests
from bs4 import BeautifulSoup

s = requests.Session()

r = s.get("http://www.transtats.bts.gov/Data_Elements.aspx?Data=2")
soup = BeautifulSoup(r.text, "html.parser")
viewstate_element = soup.find(id="__VIEWSTATE")
viewstate = viewstate_element["value"]
eventvalidation_element = soup.find(id="__EVENTVALIDATION")
eventvalidation = eventvalidation_element["value"]

r = s.post("http://www.transtats.bts.gov/Data_Elements.aspx?Data=2",
	data={'AirportList' : "BOS",
		'CarrierList' : "VX",
		'Submit' : "Submit",
		'__EVENTTARGET' : "",
		'__EVENTVALIDATION' : eventvalidation,
		'__VIEWSTATE' : viewstate})

with open("virgin_and_logan_airport.html", "w") as f:
	f.write(r.text)

Parsing XML

import xml.etree.ElementTree as ET
import pprint

tree = ET.parse('exampleResearchArticle.xml')
root = tree.getroot()

print("\nChildren of root:")
for child in root:
	print(child.tag)

import xml.etree.ElementTree as ET
import pprint

tree = ET.parse('exampleResearchArticle.xml')
root = tree.getroot()

title = root.find('./fm/bibl/title')
title_text = ""
for p in title:
	title_text += p.text
print("\nTitle:\n", title_text)

print("\nAuthor email addresses:")
for a in root.findall('./fm/bibl/aug/au'):
	email = a.find('email')
	if email is not None:
		print(email.text)

XLRD

#!/usr/bin/env python

import xlrd
from zipfile import ZipFile
datafile = "2013_ERCOT_Hourly_Load_Data.xls"

def open_zip(datafile):
	with ZipFile('{0}.zip'.format(datafile),'r') as myzip:
		myzip.extractall()

def parse_file(datafile):
	workbook = xlrd.open_workbook(datafile)
	sheet = workbook.sheet_by_index(0)

	data = [[sheet.cell_value(r, col)
			for col in range(sheet.ncols)]
				for r in range(sheet.nrows)]

	cv = sheet.col_values(1, start_rowx=1, end_rowx=None)

	maxval = max(cv)
	minval = min(cv)

	maxpos = cv.index(maxval) + 1
	minpos = cv.index(minval) + 1

	maxtime = sheet.cell_value(maxpos, 0)
	realtime = xlrd.xldate_as_tuple(maxtime, 0)
	mintime = sheet.cell_value(minpos, 0)
	realmintime = xlrd.xldate_as_tuple(mintime, 0)

	# fill in the values computed above instead of placeholder zeros
	data = {
		'maxtime': realtime,
		'maxvalue': maxval,
		'mintime': realmintime,
		'minvalue': minval,
		'avgcoast': sum(cv) / float(len(cv))
	}
	return data

def test():
	open_zip(datafile)
	data = parse_file(datafile)

	assert data['maxtime'] == (2013, 8, 13, 17, 0, 0)
	assert round(data['maxvalue'], 10) == round(18779.02551, 10)