2014-09-29

Specifying the stacking order in a matplotlib polar chart

Python

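Stack Overflow covers the stacking order of bar charts and legends, but the stacking order in a polar chart can be a little confusing if you are not used to it, so here is a note.

In matplotlib, the stacking order is set with zorder.

Suppose you want to draw something like this.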
f:id:CORDEA:20140929143312p:plain,h200

If the stacking order is not set properly, you can end up with this instead.
f:id:CORDEA:20140929143632p:plain,h200

Artists given a larger value with set_zorder are drawn on top.

sample

In this case the bars are plotted from the inside out, so zorder is assigned in descending order: 10, 9, 8, 7, 6, ...

import numpy as np
import matplotlib.pyplot as plt

theta = np.zeros(10)
radii = np.arange(2.0, 22.0, 2.0)
width = np.pi * 2

ax = plt.subplot(111, polar=True)
bars = ax.bar(theta, radii, width=width, bottom=0.0)

for c, bar in enumerate(bars):
    # Set zorder here: inner (shorter) bars get larger values so they stay on top
    bar.set_zorder(10 - c)
    # Alternate black and white rings
    bar.set_facecolor('#000000' if c % 2 == 0 else '#ffffff')
    bar.set_alpha(1.0)

plt.show()

As an aside, to save a matplotlib plot as a PNG with a transparent background, pass transparent=True to savefig.

plt.savefig('***.png', transparent=True)
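
To check that the saved image really is transparent, something like the following sketch may help. It is an illustration rather than part of the original sample: it assumes the non-interactive Agg backend, writes to an in-memory buffer instead of a file, and the figure size and single bar are arbitrary choices.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

fig = plt.figure(figsize=(2, 2))
ax = fig.add_subplot(111, polar=True)
ax.bar([0.0], [1.0], width=0.5)

# Save to an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
fig.savefig(buf, format="png", transparent=True)
plt.close(fig)

# Read the PNG back and look at the alpha channel of the top-left pixel,
# which lies outside the polar axes.
buf.seek(0)
pixels = mpimg.imread(buf, format="png")
corner_alpha = float(pixels[0, 0, 3])
```

An alpha value of 0 in the corner means the figure background was saved as transparent.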

References

<a href="http://stackoverflow.com/questions/16770049/strange-matplotlib-zorder-behavior-with-legend-and-errorbar">strange matplotlib zorder behavior with legend and errorbar</a>
<a href="http://stackoverflow.com/questions/22019789/matplotlib-zorder-of-elements-in-polar-plot-superimposed-on-cartesian-plot">matplotlib zorder of elements in polar plot superimposed on cartesian plot</a>
artists — Matplotlib 1.4.0 documentation

2014-09-23

A logo-like image made with matplotlib and pi

Python

While reading a Wolfram|Alpha blog post, I suddenly felt like making something with pi, so I drew this with matplotlib.

The result looks like this.

f:id:CORDEA:20140923181237p:plain,h200

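The distance from the center grows with the digit position, and the circle grows with the digit value (0-9).
The circles are also scaled up as the position increases, because I did not like how the blank space grew toward the outside.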
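
As a worked example of that mapping, the sketch below applies the same formulas as the code in this post to the first 20 digits, using only the standard library. The variable names are illustrative.

```python
import math

digits = "14159265358979323846"  # first 20 decimal digits of pi

points = []
for position, ch in enumerate(digits, start=1):
    digit = int(ch)
    r = 2 * position                          # radius grows with the digit position
    theta = math.pi * position / 10           # 20 positions make one full turn
    area = 120 * digit * (position * 0.012)   # marker area scales with digit and position
    points.append((r, theta, area))

first_r, first_theta, first_area = points[0]  # digit 1 at position 1
last_r, last_theta, last_area = points[-1]    # digit 6 at position 20
```

Since the area is proportional to the digit value, a digit of 0 yields a marker with zero area.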

It came out as a nice circular shape, so I am thinking of using it as a logo.

Code

It is also available as a Gist.

import numpy as np
import matplotlib.pyplot as plt

pi="14159265358979323846264338327950288419716939937510582097494459230781640628620899862803482534211706798214"

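# Digit values and their 1-based positions after the decimal point
lst = np.array([int(d) for d in pi])
ind = np.arange(1, len(pi) + 1)

r      = 2 * ind                      # distance from the center grows with position
theta  = (np.pi * ind) / 10           # 20 digits per full turn
area   = 120 * lst * (ind * 0.012)    # marker size grows with digit value and position
colors = theta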

ax = plt.subplot(111, polar=True)
c  = plt.scatter(theta, r, c=colors, s=area, cmap=plt.cm.hsv, edgecolors='#636363')
c.set_alpha(0.75)

ax.spines['polar'].set_visible(False)
ax.axes.get_xaxis().set_visible(False)
ax.axes.get_yaxis().set_visible(False)

plt.show()

References

Introducing Tweet-a-Program—Wolfram|Alpha Blog
Pi - Wolfram|Alpha
pie_and_polar_charts example code: polar_scatter_demo.py — Matplotlib 1.4.0 documentation
Adding new scales and projections to matplotlib — Matplotlib 1.4.0 documentation
python - turn off axis border for polar matplotlib plot - Stack Overflow
python - Hiding axis text in matplotlib plots - Stack Overflow
Python scatter plot. Size and style of the marker - Stack Overflow

2014-08-19

Introducing CodernityDB

Database Python

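This time I would like to introduce CodernityDB, a database written in pure Python.
There did not seem to be any introduction in Japanese, so here is a brief one.
I considered writing this up as proper documentation, but I could not organize it that well...
If anything here is wrong, I would appreciate it if you pointed it out.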

What is CodernityDB?

CodernityDB is a NoSQL database with the following characteristics:

Open source
Native Python database
Fast
Multi-platform
Schemaless
Multiple indexes

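About indexes

CodernityDB currently implements two main kinds of index.

Hash Index

Advantages

    Fast

Disadvantages

    Records are not kept in insert/update/delete order.
    The only supported operations are querying a specific key or iterating over all keys.

B Plus Tree Index

Advantages

    Records are kept in order (determined by the key).
    Range queries are supported.

Disadvantages

    Slower than hash-based indexes.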
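
The trade-off can be illustrated with plain Python data structures. The sketch below is only an analogy written for Python 3 with the standard library: a dict stands in for a hash index and a sorted key list with bisect stands in for a tree index. It does not use CodernityDB's API or reflect its implementation, and the sample records are made up.

```python
import bisect

records = {17: "q", 3: "a", 42: "z", 8: "d", 25: "m"}

# Hash-style access: exact-key lookup (or a full iteration) only.
exact = records[42]

# Tree-style access: keep the keys sorted, then cut out a range with bisect.
sorted_keys = sorted(records)

def range_query(low, high):
    """Return (key, value) pairs with low <= key <= high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    stop = bisect.bisect_right(sorted_keys, high)
    return [(k, records[k]) for k in sorted_keys[start:stop]]

in_range = range_query(5, 30)
```

Keeping keys ordered is what makes the range query possible, and maintaining that order is also where the extra cost comes from.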

About formats

CodernityDB supports many formats.

See Key-format in the official documentation and Format characters in the Python documentation.
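
Because key_format strings are struct format strings, the standard struct module can show how many bytes each one occupies. The sketch below is written for Python 3; the list of formats is just a selection of those that appear in this post.

```python
import struct

# '<' selects little-endian byte order with standard sizes and no padding.
sizes = {fmt: struct.calcsize("<" + fmt) for fmt in ("c", "I", "Q", "32s")}

# A composite format is the sum of its parts: 32 + 8 + 8 + 4 + 1 + 8 bytes.
composite_size = struct.calcsize("<32s8sQIcQ")

# Packing the integer 1 as 'I' gives four bytes, least significant first.
packed = struct.pack("<I", 1)
```

'I' is a 4-byte unsigned integer and 'Q' an 8-byte one, which is why switching between them changes how large a value an index can store.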

Basic usage

Using CodernityDB is very simple.

Below is simple code for five operations: Insert, Count, Update, Delete, and Get.

For operations not covered here, or for more detail, see the official Quick tutorial.

Insert(simple)

from CodernityDB.database import Database

db = Database('/tmp/tut1')
db.create()

insertDict = {'x': 1}
print db.insert(insertDict)

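This is the simplest form of insert. The _id field is generated automatically, and because no other index is defined, you cannot search for a specific record.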

Insert

from CodernityDB.database import Database
from CodernityDB.hash_index import HashIndex

class WithXIndex(HashIndex):
    def __init__(self, *args, **kwargs):
        kwargs['key_format'] = 'I'
        super(WithXIndex, self).__init__(*args, **kwargs)

    def make_key_value(self, data):
        a_val = data.get("x")
        if a_val is not None:
            return a_val, None
        return None

    def make_key(self, key):
        return key

db = Database('/tmp/tut2')
db.create()

x_ind = WithXIndex(db.path, 'x')
db.add_index(x_ind)

print db.insert({'x': 1})

Count

from CodernityDB.database import Database

db = Database('/tmp/tut2')  # the 'x' index was added to tut2, not tut1
db.open()

print db.count(db.all, 'x')

Get

from CodernityDB.database import Database

db = Database('/tmp/tut2')
db.open()

print db.get('x', 1, with_doc=True)

Delete

from CodernityDB.database import Database

db = Database('/tmp/tut2')
db.open()

curr = db.get('x', 1, with_doc=True)
doc  = curr['doc']

db.delete(doc)

Update

from CodernityDB.database import Database

db = Database('/tmp/tut2')
db.open()  # the database already exists, so open it instead of creating it

curr = db.get('x', 1, with_doc=True)
doc  = curr['doc']

doc['Updated'] = True
db.update(doc)

Tips

error

raise IndexConflict("Already exists")

This error occurs in db.create or db.add_index when the target already exists.

struct.error: 'I' format requires 0 <= number <= 4294967295

This error occurs when an index grows beyond 4 GB.
When it happens, you need to change the format to 'Q', change the index, or take a similar measure.
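
The limit can be reproduced with the struct module alone. The sketch below is written for Python 3 and is independent of CodernityDB; the helper name fits is made up for illustration.

```python
import struct

LIMIT_I = 2 ** 32 - 1   # largest value an unsigned 32-bit 'I' can hold

def fits(fmt, value):
    """Return True if value can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False

ok_at_limit = fits("<I", LIMIT_I)       # exactly at the limit
overflow_I = fits("<I", LIMIT_I + 1)    # one past the limit
same_in_Q = fits("<Q", LIMIT_I + 1)     # 'Q' holds 64-bit values
```

A value that overflows 'I' packs without trouble as 'Q', which is why changing the format is one way out.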

If db.path/id_buck exceeds 4 GB, it is not an index you created that is over 4 GB but the main index that CodernityDB creates by default. In that case, passing with_id_index=False to db.create may solve the problem. This option prevents the main index from being created.

However, if you specify with_id_index=False, you must either create the id index yourself with UniqueHashIndex using format 'Q', or shard it with Sharded indexes. When a Sharded Index uses format 'I', 10 shards can hold an index of 10 * 4 GB. The number of shards is set with sh_nums.
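
The capacity arithmetic can be sketched as follows. This is a rough illustration only: the post does not describe how CodernityDB assigns keys to shards, so pick_shard is a hypothetical modulo rule used purely to show the idea of spreading keys across shards.

```python
SHARD_LIMIT_BYTES = 4 * 1024 ** 3   # roughly 4 GB per shard with format 'I'
SH_NUMS = 10                        # number of shards

total_capacity_bytes = SH_NUMS * SHARD_LIMIT_BYTES

def pick_shard(key, sh_nums=SH_NUMS):
    """Hypothetical routing rule (an assumption): spread integer keys by modulo."""
    return key % sh_nums

placement = [pick_shard(k) for k in range(12)]
```

Each shard stays under its own limit while the index as a whole can grow to about ten times that size.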

Example using UniqueHashIndex

from CodernityDB.database import Database
from CodernityDB.hash_index import HashIndex, UniqueHashIndex

class BigIDIndex(UniqueHashIndex):
    def __init__(self, *args, **kwargs):
        kwargs['key_format'] = '<32s8sQIcQ'
        super(BigIDIndex, self).__init__(*args, **kwargs)

class MyIDIndex(HashIndex):
    def __init__(self, *args, **kwargs):
        kwargs['key_format'] = 'Q'
        super(MyIDIndex, self).__init__(*args, **kwargs)

    def make_key_value(self, data):
        a_val = data.get("x")
        if a_val is not None:
            return a_val, None
        return None

    def make_key(self, key):
        return key

db = Database('/tmp/tut1')
db.create(with_id_index=False)

db.add_index(BigIDIndex(db.path, 'id'))
db.add_index(MyIDIndex(db.path, 'x'))

db.insert({'x': 1})

I have seen ValueError: bad marshal data and TypeError: 'str' object does not support item assignment occur, but so far I have not found a solution.
When inserting a huge amount of data, it may be better to use a Sharded Index or to split the data.

Example using a Sharded Index

from CodernityDB.database import Database
from CodernityDB.sharded_hash import ShardedUniqueHashIndex, ShardedHashIndex
from CodernityDB.tree_index import TreeBasedIndex

class CustomIdSharded(ShardedUniqueHashIndex):
    custom_header = 'from CodernityDB.sharded_hash import ShardedUniqueHashIndex'
    def __init__(self, *args, **kwargs):
        kwargs['sh_nums'] = 10
        super(CustomIdSharded, self).__init__(*args, **kwargs)

class TreeIndex(TreeBasedIndex):
    def __init__(self, *args, **kwargs):
        kwargs['node_capacity'] = 10
        kwargs['key_format'] = 'I'
        super(TreeIndex, self).__init__(*args, **kwargs)

    def make_key_value(self, data):
        t_val = data.get('x')
        if t_val is not None:
            return t_val, None
        return None

    def make_key(self, key):
        return key

db = Database('/tmp/tut1')
db.create(with_id_index=False)

db.add_index(CustomIdSharded(db.path, 'id'))
db.add_index(TreeIndex(db.path, 'x'))

db.insert({'x': 1})

For more details on this error, see the Bitbucket issue listed in the references.

References

CodernityDB pure python, fast, NoSQL database — CodernityDB
codernity / CodernityDB / issues / #14 - id_stor not greater than 4G — Bitbucket
7.3. struct — Interpret strings as packed binary data — Python v2.7.8 documentation
CodernityDB, Pure Python NoSQL database - YouTube

2014-08-18

Setting up a domain for Flask on a home server

AWS Linux Python

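Introduction

This is a note for myself, since I will probably have forgotten the details the next time I do something similar.
I picked the approach that was easiest for me, and as a result it only applies to a fairly specific environment.
There are many ways to do this, so please treat it as just one possible approach.
If I get the chance, I will add to this and repost it.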

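Environment

CentOS 7
- Python 2.7.5
  - flask 0.10.1
FLET'S Hikari Premium (フレッツ・光プレミアム)

What is involved

Onamae.com (お名前.com)

    Domain registration

AWS Route 53*1

    Domain management

CTU *2

    Port translation

Home server

    Opening the port with iptables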

Steps

Flask runs on port 5000.
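
The post does not show the application itself, so here is a minimal sketch of a Flask app that would listen on port 5000. The route, the response text, and the serve helper are assumptions made for illustration.

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # Placeholder response; the real application is not shown in this post.
    return "Hello from the home server"

def serve():
    # Bind to all interfaces so that traffic forwarded from the router
    # can reach the app on port 5000.
    app.run(host="0.0.0.0", port=5000)
```

Calling serve() starts the development server. Binding to 0.0.0.0 matters here because Flask listens only on 127.0.0.1 by default, which would not be reachable from other machines.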

Create the Hosted Zone in advance.
The AWS documentation*3 explains in detail how to create a Hosted Zone.

From here on, the following two steps are assumed to be complete.

Domain registered
Hosted Zone created

Changing the name servers

Once the Hosted Zone has been created, click the domain you are going to use in the list of Hosted Zones and check the Delegation Set under Hosted Zone Details.
f:id:CORDEA:20140818173440p:plain

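Enter the name server information from that Delegation Set at Onamae.com.
The setting is found here:
ドメインNavi -> ドメイン設定 -> ネームサーバーの設定 -> ネームサーバーの変更 -> 他のネームサーバーを利用 (Domain Navi -> Domain settings -> Name server settings -> Change name servers -> Use other name servers)

When making the change, tick the checkbox for the domain name you want to use.*4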

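Configuring the CTU

Next comes the CTU configuration, which applies only to FLET'S Hikari Premium users.
I cannot say how this is configured with other services, but instead of configuring the CTU you could also translate the port with iptables.

As the image shows, access to port 80 on the WAN-side IP (global IP) is translated to port 5000 on the LAN-side IP assigned to the server.

Creating the Record Set

Once everything above is done, create a Record Set in the Hosted Zone.
The settings are as shown in the image.

Type	A
Alias	No
Value	WAN-side IP

That is all.

References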

80番ポートへ届いたパケットをiptablesでローカルの上位ポートに転送する - Qiita

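*1: There is no particular need to use AWS, but I manage all of my registered domains in Route 53, so I use it here.

*2: The terminal unit installed with FLET'S Hikari Premium.

*3: Creating a Hosted Zone - Amazon Route 53

*4: The checkbox is right above the image.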

2014-07-22

[Debian] Errors when installing the R package "rgl" and how to fix them

Linux Debian R

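For CentOS / Fedora, see here.

While working on Debian I hit errors that I had seen before. I got them resolved, so here is a summary.

Steps

These are the errors that appeared.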

error: X11 not found but required, configure aborted.

error: missing required header GL/gl.h

After installing the following packages, I was able to install "rgl" without problems.

sudo apt-get install -y libx11-dev freeglut3-dev

References

How to solve the error " missing required header GL/gl.h" while installing the Package mvoutlier in R? - Stack Overflow

CentOSでR言語を使ってみたことのまとめ - Yuta.Kikuchiの日記

2014-07-21

Installing python-pip with yum on CentOS 7

Linux Python

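I heard that CentOS 7 was out and promptly installed it on my newly bought server, only to find that pip could not be installed from yum.
But I did not want to use easy_install.

So here is how to install pip from yum. python-pip is provided by the EPEL repository, so the steps add that repository first.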

Version

CentOS Linux release 7.0.1406 (Core)

Steps

wget http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/beta/7/x86_64/epel-release-7-0.2.noarch.rpm

sudo rpm -ivh epel-release-7-0.2.noarch.rpm

sudo yum install -y python-pip

Done.

2014-07-09

Trying out MySQL's MEMORY storage engine

Database

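Introduction

The MEMORY storage engine looked interesting, so I gave it a trial run.

The MEMORY storage engine creates tables in memory.
Because reads and writes avoid the disk, processing becomes considerably faster.

This post assumes MySQL is already installed.

Steps

First, configure the heap size and related settings.
max_heap_table_size caps the size of a MEMORY table, so it has to be raised before a large table can be held in memory.

sudo vim /etc/my.cnf

Add the following lines to my.cnf.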

max_heap_table_size=2G
tmp_table_size=2G

Once that is done, restart MySQL.

sudo service mysqld restart

After that, log in to MySQL and apply the settings.

set global tmp_table_size = 2147483648;
set global max_heap_table_size = 2147483648;
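
The 2G in my.cnf and the 2147483648 in the SET statements are the same value, since MySQL size suffixes are multiples of 1024. A small sketch of the conversion follows; the to_bytes helper is illustrative and not part of MySQL.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def to_bytes(size):
    """Convert a my.cnf-style size such as '2G' or '512M' to a byte count."""
    suffix = size[-1].upper()
    if suffix in UNITS:
        return int(size[:-1]) * UNITS[suffix]
    return int(size)  # no suffix: already a byte count

two_gib = to_bytes("2G")
```

Here two_gib equals 2 * 1024**3, the number passed to SET GLOBAL.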

With that in place, create a table.

create table in_memory (id varchar(16), chr varchar(8), index(id)) engine=memory;

　
ここらへんは適当に。

load data infile "/tmp/test.csv" into table in_memory fields terminated by ',';

A quick speed measurement

The table information looks like this.

Name	Engine	Version	Row_format	Rows	Avg_row_length	Data_length	Max_data_length	Index_length
my_isam	MyISAM	10	Dynamic	39706716	23	915841192	281474976710655	536508416
in_memory	MEMORY	10	Fixed	39706716	27	1281039008	1207959534	640502960

　
で、計測結果ですが

Name	Speed(sec)
my_isam	0.04
in_memory	0.00

　
正直なところ使うデータセットが悪かったので違いがわかりにくいですが、2-4倍程度の速度向上が見込めるかと思います。

最後に

サーバー再起動/クラッシュするとtableのデータが全て消えるので、実用できるかというと微妙なところですが、なかなかおもしろいと思います。

参考

MySQL :: MySQL 5.1 リファレンスマニュアル :: 13.7 MEMORY (HEAP) ストレージエンジン
 MySQL5 MEMORY ストレージエンジンの活用 | QuickKnowLedge
How to make the mysql MEMORY ENGINE store more data? - Stack Overflow