2015-05-22

[Python] A program that judges whether an image is cool- or warm-colored

Python

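There don't seem to be many programs that judge whether an image is cool- or warm-colored, so I wrote one.

Based on a Stack Overflow post, it converts RGB to HSV and classifies each pixel by its hue (H).
Where to draw the line between cool and warm depends on the use case and the person, so for now it just prints the percentage of warm pixels.
I tried it on several images and was surprised that it worked better than expected.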

#!/usr/bin/env python
# encoding:utf-8
#
# Copyright [2015] [Yoshihiro Tanaka]
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

__Author__ =  "Yoshihiro Tanaka <contact@cordea.jp>"
__date__   =  "2015-05-16"

from PIL import Image
import sys, colorsys

def main(filename):
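    # Convert to RGB so every pixel is a 3-tuple (handles RGBA, palette and grayscale images).
    image = Image.open(filename).convert("RGB")

    width, height = image.size
    pixel = image.load()

    data = []
    for w in range(width):
        for h in range(height):
            r, g, b = [c / 255.0 for c in pixel[w, h]]
            # colorsys returns hue in [0, 1]; scale to degrees to match the 0-360 ranges below.
            data.append(colorsys.rgb_to_hsv(r, g, b)[0] * 360.0)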
    
    warmcool = [0, 0]
    for h in data:
        if 0 <= h <= 80 or 330 <= h <= 360:
            warmcool[0] += 1
        else:
            warmcool[1] += 1

    per = (warmcool[0] / float(sum(warmcool))) * 100
    print("warm: %f %%" % per)

if __name__=='__main__':
    main(sys.argv[1])
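
The hue test at the heart of the program can also be tried without an image file. The following is a minimal sketch, assuming the same warm ranges as above (0-80 and 330-360 degrees); the helper names `is_warm` and `warm_percentage` are hypothetical and not part of the original script.

```python
import colorsys


def is_warm(r, g, b):
    """Return True if an 8-bit RGB color falls in the warm hue range."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0  # colorsys returns hue in [0, 1]
    return hue_deg <= 80 or hue_deg >= 330


def warm_percentage(pixels):
    """Return the percentage of warm pixels in an iterable of (r, g, b) tuples."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    warm = sum(1 for p in pixels if is_warm(*p))
    return 100.0 * warm / len(pixels)
```

Note that achromatic pixels (black, white, gray) get a hue of 0 from colorsys and are therefore counted as warm, both here and in the script above. Filtering on saturation would be one way to exclude them, if that matters for your use case.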

References

RGB range for cold and warm colors? - Stack Overflow

2015-04-19

Fetching and parsing XML with AFNetworking + Ono

Objective-C

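Searching turns up plenty of articles on this, but the approach seems to have changed slightly, so here is a note.

If you use AFXMLParserResponseSerializer without thinking, you get an NSXMLParser back, which looks like it makes the subsequent processing somewhat tedious.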

#import "XMLParser.h"
#import <AFNetworking/AFNetworking.h>
#import <AFOnoResponseSerializer/AFOnoResponseSerializer.h>
#import <Ono/Ono.h>

@implementation XMLParser

// method: 1
- (void) parseXMLUsingAFNetworkingAndOnoFirst: (NSString *)url {
    AFHTTPRequestOperationManager *manager = [AFHTTPRequestOperationManager manager];
    manager.responseSerializer = [AFOnoResponseSerializer XMLResponseSerializer];
    
    [manager
     GET:url
     parameters:nil
     success:^(AFHTTPRequestOperation *operation, ONOXMLDocument *responseDocument) {
         // ONOXMLDocument
         NSLog(@"responseDocument: %@", [responseDocument class]);
         // ref. https://github.com/AFNetworking/AFOnoResponseSerializer
     } failure:^(AFHTTPRequestOperation *operation, NSError *error) {
         NSLog(@"failure: %@", error);
     }];
}

// method: 2
- (void) parseXMLUsingAFNetworkingAndOnoSecond: (NSString *)url {
    AFHTTPRequestOperationManager *manager = [AFHTTPRequestOperationManager manager];
    manager.responseSerializer = [AFHTTPResponseSerializer serializer];
    
    [manager
     GET:url
     parameters:nil
     success:^(AFHTTPRequestOperation *operation, id responseObject) {
         // _NSInlineData
         NSLog(@"responseObject: %@", [operation.responseData class]);
         // Parse it with e.g. HTMLDocumentWithData as appropriate
     } failure:^(AFHTTPRequestOperation *operation, NSError *error) {
         NSLog(@"failure: %@", error);
     }];
}

// When using NSXMLParser
- (void) parseXMLUsingAFNetworking: (NSString *)url {
    AFHTTPRequestOperationManager *manager = [AFHTTPRequestOperationManager manager];
    manager.responseSerializer = [AFXMLParserResponseSerializer serializer];
    
    [manager
     GET:url
     parameters:nil
     success:^(AFHTTPRequestOperation *operation, id responseObject) {
         // NSXMLParser
         NSLog(@"responseObject: %@", [responseObject class]);
     } failure:^(AFHTTPRequestOperation *operation, NSError *error) {
         NSLog(@"failure: %@", error);
     }];
}

@end

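Not that it matters, but the Objective-C syntax highlighting is so poor it's funny.
It's been a while since I wrote a blog post and I'm tired, so I'll stop here without much explanation.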

References

AFNetworking · GitHub
Objective-C - Ono '斧' を触ってみた - Qiita

2015-02-22

[pylearn2] Running a GRBM easily on your own dataset

Python

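Introduction

pylearn2 is a deep learning library, and installing it and running a few samples is fairly easy.

Running deep learning on a dataset you prepared yourself, however, turns out to be surprisingly hard.

So I built a pipeline that makes running a GRBM (Gaussian restricted Boltzmann machine) on your own dataset as easy as possible.
If something is wrong, please fix it as needed.

hoge_dataset.py and grbm.yaml are based on this program, with several changes of my own.
Some of the parameters are not mine, so please also refer to the original repository.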

<a href="https://github.com/CORDEA/use_images_in_pylearn2">CORDEA/use_images_in_pylearn2</a>

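Method

Installing pylearn2 is covered in many places, so I'll skip it.

Creating your own dataset

Prepare the images you want to classify and arrange them as shown below.
Create a directory inside PYLEARN2_DATA_PATH and put them there.
This post assumes an in directory created inside ${PYLEARN2_DATA_PATH}/train_test.

If you find the name of the in directory confusing, change _DIR in convert_image.py.
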

└── in
    ├── class_1
    │   ├── 1.jpg
    │   └── 2.jpg
    ├── class_2
    │   ├── 1.jpg
    │   └── 2.jpg
    └── class_3
        ├── 1.jpg
        └── 2.jpg

% mv in ${PYLEARN2_DATA_PATH}/train_test/
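
To make the layout concrete, here is a minimal sketch of how such a tree could be walked to assign one integer label per class directory. It only illustrates the directory convention; `list_labeled_images` is a hypothetical helper, not the logic of convert_image.py, and the label order (sorted directory names) is an assumption.

```python
import os


def list_labeled_images(in_dir):
    """Return (label, class_name, file_name) tuples for in_dir/<class>/<image>."""
    rows = []
    # Each sub-directory is one class; sort for a stable label assignment.
    classes = sorted(
        d for d in os.listdir(in_dir)
        if os.path.isdir(os.path.join(in_dir, d))
    )
    for label, class_name in enumerate(classes):
        class_dir = os.path.join(in_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            rows.append((label, class_name, file_name))
    return rows
```

Calling `list_labeled_images(os.path.join(os.environ["PYLEARN2_DATA_PATH"], "train_test", "in"))` on the tree above would yield one row per image, with class_1, class_2 and class_3 mapped to labels 0, 1 and 2.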

Preparation

% PYLEARN2_INSTALL_DIR=$HOME # where pylearn2 is installed
% cd $PYLEARN2_INSTALL_DIR/pylearn2/pylearn2/scripts/tutorials/ # any location works
% git clone https://github.com/CORDEA/use_images_in_pylearn2.git
% cd use_images_in_pylearn2
% mv *.py $PYLEARN2_INSTALL_DIR/pylearn2/pylearn2/datasets/

Running

% train.py grbm.yaml

Visualizing the weights

% show_weight.py grbm.pkl

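Settings

The directory is written as train_test throughout, so substitute your own as needed.

Parameters

Most things can be configured through the parameters in grbm.yaml.

which_set
- The name of the csv file.
base_path
- The path to the directory containing your dataset.
image_to_csv
- When True, the images in the in directory are converted to csv before training starts.
image_size
- The image size. The default is 128; adjust it to your machine's performance and your goal.
color
- Defaults to False. When True, the csv carries color information, but the number of dimensions triples.
save
- Defaults to False. When True, npy files are saved to PYLEARN2_DATA_PATH/train_test/. If npy files already exist, they are loaded instead of the csv file.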

Generated files

${PYLEARN2_DATA_PATH}/train_test/train.csv
- The csv file created from the images.
${PYLEARN2_DATA_PATH}/train_test/comparative_table.name
- A table mapping labels to directories and image names. Probably unnecessary.
${PYLEARN2_DATA_PATH}/train_test/*.npy
- NumPy files; see hoge_dataset.py for details. Created and loaded only when save is True.

References

laughing/grbm_sample · GitHub
男のための機械学習〜RBMでA◯女優さんの共通特徴量を得よう〜 - 新kensuke-miの日記

2015-01-26

Published Docker containers with Caffe and Pylearn2 installed on Docker Hub

Python docker

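After I posted an article on Qiita about installing Caffe and Pylearn2, I got the following reaction on Twitter.

The setup really is the hardest part. If only there were a Docker container with everything installed... Trying out Caffe and Pylearn2 together by @_Cordea on @Qiita http://t.co/ktKeIHeTxD
— ピクシィ (@icoxfog417) January 25, 2015

I had felt the same way. Installation may not be the biggest hurdle, but it is more than enough of a barrier to stop users who "just want to try it out".

Of course, a VM is published for Pylearn2, so if you are used to Vagrant you can use that.

I have also written Dockerfiles (untested), but building from a Dockerfile takes time as well, so I decided to publish the containers on Docker Hub.

Others have published similar containers, so pick whichever you prefer...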

Usage

Caffe

% docker pull cordea/pycaffe

This container has Caffe built with make, the libraries needed for the Python wrapper installed, and the paths set up.
In terms of the Qiita article, everything up to "make" is done.

Pylearn2

% docker pull cordea/pylearn2

This container has pylearn2 installed and the paths set up.

2015-01-20

Installing caffe on Ubuntu 14.04 on docker

docker Python

Caffe

Caffe is a deep learning framework.

<a href="https://github.com/BVLC/caffe">BVLC/caffe</a>

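Introduction

This time I install Caffe on Ubuntu running on Docker.
Involving the GPU makes things complicated, so I use CPU mode here.
I also don't cover setup or installation for use from Python.
For that, please refer to the official Installation page.

Addendum, 2015/01/26

I posted a version to Qiita with the steps for using Caffe's Python wrapper added.
If you use Python, please see that article.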

Version

Docker version 1.3.2, build 39fa2fa/1.3.2
- Ubuntu 14.04.1 LTS

Steps

Creating a user

This step is optional.

% docker run -it --name="caffe" ubuntu /bin/bash
root@4536d063fc8a:/# adduser --disabled-password --gecos '' cordea
root@4536d063fc8a:/# adduser cordea sudo
root@4536d063fc8a:/# echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

Installation

root@4536d063fc8a:/# su cordea
cordea@4536d063fc8a:/$ cd home/cordea/
cordea@4536d063fc8a:~$ sudo apt-get update
cordea@4536d063fc8a:~$ sudo apt-get install git vim wget make bc libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev libblas-dev libatlas-base-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler
cordea@4536d063fc8a:~$ mkdir caffe
cordea@4536d063fc8a:~$ cd caffe/

CUDA

CPU mode does not need the Graphics Driver, but the CUDA Toolkit appears to be required, so I install only the Toolkit.

cordea@4536d063fc8a:~/caffe$ wget http://developer.download.nvidia.com/compute/cuda/6_5/rel/installers/cuda_6.5.14_linux_64.run
cordea@4536d063fc8a:~/caffe$ chmod u+x cuda_6.5.14_linux_64.run 
cordea@4536d063fc8a:~/caffe$ ./cuda_6.5.14_linux_64.run 
Do you accept the previously read EULA? (accept/decline/quit): accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 340.29? ((y)es/(n)o/(q)uit): n
Install the CUDA 6.5 Toolkit? ((y)es/(n)o/(q)uit): y
Enter Toolkit Location [ default is /usr/local/cuda-6.5 ]: 
/usr/local/cuda-6.5 is not writable.
Do you wish to run the installation with 'sudo'? ((y)es/(n)o): y
Do you want to install a symbolic link at /usr/local/cuda? ((y)es/(n)o/(q)uit): y
Install the CUDA 6.5 Samples? ((y)es/(n)o/(q)uit): n
Installing the CUDA Toolkit in /usr/local/cuda-6.5 ...

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-6.5
Samples:  Not Selected

cordea@4536d063fc8a:~/caffe$ sudo ldconfig /usr/local/cuda-6.5/lib64/

Build

I think runtest fails because there is no GPU.

cordea@4536d063fc8a:~/caffe$ git clone https://github.com/BVLC/caffe
cordea@4536d063fc8a:~/caffe$ cd caffe/
cordea@4536d063fc8a:~/caffe/caffe$ cp Makefile.config.example Makefile.config         
cordea@4536d063fc8a:~/caffe/caffe$ make all
cordea@4536d063fc8a:~/caffe/caffe$ make test
cordea@4536d063fc8a:~/caffe/caffe$ make runtest
.build_release/test/test_all.testbin 0 --gtest_shuffle
libdc1394 error: Failed to initialize libdc1394
Cuda number of devices: 0
Setting to use device 0
Current device id: 0
Note: Randomizing tests' orders with a seed of 65626 .
[==========] Running 838 tests from 169 test cases.
[----------] Global test environment set-up.
[----------] 7 tests from SyncedMemoryTest
[ RUN      ] SyncedMemoryTest.TestInitialization
[       OK ] SyncedMemoryTest.TestInitialization (0 ms)
[ RUN      ] SyncedMemoryTest.TestAllocationCPU
[       OK ] SyncedMemoryTest.TestAllocationCPU (0 ms)
[ RUN      ] SyncedMemoryTest.TestCPUWrite
[       OK ] SyncedMemoryTest.TestCPUWrite (0 ms)
[ RUN      ] SyncedMemoryTest.TestGPUWrite
F0120 07:23:59.559131 31252 syncedmem.cpp:51] Check failed: error == cudaSuccess (35 vs. 0)  CUDA driver version is insufficient for CUDA runtime version
*** Check failure stack trace: ***
    @     0x2b0c0e68ddaa  (unknown)
    @     0x2b0c0e68dce4  (unknown)
    @     0x2b0c0e68d6e6  (unknown)
    @     0x2b0c0e690687  (unknown)
    @           0x70090b  caffe::SyncedMemory::mutable_gpu_data()
    @           0x5bf8ad  caffe::SyncedMemoryTest_TestGPUWrite_Test::TestBody()
    @           0x65a883  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x651327  testing::Test::Run()
    @           0x6513ce  testing::TestInfo::Run()
    @           0x6514d5  testing::TestCase::Run()
    @           0x654818  testing::internal::UnitTestImpl::RunAllTests()
    @           0x654aa7  testing::UnitTest::Run()
    @           0x41d480  main
    @     0x2b0c114b4ec5  (unknown)
    @           0x4244b7  (unknown)
    @              (nil)  (unknown)
make: *** [runtest] Aborted (core dumped)

Trying it out

Follow the tutorial to check that it works.

You get an error unless you edit the prototxt, so change GPU to CPU on the last line of "lenet_solver.prototxt".

cordea@4536d063fc8a:~/caffe/caffe$ vim examples/mnist/lenet_solver.prototxt

cordea@4536d063fc8a:~/caffe/caffe$ ./data/mnist/get_mnist.sh 
cordea@4536d063fc8a:~/caffe/caffe$ ./examples/mnist/create_mnist.sh
cordea@4536d063fc8a:~/caffe/caffe$ ./examples/mnist/train_lenet.sh
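
The manual edit to lenet_solver.prototxt described above could also be scripted instead of done in vim. The following is a minimal sketch, assuming the relevant line has the form `solver_mode: GPU`; the function name `use_cpu_mode` is hypothetical and not part of Caffe.

```python
import re


def use_cpu_mode(prototxt_text):
    """Return the solver text with 'solver_mode: GPU' switched to CPU."""
    # Only touch a line that starts with the solver_mode key, so comment
    # lines (which start with '#') are left alone.
    return re.sub(
        r"^(\s*solver_mode\s*:\s*)GPU\b",
        r"\1CPU",
        prototxt_text,
        flags=re.MULTILINE,
    )
```

To apply it, read examples/mnist/lenet_solver.prototxt, pass its contents through `use_cpu_mode`, and write the result back before running train_lenet.sh.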

2015-01-09

Trying out GlueLang.

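For now, this is about getting the GlueLang samples running on my Mac.
It works fine on Ubuntu, but on my Mac it failed at first, so here are my notes.

Addendum, 2015/01/13

I have since confirmed that it also works with LLVM 6.0.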

GlueLang

　
<a href="https://github.com/ryuichiueda/GlueLang">ryuichiueda/GlueLang</a>

Mac OS X (10.9.5)

Trying it

% git clone https://github.com/ryuichiueda/GlueLang/
% cd GlueLang
% make
% ./glue EXAMPLE/if_then_else.glue

In my case, fizzbuzz.glue and if_then_else.glue in EXAMPLE failed with Command error.

Cause of the failure and the fix

Cause

Probably the gcc and g++ versions.

Fix

% ./glue EXAMPLE/if_then_else.glue

Execution error at line 5, char 3
        line5: ? /usr/bin/true
                 ^
        Command error

        process_level 1
        exit_status 127
        pid 34374

  ...

% brew tap homebrew/versions
% brew install gcc48
% sudo ln -sf /usr/local/bin/gcc-4.8 /usr/bin/gcc
% sudo ln -sf /usr/local/bin/g++-4.8 /usr/bin/g++

% make clean
rm -f glue Arg.o Command.o CommandLine.o Comment.o Element.o Environment.o Feeder.o IfBlock.o Import.o Pipeline.o Script.o TmpFile.o VarString.o main.o /usr/local/bin/glue

% make
g++ -Wall -O3 --static -std=c++11   -c -o Arg.o Arg.cc
g++ -Wall -O3 --static -std=c++11   -c -o Command.o Command.cc
g++ -Wall -O3 --static -std=c++11   -c -o CommandLine.o CommandLine.cc
g++ -Wall -O3 --static -std=c++11   -c -o Comment.o Comment.cc
g++ -Wall -O3 --static -std=c++11   -c -o Element.o Element.cc
g++ -Wall -O3 --static -std=c++11   -c -o Environment.o Environment.cc
g++ -Wall -O3 --static -std=c++11   -c -o Feeder.o Feeder.cc
Feeder.cc: In member function 'int Feeder::countIndent()':
Feeder.cc:476:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  while(i < p->size()){
                    ^
g++ -Wall -O3 --static -std=c++11   -c -o IfBlock.o IfBlock.cc
IfBlock.cc: In member function 'virtual int IfBlock::exec()':
IfBlock.cc:136:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  for(int i=0;i<m_nodes.size();i++){
                             ^
g++ -Wall -O3 --static -std=c++11   -c -o Import.o Import.cc
g++ -Wall -O3 --static -std=c++11   -c -o Pipeline.o Pipeline.cc
g++ -Wall -O3 --static -std=c++11   -c -o Script.o Script.cc
g++ -Wall -O3 --static -std=c++11   -c -o TmpFile.o TmpFile.cc
g++ -Wall -O3 --static -std=c++11   -c -o VarString.o VarString.cc
g++ -Wall -O3 --static -std=c++11   -c -o main.o main.cc
g++ -o glue Arg.o Command.o CommandLine.o Comment.o Element.o Environment.o Feeder.o IfBlock.o Import.o Pipeline.o Script.o TmpFile.o VarString.o main.o

% ./glue EXAMPLE/if_then_else.glue
OK
a
OK
OK

Although not shown here, I confirmed that fizzbuzz.glue also runs correctly after upgrading gcc and g++.

gcc, g++ Version

Before upgrading

% gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
% g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix

After upgrading

% gcc --version
gcc (Homebrew gcc48 4.8.3) 4.8.3
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% g++ --version
g++ (Homebrew gcc48 4.8.3) 4.8.3
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Ubuntu

It works normally here, so only the versions are listed.

Version

% uname -r
3.16.7-tinycore64
% cat /etc/issue
Ubuntu 14.04.1 LTS \n \l
% gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
% g++ --version
g++ (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

References

ryuichiueda/GlueLang · GitHub
OS X 10.8にgcc 4.8をインストールする - Quartz

2014-12-28

Visualizing distances between anime using niconico's content search API, continued

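Last time I struggled with the distance calculation and with noise and did not get good results, so I kept working on it for a while afterwards.
If I leave it any longer I'll forget the steps, so I'm summarizing here.

That said, I did this about a month ago and have already forgotten a fair amount, so there isn't much to write... sorry.

For now, I'm a little pleased that sequels such as the Precure series cluster together to some extent.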

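Changes since last time

Changed the collection range from works of 2010-2014 to 2000-2014.
Previously I also collected videos whose description contained the anime title, but that matched many unrelated videos, so now only videos whose title or tags contain the anime title are collected (getdataFromWikipedia.py).
Excluded anime with fewer than 200 videos (getSearchResult.py).
Excluded videos whose tags do not include the "アニメ" (anime) tag (tagParseJSONforCount.py).
Switched from tag occurrence counts to the share of each tag among all tags of an anime (tagParseJSONforCount.py).
Added tf-idf computation and excluded tags below a threshold, so that distinctive tags are more readily reflected in the distances (calcTfidf.py).
Not closely related, but because of the output format I switched the tree viewer from Dendroscope to EPoS.
I also tried morphological analysis with MeCab followed by MDS (tagMeCabParseJSONforCount.py).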

Results

Dendrogram
Note that the image is large.

Distances are computed with Pearson correlation.
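
As an illustration of a Pearson-based distance, here is a minimal sketch. It assumes the common convention distance = 1 - r, where r is the Pearson correlation coefficient of two tag-share vectors. The post does not state which exact form hclust.R uses, so treat the formula and the name `pearson_distance` as assumptions.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    r = cov / math.sqrt(var_x * var_y)
    return 1.0 - r
```

With this convention, perfectly correlated vectors have distance 0 and perfectly anti-correlated ones have distance 2.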

The newick data is also on GitHub.

Steps

python getSearchResult.py animes_2000-2014.txt
python tagParseJSONforCount.py animes > tags_anime_dataset.tsv
python calcTfidf.py tags_anime_dataset.tsv > tfidf_anime_dataset.tsv
awk -F "\t" '{if($2>=1.0){print $0}}' tfidf_anime_dataset.tsv > tfidf_1.0_anime_dataset.tsv
gawk -f cut_file.awk tfidf_1.0_anime_dataset.tsv tags_anime_dataset.tsv > tags_1.0_anime_dataset.tsv
R CMD BATCH hclust.R
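
The tag-share and tf-idf filtering steps in this pipeline can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the contents of calcTfidf.py: it uses an unsmoothed idf of log(N / df), which differs from scikit-learn's TfidfVectorizer defaults, and all function names are hypothetical.

```python
import math
from collections import Counter


def tag_shares(tags):
    """Share of each tag among all tags collected for one anime."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


def idf(docs):
    """Unsmoothed inverse document frequency, log(N / df), per tag."""
    n = len(docs)
    df = Counter(tag for tags in docs.values() for tag in set(tags))
    return {tag: math.log(n / d) for tag, d in df.items()}


def tfidf_table(docs):
    """Map each anime title to {tag: share * idf}."""
    weights = idf(docs)
    return {
        title: {tag: share * weights[tag]
                for tag, share in tag_shares(tags).items()}
        for title, tags in docs.items()
    }


def keep_tags(table, threshold):
    """Tags whose best tf-idf score over all anime reaches the threshold."""
    best = {}
    for scores in table.values():
        for tag, value in scores.items():
            best[tag] = max(best.get(tag, 0.0), value)
    return {tag for tag, value in best.items() if value >= threshold}
```

A tag that appears on every anime gets an idf of 0 and is dropped by any positive threshold, which is the effect described above: distinctive tags dominate the distance calculation.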

Code

GitHub

References

sklearn.feature_extraction.text.TfidfVectorizer — scikit-learn 0.15.2 documentation
tf-idf - Wikipedia
scikit-learn で TF-IDF を計算する - Qiita
sh - awk: program limit exceeded: maximum number of fields size=32767 - Stack Overflow