世界の測量

Sibling of "Relevant, Timely, and Accurate," but much lighter and shorter. Note: this does not represent the views of the organization I belong to.

Trying out twitter-cldr-rb, a Ruby implementation of ICU

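In i18n circles, ICU (International Components for Unicode) has long been well known. I recently learned of twitter/twitter-cldr-rb on GitHub, a Ruby implementation of it, so I tried it out on OS X on my own machine. All I did was follow the Usage section of the README on its GitHub page.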

Installation

$ gem install twitter_cldr
Fetching: camertron-eprun-1.1.0.gem (100%)
Successfully installed camertron-eprun-1.1.0
Fetching: twitter_cldr-3.0.3.gem (100%)
Successfully installed twitter_cldr-3.0.3
Parsing documentation for camertron-eprun-1.1.0
Installing ri documentation for camertron-eprun-1.1.0
Parsing documentation for twitter_cldr-3.0.3
Installing ri documentation for twitter_cldr-3.0.3
Done installing documentation for camertron-eprun, twitter_cldr after 8 seconds
2 gems installed

Trying it out

I used irb and proceeded as follows.

irb(main):001:0> require 'twitter_cldr'
=> true

The gem loads without problems.

irb(main):002:0> TwitterCldr.supported_locales
=> [:af, :ar, :be, :bg, :bn, :ca, :cs, :cy, :da, :de, :el, :en, :"en-GB", :es, :eu, :fa, :fi, :fil, :fr, :ga, :gl, :he, :hi, :hr, :hu, :id, :is, :it, :ja, :ko, :lv, :ms, :nb, :nl, :pl, :pt, :ro, :ru, :sk, :sq, :sr, :sv, :ta, :th, :tr, :uk, :ur, :vi, :zh, :"zh-Hant"]

The supported locales. :ja is included, as expected.

irb(main):003:0> 2014.localize(:ja).to_s
=> "2,014"

Thousands grouping for Japan: a comma is used.

irb(main):004:0> 2014.localize(:de).to_s
=> "2.014"

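Thousands grouping for Germany: a period is used. I have heard that international standards follow this convention, though as far as I know ISO 80000-1 and the SI brochure recommend a thin space between groups of three digits rather than a period or a comma.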
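The two outputs above differ only in the group separator. As a rough illustration that does not depend on twitter_cldr, the grouping itself can be sketched in plain Ruby. The name `group_digits` and its `separator` argument are hypothetical, introduced here only for illustration, and the sketch handles integers only.

```ruby
# Hypothetical helper (not part of twitter_cldr): insert a separator
# between groups of three digits, counting from the right.
def group_digits(number, separator)
  # Match a digit that is followed by one or more complete groups of
  # three digits up to the end of the string, and append the separator.
  number.to_s.gsub(/(\d)(?=(\d{3})+\z)/) { "#{$1}#{separator}" }
end

puts group_digits(2014, ",")  # comma, as in the :ja output above
puts group_digits(2014, ".")  # period, as in the :de output above
```

A real localization library also has to deal with decimals, non-Latin digits, and locales whose groups are not all three digits wide, which is why the gem reads these rules from CLDR data instead of hard-coding them.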

irb(main):005:0> 500.localize(:ja).to_currency.to_s(:currency => 'EUR')
=> "€500.00"

I take this to mean: localize 500 for Japanese, then format it as a currency string in euros.

irb(main):006:0> 500.localize(:ja).to_currency.to_s(:currency => 'eur')
=> "eur500.00"
irb(main):007:0> 500.localize(:ja).to_currency.to_s(:currency => 'JPY')
=> "¥500"

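When eur is written in lowercase, it is not recognized as the euro; the code is simply echoed back as a prefix. JPY is the Japanese yen.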
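Since the lowercase code is passed through unrecognized, one defensive option is to normalize and check the code before formatting. The following is only a sketch: `normalize_currency_code` and `KNOWN_CODES` are hypothetical names, and the three-entry list is a stand-in for a real list of codes.

```ruby
# Hypothetical helper: currency codes are upper-case three-letter codes,
# so normalize user input before handing it to a formatter.
def normalize_currency_code(code, known_codes)
  normalized = code.to_s.strip.upcase
  # Reject anything that is not in the list of known codes.
  unless known_codes.include?(normalized)
    raise ArgumentError, "unknown currency code: #{code.inspect}"
  end
  normalized
end

# A tiny stand-in list; with the gem loaded, the array returned by
# TwitterCldr::Shared::Currencies.currency_codes could be used instead.
KNOWN_CODES = %w[EUR JPY USD].freeze

puts normalize_currency_code("eur", KNOWN_CODES)
```

The normalized value could then be passed as the :currency option in the calls shown above.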

irb(main):008:0> TwitterCldr::Shared::Currencies.currency_codes
=> ["ADP", "AED", "AFA", "AFN", "ALK", "ALL", "AMD", "ANG", "AOA", "AOK", "AON", "AOR", "ARA", "ARL", "ARM", "ARP", "ARS", "ATS", "AUD", "AWG", "AZM", "AZN", "BAD", "BAM", "BAN", "BBD", "BDT", "BEC", "BEF", "BEL", "BGL", "BGM", "BGN", "BGO", "BHD", "BIF", "BMD", "BND", "BOB", "BOL", "BOP", "BOV", "BRB", "BRC", "BRE", "BRL", "BRN", "BRR", "BRZ", "BSD", "BTN", "BUK", "BWP", "BYB", "BYR", "BZD", "CAD", "CDF", "CHE", "CHF", "CHW", "CLE", "CLF", "CLP", "CNX", "CNY", "COP", "COU", "CRC", "CSD", "CSK", "CUC", "CUP", "CVE", "CYP", "CZK", "DDM", "DEM", "DJF", "DKK", "DOP", "DZD", "ECS", "ECV", "EEK", "EGP", "ERN", "ESA", "ESB", "ESP", "ETB", "EUR", "FIM", "FJD", "FKP", "FRF", "GBP", "GEK", "GEL", "GHC", "GHS", "GIP", "GMD", "GNF", "GNS", "GQE", "GRD", "GTQ", "GWE", "GWP", "GYD", "HKD", "HNL", "HRD", "HRK", "HTG", "HUF", "IDR", "IEP", "ILP", "ILR", "ILS", "INR", "IQD", "IRR", "ISJ", "ISK", "ITL", "JMD", "JOD", "JPY", "KES", "KGS", "KHR", "KMF", "KPW", "KRH", "KRO", "KRW", "KWD", "KYD", "KZT", "LAK", "LBP", "LKR", "LRD", "LSL", "LTL", "LTT", "LUC", "LUF", "LUL", "LVL", "LVR", "LYD", "MAD", "MAF", "MCF", "MDC", "MDL", "MGA", "MGF", "MKD", "MKN", "MLF", "MMK", "MNT", "MOP", "MRO", "MTL", "MTP", "MUR", "MVP", "MVR", "MWK", "MXN", "MXP", "MXV", "MYR", "MZE", "MZM", "MZN", "NAD", "NGN", "NIC", "NIO", "NLG", "NOK", "NPR", "NZD", "OMR", "PAB", "PEI", "PEN", "PES", "PGK", "PHP", "PKR", "PLN", "PLZ", "PTE", "PYG", "QAR", "RHD", "ROL", "RON", "RSD", "RUB", "RUR", "RWF", "SAR", "SBD", "SCR", "SDD", "SDG", "SDP", "SEK", "SGD", "SHP", "SIT", "SKK", "SLL", "SOS", "SRD", "SRG", "SSP", "STD", "SUR", "SVC", "SYP", "SZL", "THB", "TJR", "TJS", "TMM", "TMT", "TND", "TOP", "TPE", "TRL", "TRY", "TTD", "TWD", "TZS", "UAH", "UAK", "UGS", "UGX", "USD", "USN", "USS", "UYI", "UYP", "UYU", "UZS", "VEB", "VEF", "VND", "VNN", "VUV", "WST", "XAF", "XAG", "XAU", "XBA", "XBB", "XBC", "XBD", "XCD", "XDR", "XEU", "XFO", "XFU", "XOF", "XPD", "XPF", "XPT", "XRE", "XSU", "XTS", "XUA", "XXX", "YDD", "YER", "YUD", 
"YUM", "YUN", "YUR", "ZAL", "ZAR", "ZMK", "ZMW", "ZRN", "ZRZ", "ZWD", "ZWL", "ZWR"]

The list of currency codes.

irb(main):009:0> 1999.localize.to_short_decimal.to_s
=> "2K"
irb(main):010:0> 1999.localize.to_long_decimal.to_s
=> "2 thousand"

Numbers are abbreviated as shown above. No locale was given here, and the output is in English.

irb(main):011:0> 1999.localize(:ja).to_long_decimal.to_s
=> "2千"
irb(main):012:0> 1999.localize(:ja).to_short_decimal.to_s
=> "2千"

For Japanese, short_decimal and long_decimal give the same result.

irb(main):013:0> 1999.localize(:ja).spellout
=> "千九百九十九"

Spelling the number out in words.

irb(main):014:0> 1999.localize(:ja).rbnf.group_names
=> ["SpelloutRules", "OrdinalRules"]

The list of rule groups.

irb(main):015:0> 1999.localize(:ja).to_rbnf_s('SpelloutRules', 'spellout-ordinal')
=> "第千九百九十九"

Ordinals.

irb(main):016:0> 1999.localize(:ja).to_rbnf_s('OrdinalRules', 'digits-ordinal')
=> "第1999"

The ordinal variant that keeps the digits as they are.

irb(main):017:0> require 'time'
=> false
irb(main):018:0> DateTime.now.localize(:ja).to_full_s
=> "2014年6月4日水曜日 5時13分00秒 UTC +00:00"

Dates. require 'time' returned false, so time had apparently already been loaded, presumably as a dependency of twitter_cldr.
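The false return value is ordinary Kernel#require behaviour rather than anything specific to twitter_cldr. A minimal sketch in plain Ruby:

```ruby
# Kernel#require returns true when it loads a feature for the first time
# and false when that feature has already been loaded.
require 'time'                  # result depends on what was loaded earlier
second_result = require 'time'  # 'time' is certainly loaded by now

puts second_result.inspect      # false
```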

irb(main):019:0> (DateTime.now + 0.5).localize(:ja).until.to_s
=> "12時間後"

Relative time. Adding 0.5 to a DateTime moves it half a day ahead, hence "12時間後" ("in 12 hours").
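The arithmetic behind that result can be checked with the standard library alone. In this sketch `hours_between` is a hypothetical helper, and a fixed DateTime is used instead of DateTime.now so that the result is reproducible.

```ruby
require 'date'

# Hypothetical helper: whole hours between two DateTime values.
# Subtracting one DateTime from another yields the difference in days
# as a Rational.
def hours_between(from, to)
  ((to - from) * 24).to_i
end

base  = DateTime.new(2014, 6, 4, 5, 13, 0)  # fixed time for reproducibility
later = base + Rational(1, 2)               # adding 1/2 means half a day

puts hours_between(base, later)             # 12
```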

irb(main):020:0> ['山', '川', '海'].localize(:ja).to_sentence
=> "山、川、海"

Lists. The result is not "山、川及び海", with "及び" ("and") before the last item.

irb(main):021:0> TwitterCldr::Formatters::Plurals::Rules.all_for(:ja) 
=> [:other]

Plural rules. Japanese has only the :other category.

irb(main):022:0> :ja.localize(:ja).as_language_code
=> "日本語"
irb(main):023:0> :ja.localize(:de).as_language_code
=> "Japanisch"

Language names.

irb(main):024:0> TwitterCldr::Shared::PostalCodes.for_territory(:de).valid?(30159)
TypeError: no implicit conversion of Fixnum into String
	from /usr/local/Cellar/ruby/2.1.1_1/lib/ruby/gems/2.1.0/gems/twitter_cldr-3.0.3/lib/twitter_cldr/shared/postal_codes.rb:55:in `=~'
	from /usr/local/Cellar/ruby/2.1.1_1/lib/ruby/gems/2.1.0/gems/twitter_cldr-3.0.3/lib/twitter_cldr/shared/postal_codes.rb:55:in `valid?'
	from (irb):24
	from /usr/local/bin/irb:11:in `<main>'
irb(main):025:0> TwitterCldr::Shared::PostalCodes.for_territory(:de).valid?("30159")
=> true

Postal codes. The argument to valid? must be a string, not a number.
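If callers may pass either type, the input can be coerced before matching. The sketch below does not use the gem: `valid_postal_code?` is a hypothetical wrapper and `FIVE_DIGITS` is an assumed pattern for five-digit codes, which may differ from the pattern twitter_cldr actually uses for :de.

```ruby
# Assumed pattern for five-digit postal codes (not taken from the gem).
FIVE_DIGITS = /\A\d{5}\z/

# Hypothetical wrapper: convert the input to a String before matching,
# so that an Integer argument no longer raises TypeError.
def valid_postal_code?(code, pattern = FIVE_DIGITS)
  pattern.match?(code.to_s)
end

puts valid_postal_code?(30159)    # Integer input is accepted
puts valid_postal_code?("30159")  # String input is accepted
puts valid_postal_code?("01067")  # a leading zero survives only in a String
```

The last line hints at why requiring a string is reasonable: an integer cannot carry a leading zero, so a code such as 01067 is only representable as text.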

irb(main):026:0> TwitterCldr::Shared::PhoneCodes.territories.include?(:ja)
=> false
irb(main):027:0> TwitterCldr::Shared::PhoneCodes.territories
=> [:ac, :ad, :ae, :af, :ag, :ai, :al, :am, :an, :ao, :aq, :ar, :as, :at, :au, :aw, :ax, :az, :ba, :bb, :bd, :be, :bf, :bg, :bh, :bi, :bj, :bl, :bm, :bn, :bo, :br, :bs, :bt, :bw, :by, :bz, :ca, :cc, :cd, :cf, :cg, :ch, :ci, :ck, :cl, :cm, :cn, :co, :cr, :cu, :cv, :cx, :cy, :cz, :de, :dj, :dk, :dm, :do, :dz, :ec, :ee, :eg, :er, :es, :et, :fi, :fj, :fk, :fm, :fo, :fr, :ga, :gb, :gd, :ge, :gf, :gg, :gh, :gi, :gl, :gm, :gn, :gp, :gq, :gr, :gt, :gu, :gw, :gy, :hk, :hn, :hr, :ht, :hu, :id, :ie, :il, :im, :in, :io, :iq, :ir, :is, :it, :je, :jm, :jo, :jp, :ke, :kg, :kh, :ki, :km, :kn, :kp, :kr, :kw, :ky, :kz, :la, :lb, :lc, :li, :lk, :lr, :ls, :lt, :lu, :lv, :ly, :ma, :mc, :md, :me, :mg, :mh, :mk, :ml, :mm, :mn, :mo, :mp, :mq, :mr, :ms, :mt, :mu, :mv, :mw, :mx, :my, :mz, :na, :nc, :ne, :nf, :ng, :ni, :nl, :no, :np, :nr, :nu, :nz, :om, :pa, :pe, :pf, :pg, :ph, :pk, :pl, :pm, :pr, :ps, :pt, :pw, :py, :qa, :re, :ro, :rs, :ru, :rw, :sa, :sb, :sc, :sd, :se, :sg, :sh, :si, :sj, :sk, :sl, :sm, :sn, :so, :sr, :ss, :st, :sv, :sy, :sz, :tc, :td, :tf, :tg, :th, :tj, :tk, :tl, :tm, :tn, :to, :tr, :tt, :tv, :tw, :tz, :ua, :ug, :us, :uy, :uz, :va, :vc, :ve, :vg, :vi, :vn, :vu, :wf, :ws, :ye, :yt, :za, :zm, :zw]
irb(main):028:0> TwitterCldr::Shared::PhoneCodes.territories.include?(:jp)
=> true

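Phone codes. Phone codes correspond to countries rather than languages, so they have to be looked up by country code (:jp), not by language code (:ja).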
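The distinction matters because one language can be used in several territories. The sketch below makes that explicit with a small hand-written table. `TERRITORIES_FOR_LANGUAGE` and `territories_for` are hypothetical, and the table is illustrative only, not data taken from the gem.

```ruby
# Illustrative, hand-written table (an assumption, not data from the gem):
# a language code maps to zero or more territory codes.
TERRITORIES_FOR_LANGUAGE = {
  ja: [:jp],
  de: [:de, :at, :ch],
  en: [:us, :gb, :au]
}.freeze

# Hypothetical helper: candidate territories for a language code,
# or an empty array when the language is not in the table.
def territories_for(language)
  TERRITORIES_FOR_LANGUAGE.fetch(language) { [] }
end

p territories_for(:ja)
p territories_for(:en)
```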

irb(main):029:0> TwitterCldr::Utils::CodePoints.from_string("😁")
=> [128513]

Unicode code points.
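For comparison, core Ruby gives the same number through String#codepoints, and the decimal value can be turned into the usual U+ notation with format. This sketch uses only core Ruby and writes the character as an escape so that it does not depend on the editor's encoding.

```ruby
# String#codepoints is part of core Ruby and needs no gem.
smiley = "\u{1F601}"                      # same character as in the irb session

code_points = smiley.codepoints
puts code_points.inspect                  # [128513]

# The conventional U+ notation uses hexadecimal.
puts format("U+%04X", code_points.first)  # U+1F601
```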

irb(main):030:0> ['いしば', 'いしはら'].sort
=> ["いしはら", "いしば"]
irb(main):031:0> ['いしば', 'いしはら'].localize(:ja).sort
=> #<TwitterCldr::Localized::LocalizedArray:0x007f9247280ec0 @base_obj=["いしば", "いしはら"], @locale=:en>
irb(main):032:0> ['いしば', 'いしはら'].localize(:ja).sort.to_a
=> ["いしば", "いしはら"]

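Sorting. Plain Array#sort puts "いしはら" first, while the localized sort puts "いしば" first. See also the post "いしばいしはら課題 - Relevant, Timely, and Accurate".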
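The difference comes from what is compared. Plain sort compares code points, and "は" (U+306F) sorts before "ば" (U+3070), whereas collation treats the voicing mark as a secondary difference. The following is a rough approximation in plain Ruby, not the Unicode Collation Algorithm that the gem implements. `primary_key` and `kana_sort` are hypothetical names.

```ruby
# Rough approximation of kana collation (an assumption, not the UCA):
# compare without voicing marks first, then break ties by the original.
VOICING_MARKS = /[\u3099\u309A]/  # combining dakuten and handakuten

def primary_key(word)
  # NFD splits "ば" into "は" + U+3099, and the mark is then removed.
  word.unicode_normalize(:nfd).gsub(VOICING_MARKS, "")
end

def kana_sort(words)
  words.sort_by { |word| [primary_key(word), word] }
end

words = ["いしはら", "いしば"]
p words.sort        # code point order
p kana_sort(words)  # voicing marks ignored at the first level
```

With the voicing mark ignored, "いしば" compares as "いしは", which is a prefix of "いしはら" and therefore sorts first.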
Time constraints made this rather terse, but that is all for now.