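Trying out twitter-cldr-rb, a Ruby implementation of ICU
In i18n circles, ICU - International Components for Unicode has long been well known. I recently came across its Ruby implementation, twitter/twitter-cldr-rb · GitHub, so I tried it on my OS X machine. All I did was follow the Usage section of the README on its GitHub page.
Installation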
$ gem install twitter_cldr
Fetching: camertron-eprun-1.1.0.gem (100%)
Successfully installed camertron-eprun-1.1.0
Fetching: twitter_cldr-3.0.3.gem (100%)
Successfully installed twitter_cldr-3.0.3
Parsing documentation for camertron-eprun-1.1.0
Installing ri documentation for camertron-eprun-1.1.0
Parsing documentation for twitter_cldr-3.0.3
Installing ri documentation for twitter_cldr-3.0.3
Done installing documentation for camertron-eprun, twitter_cldr after 8 seconds
2 gems installed
Trying it out
I used irb and worked through the checks below.
irb(main):001:0> require 'twitter_cldr' => true
The gem loads without problems.
irb(main):002:0> TwitterCldr.supported_locales => [:af, :ar, :be, :bg, :bn, :ca, :cs, :cy, :da, :de, :el, :en, :"en-GB", :es, :eu, :fa, :fi, :fil, :fr, :ga, :gl, :he, :hi, :hr, :hu, :id, :is, :it, :ja, :ko, :lv, :ms, :nb, :nl, :pl, :pt, :ro, :ru, :sk, :sq, :sr, :sv, :ta, :th, :tr, :uk, :ur, :vi, :zh, :"zh-Hant"]
These are the supported locales. :ja is included.
irb(main):003:0> 2014.localize(:ja).to_s => "2,014"
How Japanese groups digits in threes: with a comma.
irb(main):004:0> 2014.localize(:de).to_s => "2.014"
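How German groups digits in threes: with a period. I had heard that international standards follow this convention, but as far as I can tell the SI and ISO recommendation is to separate groups of three digits with a thin space and to reserve the period or comma for the decimal marker.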
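To make the difference concrete without depending on the gem, here is a minimal pure-Ruby sketch of grouping digits in threes with a configurable separator. `group_digits` is a hypothetical helper written for this post. It is not how twitter_cldr implements `localize`, and it only handles integers.

```ruby
# Hypothetical helper: insert `separator` between groups of three digits.
# Only integers are handled; this is not twitter_cldr's implementation.
def group_digits(number, separator)
  # Match a digit that is followed by one or more complete groups of
  # three digits up to the end of the string, and append the separator.
  number.to_s.gsub(/(\d)(?=(\d{3})+\z)/) { "#{$1}#{separator}" }
end

puts group_digits(2014, ',')     # comma, as in the :ja output
puts group_digits(2014, '.')     # period, as in the :de output
puts group_digits(1234567, ' ')  # space as the group separator
```

Passing the separator as an argument keeps the grouping logic independent of any particular locale.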
irb(main):005:0> 500.localize(:ja).to_currency.to_s(:currency => 'EUR') => "€500.00"
I take this to mean: localize 500 for Japanese, then format it as a currency amount in euros.
irb(main):006:0> 500.localize(:ja).to_currency.to_s(:currency => 'eur') => "eur500.00"
irb(main):007:0> 500.localize(:ja).to_currency.to_s(:currency => 'JPY') => "¥500"
With lowercase eur, the code is not recognized as the euro and is printed as-is. JPY is the Japanese yen.
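Since the lowercase code was passed through unchanged, it may be worth normalizing user input before formatting. The following is a minimal sketch, assuming a hypothetical helper `normalize_currency_code` and a small hard-coded subset of codes. With the gem loaded, the list could presumably come from TwitterCldr::Shared::Currencies.currency_codes instead.

```ruby
# Small illustrative subset; not the full list returned by the gem.
KNOWN_CURRENCY_CODES = %w[EUR JPY USD].freeze

# Hypothetical helper: upcase and validate a currency code.
# Returns the normalized code, or nil if it is not in the list.
def normalize_currency_code(code, known = KNOWN_CURRENCY_CODES)
  normalized = code.to_s.strip.upcase
  known.include?(normalized) ? normalized : nil
end

p normalize_currency_code('eur')
p normalize_currency_code('JPY')
p normalize_currency_code('yen')
```

Returning nil for an unknown code lets the caller decide whether to fall back to a default or reject the input.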
irb(main):008:0> TwitterCldr::Shared::Currencies.currency_codes => ["ADP", "AED", "AFA", "AFN", "ALK", "ALL", "AMD", "ANG", "AOA", "AOK", "AON", "AOR", "ARA", "ARL", "ARM", "ARP", "ARS", "ATS", "AUD", "AWG", "AZM", "AZN", "BAD", "BAM", "BAN", "BBD", "BDT", "BEC", "BEF", "BEL", "BGL", "BGM", "BGN", "BGO", "BHD", "BIF", "BMD", "BND", "BOB", "BOL", "BOP", "BOV", "BRB", "BRC", "BRE", "BRL", "BRN", "BRR", "BRZ", "BSD", "BTN", "BUK", "BWP", "BYB", "BYR", "BZD", "CAD", "CDF", "CHE", "CHF", "CHW", "CLE", "CLF", "CLP", "CNX", "CNY", "COP", "COU", "CRC", "CSD", "CSK", "CUC", "CUP", "CVE", "CYP", "CZK", "DDM", "DEM", "DJF", "DKK", "DOP", "DZD", "ECS", "ECV", "EEK", "EGP", "ERN", "ESA", "ESB", "ESP", "ETB", "EUR", "FIM", "FJD", "FKP", "FRF", "GBP", "GEK", "GEL", "GHC", "GHS", "GIP", "GMD", "GNF", "GNS", "GQE", "GRD", "GTQ", "GWE", "GWP", "GYD", "HKD", "HNL", "HRD", "HRK", "HTG", "HUF", "IDR", "IEP", "ILP", "ILR", "ILS", "INR", "IQD", "IRR", "ISJ", "ISK", "ITL", "JMD", "JOD", "JPY", "KES", "KGS", "KHR", "KMF", "KPW", "KRH", "KRO", "KRW", "KWD", "KYD", "KZT", "LAK", "LBP", "LKR", "LRD", "LSL", "LTL", "LTT", "LUC", "LUF", "LUL", "LVL", "LVR", "LYD", "MAD", "MAF", "MCF", "MDC", "MDL", "MGA", "MGF", "MKD", "MKN", "MLF", "MMK", "MNT", "MOP", "MRO", "MTL", "MTP", "MUR", "MVP", "MVR", "MWK", "MXN", "MXP", "MXV", "MYR", "MZE", "MZM", "MZN", "NAD", "NGN", "NIC", "NIO", "NLG", "NOK", "NPR", "NZD", "OMR", "PAB", "PEI", "PEN", "PES", "PGK", "PHP", "PKR", "PLN", "PLZ", "PTE", "PYG", "QAR", "RHD", "ROL", "RON", "RSD", "RUB", "RUR", "RWF", "SAR", "SBD", "SCR", "SDD", "SDG", "SDP", "SEK", "SGD", "SHP", "SIT", "SKK", "SLL", "SOS", "SRD", "SRG", "SSP", "STD", "SUR", "SVC", "SYP", "SZL", "THB", "TJR", "TJS", "TMM", "TMT", "TND", "TOP", "TPE", "TRL", "TRY", "TTD", "TWD", "TZS", "UAH", "UAK", "UGS", "UGX", "USD", "USN", "USS", "UYI", "UYP", "UYU", "UZS", "VEB", "VEF", "VND", "VNN", "VUV", "WST", "XAF", "XAG", "XAU", "XBA", "XBB", "XBC", "XBD", "XCD", "XDR", "XEU", "XFO", "XFU", "XOF", "XPD", "XPF", 
"XPT", "XRE", "XSU", "XTS", "XUA", "XXX", "YDD", "YER", "YUD", "YUM", "YUN", "YUR", "ZAL", "ZAR", "ZMK", "ZMW", "ZRN", "ZRZ", "ZWD", "ZWL", "ZWR"]
The list of currency codes.
irb(main):009:0> 1999.localize.to_short_decimal.to_s => "2K"
irb(main):010:0> 1999.localize.to_long_decimal.to_s => "2 thousand"
Numbers can be abbreviated like this.
irb(main):011:0> 1999.localize(:ja).to_long_decimal.to_s => "2千"
irb(main):012:0> 1999.localize(:ja).to_short_decimal.to_s => "2千"
In Japanese, to_short_decimal and to_long_decimal give the same result.
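The outputs above are consistent with dividing by 1,000, rounding to the nearest integer, and appending a locale-specific suffix. The sketch below reproduces only that observation. `compact_thousands` and its suffix table are hypothetical, the rounding mode is my assumption, and it says nothing about how the gem handles other magnitudes.

```ruby
# Suffixes taken from the irb output in this post; everything else is assumed.
THOUSAND_SUFFIX = { en: 'K', ja: '千' }.freeze

# Hypothetical helper for values in the thousands only.
def compact_thousands(number, locale = :en)
  unless (1_000..999_499).cover?(number)
    raise ArgumentError, "only values from 1,000 to 999,499 are handled"
  end
  "#{(number / 1000.0).round}#{THOUSAND_SUFFIX.fetch(locale)}"
end

puts compact_thousands(1999)
puts compact_thousands(1999, :ja)
```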
irb(main):013:0> 1999.localize(:ja).spellout => "千九百九十九"
Spelling the number out in words.
irb(main):014:0> 1999.localize(:ja).rbnf.group_names => ["SpelloutRules", "OrdinalRules"]
The list of rule groups.
irb(main):015:0> 1999.localize(:ja).to_rbnf_s('SpelloutRules', 'spellout-ordinal') => "第千九百九十九"
Ordinal numbers.
irb(main):016:0> 1999.localize(:ja).to_rbnf_s('OrdinalRules', 'digits-ordinal') => "第1999"
The ordinal form that keeps the digits as they are.
irb(main):017:0> require 'time' => false
irb(main):018:0> DateTime.now.localize(:ja).to_full_s => "2014年6月4日水曜日 5時13分00秒 UTC +00:00"
Dates. require 'time' returned false, so time had apparently already been loaded, presumably as a dependency of twitter_cldr.
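`require` returns true only when it actually loads a file, and false when the feature is already listed in `$LOADED_FEATURES`. The following standalone sketch shows that behavior with the standard library alone, which is why a false return value is not an error.

```ruby
# The first call may return true or false depending on what is already loaded.
first  = require 'time'
# The second call returns false because the feature is already loaded.
second = require 'time'

puts "first require:  #{first}"
puts "second require: #{second}"

# Either way, the methods added by the library are available afterwards.
puts Time.respond_to?(:parse)
```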
irb(main):019:0> (DateTime.now + 0.5).localize(:ja).until.to_s => "12時間後"
Relative time.
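Adding 0.5 to a DateTime advances it by half a day, which is where the 12 hours come from. Here is a small standard-library sketch of that arithmetic, using a fixed timestamp instead of DateTime.now so that the result is reproducible. The string built at the end is only an illustration and not the gem's code.

```ruby
require 'date'

now   = DateTime.new(2014, 6, 4, 5, 13, 0)
later = now + 0.5              # DateTime arithmetic is expressed in days

# Subtracting two DateTime objects yields a Rational number of days.
hours = ((later - now) * 24).to_i

puts "#{hours}時間後"
```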
irb(main):020:0> ['山', '川', '海'].localize(:ja).to_sentence => "山、川、海"
Lists. The result is not "山、川及び海"; no final "及び" ("and") is inserted.
irb(main):021:0> TwitterCldr::Formatters::Plurals::Rules.all_for(:ja) => [:other]
Plural rules. Japanese has only the :other category.
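Japanese has a single plural category, whereas English distinguishes :one from :other. Below is a minimal sketch of what such rules amount to for integers, assuming a hypothetical `plural_category` helper that covers only these two locales. The real CLDR rules also take fractional digits into account.

```ruby
# Hypothetical helper covering integers in :en and :ja only.
def plural_category(locale, count)
  case locale
  when :ja then :other                       # no singular/plural distinction
  when :en then count == 1 ? :one : :other
  else raise ArgumentError, "unsupported locale: #{locale}"
  end
end

[0, 1, 2].each do |n|
  puts "#{n}: en=#{plural_category(:en, n)} ja=#{plural_category(:ja, n)}"
end
```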
irb(main):022:0> :ja.localize(:ja).as_language_code => "日本語"
irb(main):023:0> :ja.localize(:de).as_language_code => "Japanisch"
Language names.
irb(main):024:0> TwitterCldr::Shared::PostalCodes.for_territory(:de).valid?(30159)
TypeError: no implicit conversion of Fixnum into String
        from /usr/local/Cellar/ruby/2.1.1_1/lib/ruby/gems/2.1.0/gems/twitter_cldr-3.0.3/lib/twitter_cldr/shared/postal_codes.rb:55:in `=~'
        from /usr/local/Cellar/ruby/2.1.1_1/lib/ruby/gems/2.1.0/gems/twitter_cldr-3.0.3/lib/twitter_cldr/shared/postal_codes.rb:55:in `valid?'
        from (irb):24
        from /usr/local/bin/irb:11:in `<main>'
irb(main):025:0> TwitterCldr::Shared::PostalCodes.for_territory(:de).valid?("30159") => true
Postal codes. The argument to valid? must be a string, not a number.
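The backtrace points at `=~`, and Ruby's Regexp#=~ does raise a TypeError when it is given an Integer. The sketch below reproduces that with a plain regular expression and shows one way to guard against it. The five-digit pattern and the `valid_postal_code?` helper are simplifications of mine, not the gem's actual rule for Germany.

```ruby
# Simplified stand-in for a German postal code rule (assumption: 5 digits).
DE_POSTAL_CODE = /\A\d{5}\z/

begin
  DE_POSTAL_CODE =~ 30159           # Integer on the right-hand side
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# Hypothetical helper: accept only strings, so a number yields false
# instead of an exception.
def valid_postal_code?(value, pattern = DE_POSTAL_CODE)
  value.is_a?(String) && !(pattern =~ value).nil?
end

p valid_postal_code?('30159')
p valid_postal_code?(30159)
```

Keeping postal codes as strings also preserves leading zeros, which an integer cannot represent.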
irb(main):026:0> TwitterCldr::Shared::PhoneCodes.territories.include?(:ja) => false
irb(main):027:0> TwitterCldr::Shared::PhoneCodes.territories => [:ac, :ad, :ae, :af, :ag, :ai, :al, :am, :an, :ao, :aq, :ar, :as, :at, :au, :aw, :ax, :az, :ba, :bb, :bd, :be, :bf, :bg, :bh, :bi, :bj, :bl, :bm, :bn, :bo, :br, :bs, :bt, :bw, :by, :bz, :ca, :cc, :cd, :cf, :cg, :ch, :ci, :ck, :cl, :cm, :cn, :co, :cr, :cu, :cv, :cx, :cy, :cz, :de, :dj, :dk, :dm, :do, :dz, :ec, :ee, :eg, :er, :es, :et, :fi, :fj, :fk, :fm, :fo, :fr, :ga, :gb, :gd, :ge, :gf, :gg, :gh, :gi, :gl, :gm, :gn, :gp, :gq, :gr, :gt, :gu, :gw, :gy, :hk, :hn, :hr, :ht, :hu, :id, :ie, :il, :im, :in, :io, :iq, :ir, :is, :it, :je, :jm, :jo, :jp, :ke, :kg, :kh, :ki, :km, :kn, :kp, :kr, :kw, :ky, :kz, :la, :lb, :lc, :li, :lk, :lr, :ls, :lt, :lu, :lv, :ly, :ma, :mc, :md, :me, :mg, :mh, :mk, :ml, :mm, :mn, :mo, :mp, :mq, :mr, :ms, :mt, :mu, :mv, :mw, :mx, :my, :mz, :na, :nc, :ne, :nf, :ng, :ni, :nl, :no, :np, :nr, :nu, :nz, :om, :pa, :pe, :pf, :pg, :ph, :pk, :pl, :pm, :pr, :ps, :pt, :pw, :py, :qa, :re, :ro, :rs, :ru, :rw, :sa, :sb, :sc, :sd, :se, :sg, :sh, :si, :sj, :sk, :sl, :sm, :sn, :so, :sr, :ss, :st, :sv, :sy, :sz, :tc, :td, :tf, :tg, :th, :tj, :tk, :tl, :tm, :tn, :to, :tr, :tt, :tv, :tw, :tz, :ua, :ug, :us, :uy, :uz, :va, :vc, :ve, :vg, :vi, :vn, :vu, :wf, :ws, :ye, :yt, :za, :zm, :zw]
irb(main):028:0> TwitterCldr::Shared::PhoneCodes.territories.include?(:jp) => true
Phone codes. A phone code belongs to a country rather than a language, so I had to use the country code (:jp) instead of the language code (:ja).
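The distinction matters because :ja is a language code while :jp is a territory code. Here is a tiny sketch with a hand-written table. The `CALLING_CODES` hash and `calling_code_for` are hypothetical, and only the two well-known calling codes for Japan (81) and Germany (49) are included.

```ruby
# Keys are territory (country) codes, not language codes.
CALLING_CODES = { jp: '81', de: '49' }.freeze

# Hypothetical lookup that fails loudly on an unknown territory.
def calling_code_for(territory)
  CALLING_CODES.fetch(territory) do
    raise ArgumentError,
          "unknown territory #{territory.inspect} (was a language code passed?)"
  end
end

puts calling_code_for(:jp)

begin
  calling_code_for(:ja)
rescue ArgumentError => e
  puts e.message
end
```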
irb(main):029:0> TwitterCldr::Utils::CodePoints.from_string("😁") => [128513]
Unicode code points.
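Plain Ruby can produce the same number: String#codepoints returns the code points of a string, and Array#pack with "U*" goes the other way. In this sketch the emoji is written as an escape sequence so that the snippet does not depend on the encoding of the source file.

```ruby
face = "\u{1F601}"   # the same emoji as in the irb session

code_points = face.codepoints
puts code_points.inspect
puts format('U+%04X', code_points.first)

# Round trip from code points back to a string.
puts code_points.pack('U*') == face
```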
irb(main):030:0> ['いしば', 'いしはら'].sort => ["いしはら", "いしば"]
irb(main):031:0> ['いしば', 'いしはら'].localize(:ja).sort => #<TwitterCldr::Localized::LocalizedArray:0x007f9247280ec0 @base_obj=["いしば", "いしはら"], @locale=:en>
irb(main):032:0> ['いしば', 'いしはら'].localize(:ja).sort.to_a => ["いしば", "いしはら"]
Sorting. See also the post いしばいしはら課題 - Relevant, Timely, and Accurate.
I was short on time, so this turned out rather terse, but that is all for now.
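The plain sort compares code points, and は (U+306F) comes before ば (U+3070), so いしはら sorts ahead of いしば. As a rough illustration of what a collation-aware sort does differently, the sketch below strips the voicing marks to build a primary key and uses the original string only as a tie-breaker. This is a naive approximation of mine rather than the Unicode Collation Algorithm; `kana_primary_key` is a hypothetical helper, and String#unicode_normalize needs Ruby 2.2 or later.

```ruby
names = ["いしば", "いしはら"]

# Code point order: "は" (U+306F) < "ば" (U+3070).
p names.sort

# Hypothetical helper: decompose (NFD) and drop the combining
# dakuten (U+3099) and handakuten (U+309A).
def kana_primary_key(text)
  text.unicode_normalize(:nfd).delete("\u3099\u309A")
end

# Compare by the unvoiced form first, then by the original string.
p names.sort_by { |name| [kana_primary_key(name), name] }
```

With this key, いしば is compared as いしは, which is a prefix of いしはら and therefore sorts first.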