Class: Unisec::Surrogates
- Inherits: Object
- Defined in: lib/unisec/surrogates.rb
Overview
UTF-16 surrogates conversion.
Instance Attribute Summary
- #cp ⇒ Integer (readonly): Unicode code point.
- #hs ⇒ Integer (readonly): High surrogate (1st code unit of a surrogate pair).
- #ls ⇒ Integer (readonly): Low surrogate (2nd code unit of a surrogate pair).
Class Method Summary
- .code_point(hs, ls) ⇒ Integer: Calculate the Unicode code point based on the surrogates.
- .high_surrogate(codepoint) ⇒ Integer: Calculate the high surrogate based on the Unicode code point.
- .low_surrogate(codepoint) ⇒ Integer: Calculate the low surrogate based on the Unicode code point.
Instance Method Summary
- #code_point ⇒ Integer: Same as accessing #cp.
- #display ⇒ String: Display a CLI-friendly output summarizing everything about the surrogates: the corresponding character, code point, high and low surrogates (each displayed as hexadecimal, decimal and binary).
- #high_surrogate ⇒ Integer: Same as accessing #hs.
- #initialize(*args) ⇒ Surrogates (constructor): Initialize the surrogate pair.
- #low_surrogate ⇒ Integer: Same as accessing #ls.
Constructor Details
#initialize(*args) ⇒ Surrogates
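Initialize the surrogate pair. With one argument, the value is taken as the code point and both surrogates are derived from it. With two arguments, the values are taken as the high and low surrogates (in that order) and the code point is derived from them. Any other argument count raises ArgumentError.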
    # File 'lib/unisec/surrogates.rb', line 34
    def initialize(*args)
      if args.size == 1
        @cp = args[0]
        @hs = high_surrogate
        @ls = low_surrogate
      elsif args.size == 2
        @hs = args[0]
        @ls = args[1]
        @cp = code_point
      else
        raise ArgumentError
      end
    end
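The constructor branches on the number of arguments. The following is only a self-contained sketch of that behavior: the class name `PairSketch` is hypothetical and is not part of Unisec, and the arithmetic is copied from the class methods documented on this page so the snippet runs without the gem.

```ruby
# Hypothetical stand-in for illustration only; this is not the Unisec class.
class PairSketch
  attr_reader :cp, :hs, :ls

  def initialize(*args)
    case args.size
    when 1 # a single code point: derive both surrogates
      @cp = args[0]
      @hs = ((@cp - 0x10000) / 0x400) + 0xd800
      @ls = ((@cp - 0x10000) % 0x400) + 0xdc00
    when 2 # a high and a low surrogate: derive the code point
      @hs, @ls = args
      @cp = ((@hs - 0xd800) * 0x400) + @ls - 0xdc00 + 0x10000
    else
      raise ArgumentError
    end
  end
end

from_cp = PairSketch.new(0x1f600)           # one argument: code point
from_pair = PairSketch.new(0xd83d, 0xde00)  # two arguments: high, low
puts format('hs=0x%X ls=0x%X cp=0x%X', from_cp.hs, from_cp.ls, from_pair.cp)
```

Both construction paths should describe the same character, so either form can be used depending on which values are at hand.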
Instance Attribute Details
#cp ⇒ Integer (readonly)
Unicode code point
    # File 'lib/unisec/surrogates.rb', line 11
    def cp
      @cp
    end
#hs ⇒ Integer (readonly)
High surrogate (1st code unit of a surrogate pair). Also called lead surrogate.
    # File 'lib/unisec/surrogates.rb', line 15
    def hs
      @hs
    end
#ls ⇒ Integer (readonly)
Low surrogate (2nd code unit of a surrogate pair). Also called trail surrogate.
    # File 'lib/unisec/surrogates.rb', line 19
    def ls
      @ls
    end
Class Method Details
.code_point(hs, ls) ⇒ Integer
Calculate the Unicode code point based on the surrogates.
    # File 'lib/unisec/surrogates.rb', line 72
    def self.code_point(hs, ls)
      (((hs - 0xd800) * 0x400) + ls - 0xdc00 + 0x10000)
    end
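As a worked example of the formula above, the sketch below decodes the pair 0xD83D 0xDE00 and cross-checks the result against Ruby's built-in UTF-16 decoder. The helper name `surrogates_to_code_point` is hypothetical, chosen so the snippet runs without the gem.

```ruby
# Hypothetical helper mirroring the formula above; not part of Unisec.
def surrogates_to_code_point(hs, ls)
  ((hs - 0xd800) * 0x400) + (ls - 0xdc00) + 0x10000
end

# Worked example with the pair 0xD83D 0xDE00:
#   (0xD83D - 0xD800) * 0x400 = 0x3D * 0x400 = 0xF400
#   0xDE00 - 0xDC00           = 0x200
#   0xF400 + 0x200 + 0x10000  = 0x1F600
cp = surrogates_to_code_point(0xd83d, 0xde00)

# Cross-check: pack the two code units as big-endian 16-bit values and
# let Ruby decode them as UTF-16BE.
decoded = [0xd83d, 0xde00].pack('n*').force_encoding('UTF-16BE').encode('UTF-8').ord
puts format('U+%X (decoder agrees: %s)', cp, cp == decoded)
```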
.high_surrogate(codepoint) ⇒ Integer
Calculate the high surrogate based on the Unicode code point.
    # File 'lib/unisec/surrogates.rb', line 53
    def self.high_surrogate(codepoint)
      (((codepoint - 0x10000) / 0x400).floor + 0xd800)
    end
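A small sketch of the same calculation, next to an equivalent bit-shift form, may help show what the division does. Both helper names are hypothetical. Ruby's Integer division already rounds down, so the sketch leaves out the `.floor` call. The listing above has no range check, so the sketch assumes a supplementary code point (0x10000 to 0x10FFFF) as input.

```ruby
# Hypothetical helpers for illustration; not part of Unisec.
# Assumes codepoint is in 0x10000..0x10FFFF (no validation is done).
def high_surrogate_sketch(codepoint)
  ((codepoint - 0x10000) / 0x400) + 0xd800
end

# Dividing by 0x400 is the same as dropping the low 10 bits.
def high_surrogate_shift(codepoint)
  ((codepoint - 0x10000) >> 10) + 0xd800
end

# Worked example for U+1F600:
#   0x1F600 - 0x10000 = 0xF600
#   0xF600 / 0x400    = 0x3D (remainder discarded)
#   0x3D + 0xD800     = 0xD83D
puts format('0x%X', high_surrogate_sketch(0x1f600))
```

For inputs in the assumed range, the result always falls in 0xD800 to 0xDBFF.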
.low_surrogate(codepoint) ⇒ Integer
Calculate the low surrogate based on the Unicode code point.
    # File 'lib/unisec/surrogates.rb', line 62
    def self.low_surrogate(codepoint)
      (((codepoint - 0x10000) % 0x400) + 0xdc00)
    end
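To illustrate the modulo step, the sketch below recomputes the low surrogate for U+1F600 and compares it with the code units Ruby produces when encoding the same character as UTF-16BE. The helper names `low_surrogate_sketch` and `utf16_units` are hypothetical and exist only for this example.

```ruby
# Hypothetical helpers for illustration; not part of Unisec.
# Assumes codepoint is in 0x10000..0x10FFFF (no validation is done).
def low_surrogate_sketch(codepoint)
  ((codepoint - 0x10000) % 0x400) + 0xdc00
end

# Encode a code point as UTF-16BE and return its 16-bit code units.
def utf16_units(codepoint)
  [codepoint].pack('U').encode('UTF-16BE').unpack('n*')
end

# Worked example for U+1F600:
#   0x1F600 - 0x10000 = 0xF600
#   0xF600 % 0x400    = 0x200 (the low 10 bits)
#   0x200 + 0xDC00    = 0xDE00
units = utf16_units(0x1f600)
puts format('low=0x%X encoder=0x%X', low_surrogate_sketch(0x1f600), units.last)
```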
Instance Method Details
#code_point ⇒ Integer
Same as accessing #cp, except that the code point is first recalculated from the stored surrogates (#hs and #ls) and saved back to #cp.
    # File 'lib/unisec/surrogates.rb', line 98
    def code_point
      @cp = Surrogates.code_point(@hs, @ls)
    end
#display ⇒ String
Display a CLI-friendly output summarizing everything about the surrogates: the corresponding character, code point, high and low surrogates (each displayed as hexadecimal, decimal and binary).
    # File 'lib/unisec/surrogates.rb', line 113
    def display
      "Char: #{[@cp].pack('U*')}\n" \
        "Code Point: 0x#{@cp.to_hex}, 0d#{@cp}, 0b#{@cp.to_bin}\n" \
        "High Surrogate: 0x#{@hs.to_hex}, 0d#{@hs}, 0b#{@hs.to_bin}\n" \
        "Low Surrogate: 0x#{@ls.to_hex}, 0d#{@ls}, 0b#{@ls.to_bin}"
    end
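The listing calls `to_hex` and `to_bin` on integers, which are not core Ruby methods and are presumably provided elsewhere in the project. As a rough approximation that runs on its own, the sketch below builds a similar report with `Integer#to_s(16)` and `Integer#to_s(2)`. The function name `display_sketch` is hypothetical, and the exact digit case and spacing of the real output may differ.

```ruby
# Hypothetical approximation of the report format; not the Unisec method.
# Integer#to_s(16) and Integer#to_s(2) stand in for to_hex and to_bin.
def display_sketch(cp, hs, ls)
  row = ->(n) { "0x#{n.to_s(16)}, 0d#{n}, 0b#{n.to_s(2)}" }
  "Char: #{[cp].pack('U*')}\n" \
    "Code Point: #{row.call(cp)}\n" \
    "High Surrogate: #{row.call(hs)}\n" \
    "Low Surrogate: #{row.call(ls)}"
end

puts display_sketch(0x1f600, 0xd83d, 0xde00)
```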
#high_surrogate ⇒ Integer
Same as accessing #hs, except that the high surrogate is first recalculated from the stored code point (#cp) and saved back to #hs.
    # File 'lib/unisec/surrogates.rb', line 81
    def high_surrogate
      @hs = Surrogates.high_surrogate(@cp)
    end
#low_surrogate ⇒ Integer
Same as accessing #ls, except that the low surrogate is first recalculated from the stored code point (#cp) and saved back to #ls.
    # File 'lib/unisec/surrogates.rb', line 90
    def low_surrogate
      @ls = Surrogates.low_surrogate(@cp)
    end