Class: Unisec::Normalization

Inherits:
Object
Defined in:
lib/unisec/normalization.rb

Overview

Normalization Forms

Constant Summary

HTML_ESCAPE_BYPASS =

HTML-escapable characters mapped to Unicode counterparts that normalize back to the original character under the compatibility normalization forms (NFKC and NFKD).

{
  '<' => ['﹤', '＜'],
  '>' => ['﹥', '＞'],
  '"' => ['＂'],
  "'" => ['＇'],
  '&' => ['﹠', '＆']
}.freeze
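
As a quick check of the property described above, the sketch below uses Ruby's built-in String#unicode_normalize. The code points in it (small and fullwidth variants such as U+FE64 and U+FF1C) are assumptions chosen for illustration, not necessarily the exact values stored in HTML_ESCAPE_BYPASS.

```ruby
# Illustrative table (an assumption, not copied from the library):
# ASCII character => compatibility variants written as \u escapes.
CANDIDATES = {
  '<' => ["\uFE64", "\uFF1C"], # SMALL / FULLWIDTH LESS-THAN SIGN
  '>' => ["\uFE65", "\uFF1E"], # SMALL / FULLWIDTH GREATER-THAN SIGN
  '"' => ["\uFF02"],           # FULLWIDTH QUOTATION MARK
  "'" => ["\uFF07"],           # FULLWIDTH APOSTROPHE
  '&' => ["\uFE60", "\uFF06"]  # SMALL / FULLWIDTH AMPERSAND
}.freeze

CANDIDATES.each do |ascii, variants|
  variants.each do |variant|
    # Compatibility forms fold the variant back to the ASCII character...
    compat = %i[nfkc nfkd].all? { |form| variant.unicode_normalize(form) == ascii }
    # ...while canonical forms leave it untouched.
    canonical = %i[nfc nfd].all? { |form| variant.unicode_normalize(form) == variant }
    puts format('U+%04X -> %s  compat=%s canonical_unchanged=%s',
                variant.ord, ascii, compat, canonical)
  end
end
```

Only NFKC and NFKD fold these variants, which is why the bypass depends on the target applying a compatibility form rather than NFC or NFD.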

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(str) ⇒ nil

Generate all normalization forms for a given input.

Parameters:

  • str (String)

    the target string



# File 'lib/unisec/normalization.rb', line 41

def initialize(str)
  @original = str
  @nfc = Normalization.nfc(str)
  @nfkc = Normalization.nfkc(str)
  @nfd = Normalization.nfd(str)
  @nfkd = Normalization.nfkd(str)
end

Instance Attribute Details

#nfc ⇒ String (readonly)

Normalization Form C (NFC) - Canonical Decomposition, followed by Canonical Composition

Returns:

  • (String)

    input normalized with NFC



# File 'lib/unisec/normalization.rb', line 24

def nfc
  @nfc
end

#nfd ⇒ String (readonly)

Normalization Form D (NFD) - Canonical Decomposition

Returns:

  • (String)

    input normalized with NFD



# File 'lib/unisec/normalization.rb', line 32

def nfd
  @nfd
end

#nfkc ⇒ String (readonly)

Normalization Form KC (NFKC) - Compatibility Decomposition, followed by Canonical Composition

Returns:

  • (String)

    input normalized with NFKC



# File 'lib/unisec/normalization.rb', line 28

def nfkc
  @nfkc
end

#nfkd ⇒ String (readonly)

Normalization Form KD (NFKD) - Compatibility Decomposition

Returns:

  • (String)

    input normalized with NFKD



# File 'lib/unisec/normalization.rb', line 36

def nfkd
  @nfkd
end

#original ⇒ String (readonly)

Original input

Returns:

  • (String)

    untouched input



# File 'lib/unisec/normalization.rb', line 20

def original
  @original
end

Class Method Details

.nfc(str) ⇒ String

Normalization Form C (NFC) - Canonical Decomposition, followed by Canonical Composition

Parameters:

  • str (String)

    the target string

Returns:

  • (String)

    input normalized with NFC



# File 'lib/unisec/normalization.rb', line 52

def self.nfc(str)
  str.unicode_normalize(:nfc)
end

.nfd(str) ⇒ String

Normalization Form D (NFD) - Canonical Decomposition

Parameters:

  • str (String)

    the target string

Returns:

  • (String)

    input normalized with NFD



# File 'lib/unisec/normalization.rb', line 66

def self.nfd(str)
  str.unicode_normalize(:nfd)
end

.nfkc(str) ⇒ String

Normalization Form KC (NFKC) - Compatibility Decomposition, followed by Canonical Composition

Parameters:

  • str (String)

    the target string

Returns:

  • (String)

    input normalized with NFKC



# File 'lib/unisec/normalization.rb', line 59

def self.nfkc(str)
  str.unicode_normalize(:nfkc)
end

.nfkd(str) ⇒ String

Normalization Form KD (NFKD) - Compatibility Decomposition

Parameters:

  • str (String)

    the target string

Returns:

  • (String)

    input normalized with NFKD



# File 'lib/unisec/normalization.rb', line 73

def self.nfkd(str)
  str.unicode_normalize(:nfkd)
end
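
To compare the four class methods side by side, here is a minimal sketch that calls String#unicode_normalize directly (the call each method wraps) on the sample string from the #display example. The codepoints helper is hypothetical and only formats output for this example.

```ruby
# Hypothetical helper: render a string as space-separated U+XXXX code points.
def codepoints(str)
  str.each_char.map { |c| format('U+%04X', c.ord) }.join(' ')
end

# LATIN SMALL LETTER LONG S WITH DOT ABOVE + COMBINING DOT BELOW
sample = "\u{1E9B 0323}"

# Map each form name to the normalized string.
forms = %i[nfc nfkc nfd nfkd].to_h do |form|
  [form, sample.unicode_normalize(form)]
end

forms.each { |form, value| puts "#{form.to_s.upcase}: #{codepoints(value)}" }
# NFC: U+1E9B U+0323
# NFKC: U+1E69
# NFD: U+017F U+0323 U+0307
# NFKD: U+0073 U+0323 U+0307
```

The canonical forms (NFC, NFD) keep the long s (U+017F), while the compatibility forms (NFKC, NFKD) replace it with a plain s (U+0073).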

.replace_bypass(str) ⇒ String

Replace HTML-escapable characters with Unicode counterparts that normalize back to the original characters under the compatibility forms (NFKC and NFKD). Useful for XSS, to bypass HTML escaping that runs before normalization. If several counterparts exist for a character, one is picked at random per call and used for every occurrence of that character.

Parameters:

  • str (String)

    the target string

Returns:

  • (String)

    input with HTML-escapable characters replaced by their Unicode counterparts



# File 'lib/unisec/normalization.rb', line 83

def self.replace_bypass(str)
  str = str.dup
  HTML_ESCAPE_BYPASS.each do |k, v|
    str.gsub!(k, v.sample)
  end
  str
end
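
The bypass scenario can be sketched as follows, assuming an application that HTML-escapes user input first and applies NFKC afterwards. ERB::Util.html_escape stands in for the application's escaping routine, and the payload uses fullwidth signs (U+FF1C, U+FF1E) picked for illustration.

```ruby
require 'erb'

# Hypothetical payload: "<b>hi</b>" written with fullwidth angle brackets.
payload = "\uFF1Cb\uFF1Ehi\uFF1C/b\uFF1E"

# Step 1: the escaper only targets the ASCII characters, so nothing changes.
escaped = ERB::Util.html_escape(payload)

# Step 2: a later compatibility normalization restores the real brackets.
normalized = escaped.unicode_normalize(:nfkc)

puts escaped == payload # true
puts normalized         # <b>hi</b>
```

Normalizing before escaping, rather than after, closes this gap in the sketched scenario.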

Instance Method Details

#display ⇒ String

Display a CLI-friendly output summarizing all normalization forms.

Examples:

puts Unisec::Normalization.new("\u{1E9B 0323}").display
# =>
# Original: ẛ̣
#   U+1E9B U+0323
# NFC: ẛ̣
#   U+1E9B U+0323
# NFKC: ṩ
#   U+1E69
# NFD: ẛ̣
#   U+017F U+0323 U+0307
# NFKD: ṩ
#   U+0073 U+0323 U+0307

Returns:

  • (String)

    CLI-ready output



# File 'lib/unisec/normalization.rb', line 111

def display
  colorize = lambda { |form_title, form_attr|
    "#{Paint[form_title.to_s, :underline,
             :bold]}: #{form_attr}\n  #{Paint[Unisec::Properties.chars2codepoints(form_attr), :red]}\n"
  }
  colorize.call('Original', @original) +
    colorize.call('NFC', @nfc) +
    colorize.call('NFKC', @nfkc) +
    colorize.call('NFD', @nfd) +
    colorize.call('NFKD', @nfkd)
end

#display_replace ⇒ Object

Display a CLI-friendly output of the XSS payload that bypasses HTML escaping, along with what the payload becomes once normalized with NFKC and NFKD.



# File 'lib/unisec/normalization.rb', line 125

def display_replace
  colorize = lambda { |form_title, form_attr|
    "#{Paint[form_title.to_s, :underline,
             :bold]}: #{form_attr}\n  #{Paint[Unisec::Properties.chars2codepoints(form_attr), :red]}\n"
  }
  payload = replace_bypass
  colorize.call('Original', @original) +
    colorize.call('Bypass payload', payload) +
    colorize.call('NFKC', Normalization.nfkc(payload)) +
    colorize.call('NFKD', Normalization.nfkd(payload))
end
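
For readers without the library at hand, the report built by display_replace can be approximated in plain Ruby. This is a sketch, not the library's code: it omits the Paint coloring, the SUBSTITUTES table holds one assumed fullwidth variant per character instead of a random pick, and report_line is a hypothetical helper.

```ruby
# Assumed substitution table (one fullwidth variant per character).
SUBSTITUTES = { '<' => "\uFF1C", '>' => "\uFF1E", '"' => "\uFF02",
                "'" => "\uFF07", '&' => "\uFF06" }.freeze

# Hypothetical helper: a title, the value, then its code points on the next line.
def report_line(title, value)
  points = value.each_char.map { |c| format('U+%04X', c.ord) }.join(' ')
  "#{title}: #{value}\n  #{points}\n"
end

original = '<a>'
# String#gsub with a Hash replaces each match with the value stored under it.
bypass = original.gsub(/[<>"'&]/, SUBSTITUTES)

report = report_line('Original', original) +
         report_line('Bypass payload', bypass) +
         report_line('NFKC', bypass.unicode_normalize(:nfkc)) +
         report_line('NFKD', bypass.unicode_normalize(:nfkd))
puts report
```

In this sketch the NFKC and NFKD entries show the same code points as the original, which is the round trip the method is meant to make visible.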

#replace_bypass ⇒ Object

Instance version of replace_bypass.



# File 'lib/unisec/normalization.rb', line 92

def replace_bypass
  Normalization.replace_bypass(@original)
end