Represent astral Unicode characters in Javascript, HTML or Ruby


Here is the symbol for an eighth note: ♪

Its two-byte (UTF-16) hex representation is 0x266A.

This card describes how to create a string with this symbol in various languages.

All languages

Since our tool chain (editors, languages, databases, browsers) is UTF-8 aware (or at least doesn't mangle bytes), you can usually get away with just pasting the symbol verbatim:

note = '♪'

This is great for shapes that are easily recognized by your fellow programmers.
It's not so great for shapes that are hard to tell apart, like non-breaking spaces or hyphens of various lengths. For those, use the escape sequences described below.
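To illustrate why look-alike characters are risky to paste verbatim, here is a minimal sketch that dumps the code points of a string. It assumes Ruby 1.9 or newer, and `code_points_hex` is a hypothetical helper name, not part of any library.

```ruby
# Two strings that render almost identically but hold different code points.
regular_space = " "
non_breaking  = "\u00A0"

# Hypothetical helper: list the code points of a string in U+XXXX notation.
def code_points_hex(string)
  string.unpack('U*').map { |code_point| format('U+%04X', code_point) }
end

code_points_hex(regular_space) # => ["U+0020"]
code_points_hex(non_breaking)  # => ["U+00A0"]
```

With an escape sequence in the source, a reader can see which of the two characters was intended without inspecting the bytes.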

Javascript, Coffeescript

Use the \u escape sequence:

note = '\u266A'

You can use Unicode escape sequences in both single and double quotes.

HTML

Use an entity:

&#x266A;
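If you need to build such an entity from an existing character, the hex digits are simply the character's code point. A minimal sketch in Ruby (1.9 or newer assumed); `to_html_entities` is a hypothetical helper name chosen for illustration:

```ruby
# Hypothetical helper: turn every character of a string into a
# hexadecimal numeric character reference such as &#x266A;
def to_html_entities(string)
  string.each_char.map { |char| format('&#x%X;', char.ord) }.join
end

to_html_entities("\u266A") # => "&#x266A;"
```

This encodes every character, including plain ASCII, so in practice you would probably apply it only to the characters you want to make explicit.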

Ruby 1.9+

In modern Rubies you can use the \u escape sequence:

note = "\u266A"

Note that you must use double quotes. Unicode escape sequences are not parsed in a string with single quotes.
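The difference between the two quote styles can be checked directly. The sketch below also uses the curly-brace form `\u{...}`, which Ruby 1.9+ accepts for code points beyond U+FFFF; the variable names are only for illustration.

```ruby
double_quoted = "\u266A"   # escape is parsed: one character
single_quoted = '\u266A'   # escape is not parsed: backslash, "u", and four hex digits

double_quoted.length        # => 1
single_quoted.length        # => 6
double_quoted.unpack('U*')  # => [9834], which is 0x266A

# Code points above U+FFFF need the curly-brace form.
clef = "\u{1D11E}"
clef.length                 # => 1
clef.unpack('U*')           # => [119070], which is 0x1D11E
```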

Ruby 1.8.7

There are no \u escape sequences in Ruby 1.8.7.

You can either paste the '♪' symbol verbatim, or create a helper method that builds a string from a UTF-16 hex sequence:

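class String
  # Builds a string from groups of four hex digits, e.g. '266a'.
  def self.from_utf16_hex(sequence)
    parts = sequence.scan(/..../)               # split into 4-digit groups
    parts = parts.map { |part| part.to_i(16) }  # parse each group as a hex number
    parts.pack('U*')                            # encode each number as a UTF-8 character
  end
end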

This lets you say:

String.from_utf16_hex('266a') # => '♪'
String.from_utf16_hex('266a266a') # => '♪♪'

This also works in modern Ruby versions.
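The helper above treats every four-digit group as a code point of its own, so a UTF-16 surrogate pair (used for characters beyond U+FFFF) would not be combined into one character. A minimal sketch of a variant that lets Ruby decode the sequence as UTF-16 instead; it assumes Ruby 1.9+ (it relies on `force_encoding`), and `string_from_utf16_hex` is a hypothetical name:

```ruby
# Hypothetical variant: interpret the hex digits as big-endian UTF-16 bytes
# and transcode them to UTF-8, so surrogate pairs become a single character.
def string_from_utf16_hex(sequence)
  [sequence].pack('H*').force_encoding('UTF-16BE').encode('UTF-8')
end

string_from_utf16_hex('266a')      # => "♪"
string_from_utf16_hex('266a266a')  # => "♪♪"
string_from_utf16_hex('d834dd1e')  # => "\u{1D11E}" (musical symbol G clef)
```

Unlike the 1.8.7 helper, this version depends on Ruby's encoding support, so it is not a drop-in replacement for old Rubies.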

Last edit: Dominik Schöler
License: Source code in this card is licensed under the MIT License.
Posted by Henning Koch to makandra dev (2016-07-28 09:11)