Perturb.org - interesting technology related things from around the internet

Showing entries with tag "Unicode".

Found 2 entries

Perl: Several ways to generate Unicode2024-10-21

Once you find a Unicode code point you can put it into a Perl string in several ways:

my $thumbs_up = "";

$thumbs_up = "\x{1F44D}";
$thumbs_up = "\N{U+1F44D}";
$thumbs_up = chr(0x1F44D);
$thumbs_up = pack("U", 0x1F44D);

# Make sure STDOUT is set to accept UTF8
binmode(STDOUT, ":utf8");

print $thumbs_up x 2 . "\n";

Tags:

Perl

Unicode

Total possible glyphs using UTF-82017-08-14

UTF-8 is an encoding method for representing large amount of glyphs. UTF-8 will use one, two, three, or four bytes to encode a given glyph depending on the given code point needed. Wikipedia has a good table that explains how UTF-8 breaks out:

Number of bytes	Code point bits	First code point	Last code point	Byte 1	Byte 2	Byte 3	Byte 4
1	7	U+0000	U+007F	0xxxxxxx
2	11	U+0080	U+07FF	110xxxxx	10xxxxxx
3	16	U+0800	U+FFFF	1110xxxx	10xxxxxx	10xxxxxx
4	21	U+10000	U+10FFFF	11110xxx	10xxxxxx	10xxxxxx	10xxxxxx

There are 1,114,112 (17 x 2^16) total code points available. BableStone reports that 276,337 (approximately 24.8%) code points are in use, which leaves 837,775 still available. That's a lot of room left for emojis.

Tags:

Unicode

UTF