Vim, Unicode, and Digraphs

News

2021-February-16  Used Pandoc to convert the source of this article to AsciiDoc, edited the AsciiDoc results of that conversion, and used Hugo and Asciidoctor to generate an updated version of this article.

2020-November-4  As of today, this article, which was originally published elsewhere, has been on the web for 7 years.🎂

Overview

Unicode, in particular UTF-8 encoding, is now the de facto standard for writing on the web. In this article, I describe how I set up Vim to make it easy to write using Unicode characters that aren’t on my keyboard (for example ☹︎, ☺︎, and ♥︎). In a nutshell, I set up Vim to use:

  1. UTF-8 encoding,

  2. a font that includes glyphs for the Unicode characters that I use,

  3. and 2-character keyboard shortcuts (digraphs) that are easy for me to remember (Vim’s default digraphs are hard for me to remember).

Setting Vim’s Encoding

Whether you use terminal Vim or GUI Vim, put the following in your vimrc:

set encoding=utf-8

If you use terminal Vim, you also need to configure your terminal emulator to use UTF-8 encoding. The way to do that depends on your terminal emulator.
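A few related settings are often placed next to the encoding option. The following is a minimal sketch: only the first setting comes from this article, and the other two are assumptions about a typical setup. `scriptencoding` matters mainly when the vimrc itself contains non-ASCII characters, such as Unicode characters in comments.

```vim
" Encoding Vim uses internally (the setting described above).
set encoding=utf-8

" Assumption: declare that this vimrc is itself UTF-8 text.
" This line belongs after the 'encoding' setting.
scriptencoding utf-8

" Assumption: write newly created files as UTF-8 by default.
setglobal fileencoding=utf-8
```

To check the values Vim is actually using, type `:set encoding? fileencoding?`.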

Setting Vim’s Font

Choosing a font is a huge topic, but for this article I’ll just say that you want a font that:

  • you like,

  • is monospace (fixed width),

  • includes the Unicode characters that you use,

  • is free/gratis or that you are willing to pay for, and

  • is free/libre (if you care about the FLOSS movement).

The monospace fonts that I use are DejaVu Sans Mono, Monaco, and Consolas. DejaVu Sans Mono is free/gratis and free/libre. Monaco is included on Macs and Consolas is included on Windows. Neither Monaco nor Consolas is free/libre.

The way to tell Vim which font to use depends on whether you are using a GUI Vim, such as gVim or MacVim, or terminal Vim. If you use terminal Vim, specify the font in your terminal emulator’s configuration.

If you use a GUI Vim, put something like the following in your vimrc:

if has("gui_running")
   set guifont=DejaVu_Sans_Mono:h16
endif

Replace DejaVu_Sans_Mono:h16[1] with the font and size you want to use. I recommend putting this in an if-block so you can use this vimrc for both terminal Vims and GUI Vims. If you use the same vimrc across multiple devices, you can specify a list of guifonts like this:

if has("gui_running")
   set guifont=DejaVu_Sans_Mono:h16,Monaco:h16,Consolas:h12
endif

Vim will use the first font that’s available on the current device.
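Font-name syntax differs between GUI builds. The name:h16 form is the one used by Windows gVim and MacVim, while GTK builds of gVim expect the size after an escaped space. As a hedged sketch, assuming the same three fonts are installed on the respective systems, a vimrc can branch on the GUI type instead of relying on one list:

```vim
if has('gui_running')
  if has('gui_gtk2') || has('gui_gtk3')
    " GTK builds: spaces are escaped and the size follows the name.
    set guifont=DejaVu\ Sans\ Mono\ 16
  elseif has('gui_macvim')
    set guifont=Monaco:h16
  elseif has('gui_win32')
    set guifont=Consolas:h12
  endif
endif
```

In a running GUI Vim, `:set guifont?` shows the font currently in effect.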

Entering Unicode Characters in Vim

Vim has built-in digraphs: standard 2-character keyboard shortcuts for entering characters that aren’t on your keyboard. The digraphs themselves are specified in RFC 1345: Character Mnemonics and Character Sets. To learn about Vim’s digraphs, type any of the following in Vim in command-mode:

:help digraph.txt
:help digraph-table
:help digraph-table-mbyte
:help dig
:dig

You can also read the above built-in Vim Help files on the web at Vim documentation: digraph.

Here’s an excerpt from :help digraph-table-mbyte:

char  digraph   hex     dec     official name
☺︎       0u      263A    9786    WHITE SMILING FACE
♡       cH      2661    9825    WHITE HEART SUIT

To insert a character using its digraph, do the following:

  1. Move the cursor to the place where you want to insert the character

  2. Enter insert-mode

  3. Press Ctrl-k (Ctrl-K works too)

  4. Type the 2-character sequence that makes up the digraph

For example, to enter a White Smiling Face (☺︎), type the following in insert-mode:

Ctrl-k 0u

To enter a White Heart Suit (♡), type the following in insert-mode:

Ctrl-k cH

Specifying Your Own Digraphs

The default digraphs are hard to remember, and almost every time I use one I need to look it up by typing:

:help digraph-table
/string

where string is part of the name of the character that I’m searching for. For example, to find the above characters (☺ and ♡), I search the digraph-table using /face and /heart. To make my life easier, I’ve set my own personal easy-to-remember digraphs in my vimrc. Here are some of my vimrc dig settings:

dig  :(  9785
dig  :)  9786
dig  <3  9829

The syntax, which is described in :help dig, is:

:dig[raphs] {char1}{char2} {number} ...
    Add digraph {char1}{char2} to the list. {number} is
    the decimal representation of the character.

To make it easy to remember what Unicode characters I’ve set up, I include comments in my vimrc like this:

dig  :(  9785    " ☹  White Frowning Face
dig  :)  9786    " ☺  White Smiling Face
dig  <3  9829    " ♥  Black Heart Suit

In a vimrc, text after a double quote mark (") is a comment.
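Two variations on these settings may be handy. This is a sketch rather than part of my actual vimrc: it assumes 'encoding' is already utf-8, and it uses `:execute` so that Vim's expression evaluator supplies the decimal number that `:dig` requires.

```vim
scriptencoding utf-8

" Several digraphs can be defined by a single command.
digraphs :( 9785 :) 9786

" :digraphs reads its number as decimal, so let Vim convert
" a hexadecimal codepoint (here U+2665) before the command runs.
execute 'digraphs <3 ' . 0x2665

" Or take the number from the character itself.
execute 'digraphs <3 ' . char2nr('♥')
```

The last two lines define the same digraph; either one alone is enough.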

With these settings, I can type the following in insert mode to insert these Unicode characters:

Ctrl-k :(
Ctrl-k :)
Ctrl-k <3

If I forget the digraphs that I’ve set up in my vimrc, I can list all digraphs (default and personal) by typing the following in command-mode:

:dig

At the end of the list are the digraphs I’ve added:

:( ☹  9785    :) ☺  9786    <3 ♥  9829

To quickly get to the end of the list of digraphs, type:

:dig
G

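Because the list is longer than one screen, Vim shows it in its pager (the -- More -- prompt), where G jumps to the end of the output, much as G means “Goto last line” in command-mode.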
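If paging through the list is inconvenient, the output can be copied into a scratch buffer and searched like any other text. The sketch below assumes a Vim recent enough to have the `execute()` function (Vim 8.0 or later); the command name `DigraphList` is hypothetical and chosen only for this example.

```vim
" :DigraphList is a made-up name for this sketch.
" It opens a new scratch window and pastes the :digraphs output into it.
command! DigraphList new | setlocal buftype=nofile bufhidden=wipe noswapfile | put =execute('digraphs')
```

After running `:DigraphList`, a search such as /9786 finds the entry for that codepoint.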

Finding the Decimal Representation of a Character

To specify a digraph for a character, you need to know the decimal representation (codepoint) of the character. Vim (of course ☺) has commands to find out the decimal, hexadecimal, and octal representations of a character. Here’s how:

  1. Place the cursor over the character

  2. Press the Escape key to be sure you’re in command-mode

  3. Type   ga

Alternatively, for step 3, type   :as   (short for :ascii).

For example, with either of the above commands Vim displays the following for the White Smiling Face character:

<☺> 9786, Hex 263a, Octal 23072

To learn more about these commands, see either:

:help ga
:help :as

These two commands are equivalent and are sometimes called the “get ascii” command. You can read the Vim Help file either within Vim or at Vim documentation: various #ga.
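The same numbers can also be obtained from Vim's built-in functions, which is useful inside a vimrc or a mapping. A minimal sketch, assuming 'encoding' is utf-8; the lines can be typed after a colon in command-mode or placed in a script:

```vim
scriptencoding utf-8

" Decimal codepoint of a literal character.
echo char2nr('☺')
" 9786

" The reverse: the character for a decimal codepoint.
echo nr2char(9786)

" The same number shown as decimal, hexadecimal, and octal.
echo printf('%d, Hex %x, Octal %o', 9786, 9786, 9786)
" 9786, Hex 263a, Octal 23072

" Decimal codepoint of the character under the cursor.
echo char2nr(matchstr(getline('.'), '\%' . col('.') . 'c.'))
```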

💡
If you find a Unicode character that you’d like to create a Vim digraph for, copy & paste it into your vimrc, use the "get ascii" (ga) command to find its decimal representation, and then create your digraph with the appropriate dig command.

 

References

Endnote


1. To specify a font that includes a space in its name in a vimrc, replace the space with either underscore (_) or backslash-space (\ ).
