Manually editing ruby on Chinese characters in InDesign

WARNING: The following is exceedingly geeky, but I’m posting it here so that six months from now, when I’ve utterly forgotten how I did this, I can look it up. And who knows? Maybe someone else will want to know how to do this, too. Or will want to tell me I’ve been doing it all wrong.

Although the Chinese-owned publishing company where I now am managing editor mostly produces English-only books, occasionally I do have to deal with Chinese characters in InDesign. This week, I started working on a series of dual-language poetry books that were previously published in China, and I want to rework these into a single parallel-text edition for U.S. readers, particularly students. Fortunately, the Chinese edition was set in InDesign—this is not always the case—so I’m able to rework the files we received from the original publisher.

Immediately I set about editing the style sheets—changing their names to ones I could actually read, for instance—and rejiggering things to enable matching the originals and translations across spreads, stanza by stanza if not line by line. In particular, I wanted to make the Chinese text slightly larger and the English text slightly smaller. I also wanted to replace all the English fonts with more tasteful ones and simplify the palette of Chinese fonts.

The Chinese versions of the poems in these books all have little pinyin pronunciation guides over each character, and at first I thought they were part of the fonts themselves. To my delight, however, I found that when I changed the Chinese fonts, the pronunciation guides stayed in place and scaled with the text. These pronunciation guides are, it turns out, called ruby annotations, and if you view the Chinese in InDesign’s built-in text editor, you can see the ruby label wrapping each character. (Don’t worry; a screenshot’s coming up.)

Unfortunately, the original book used a font that didn’t contain all the characters needed to set these poems. I don’t know much about Chinese fonts, but from my experience with Chinese-language files thus far, this seems like a common issue. There are several Chinese character encodings, and sometimes the glyph you need will not appear in the font you’re using. When this happens, sometimes the compositor will swap in a different font just for that character, or sometimes, as they did in these books, they will insert the missing character as one or more pieces of outlined type, grouped and anchored inline:

an InDesign layout having an embedded outlined character

I assume this is done using some plug-in, or that it’s a thing the CJK version of InDesign handles automatically, but in any case, the ruby was somehow then wrapped around this anchored image. So when I changed the Chinese font and made it larger, these anchored images didn’t scale. Their ruby text did scale, but it was no longer aligned properly. I could scale the embedded graphic manually and move the ruby up using baseline shift, but nothing really lined up, the ruby was off-center, and the style of the glyph didn’t match the surrounding text, of course. Furthermore, I thought these missing characters might exist in the new font I’d chosen, and if that was the case, I’d far rather replace the embedded images with live text. If I just pasted in a new character, however, I couldn’t add a ruby label to it, nor could I move the new character inside the existing ruby tags to replace the anchored image, although I could see these in InDesign’s text editor view:

How ruby appears in InDesign's text editor view

So I poked around in the InDesign forums and found a post by David W. Goodrich recommending World Tools Pro ($179), a post by Manish Sharma explaining how to launch InDesign in CJK mode, and a post by John Hawkinson saying you can edit ruby if you pull the text out as a snippet. In that last thread, Jongware posted a “Poor Man’s Ruby Editor,” but I didn’t try this; instead, I decided to see if I could figure out the snippet method.

First, I had to look up how to create a snippet, since I’d never done so before except by accident. But eventually, I came up with the following process.

1. Figure out what the Chinese character should be.
I can’t read, write, or speak one word of Chinese. I have no idea how anyone ever manages to type it on a U.S. keyboard layout. But I figured out two ways to identify these missing characters.

(a) Search for the pinyin transliteration using the pronunciation lookup at ctext.org

Looking up a character by its pinyin transliteration, at ctext.org

and then visually scan the search results until I find my character:

Locating the character I wish to replace, at ctext.org

When I click on the correct character, I get a page of handy information about it, including any text on ctext.org that includes it.

The ctext.org information page for the character I'm replacing

In this case, I found text that (almost) matches my original, confirming that I’ve got the right one.

Alternatively, since my text is a classic and I’m pretty sure it’s online somewhere, I could just (b) search Google or ctext.org for a line of text above or below the one I’m working on, and find the missing character that way. This method is faster, and I found that in at least one case, either the outlined character in the original layout or its ruby label appeared to be wrong. (Yes, I’m going to have someone who actually knows Chinese proofread these pages when I’m done; I’m crazy but not stupid. Or stupid but not crazy. One of those . . .)

2. Paste the correct character into the InDesign document next to the original one.
It pastes in at the right size and in the right font, and, sure enough, the new font I’m using actually includes this character. Whee! However, the ruby label is missing.

The live Chinese character pasted into InDesign next to the outlined, ruby-labeled version

3. Select both the Chinese character and the ruby-annotated graphic and Export -> Adobe InDesign Tagged Text.

InDesign's Tagged Text Export Options dialog

I use the Abbreviated option, since I don’t need most of the gunk that will be in this file, and ASCII mode, since exporting as Unicode got me weird results when I tried it.

4. Open the resulting .txt file in a text editor. I use BBEdit, and I set it to display as XML, to add some color coding.

A tagged text snippet exported from InDesign

It looks quite formidable, but all I’m really concerned with is the last line. Here are the business parts of that line:

Labeled detail of the last line of the exported snippet

In fact, you can delete everything above this last line except the <ASCII-MAC>, to make it easier to see what you’re doing. The rest of the snippet code is mostly style definitions, which we don’t need since we’ll be pasting this snippet right back into the same document it came from.
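With more than a handful of these files, the trimming could be scripted instead of done by hand. Here is a minimal Python sketch, assuming (as in this export) that the `<ASCII-MAC>` header is the first line and the text you care about is the last non-empty line. The name `trim_tagged_text` is made up for illustration; it is not anything InDesign provides.

```python
def _line_ending(text: str) -> str:
    """Guess which line ending the exported file uses so it can be preserved."""
    if "\r\n" in text:
        return "\r\n"
    if "\r" in text:
        return "\r"
    return "\n"


def trim_tagged_text(exported: str) -> str:
    """Keep only the header line and the last non-empty line of an export."""
    eol = _line_ending(exported)
    # splitlines() copes with CR, LF, and CRLF line endings alike.
    lines = [line for line in exported.splitlines() if line.strip()]
    if len(lines) < 2:
        return exported  # nothing to trim
    header, last = lines[0], lines[-1]
    return header + eol + last + eol


# Placeholder content only; a real export has style definitions in the middle.
sample = "<ASCII-MAC>\r(style definitions)\r(more definitions)\r(last line)\r"
print(repr(trim_tagged_text(sample)))
```

The sketch keeps whatever line ending the file already used, on the cautious assumption that InDesign may care about it when the file is placed again.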

5. Cut and paste the Unicode code for the Chinese character (item 3) over the space where the embedded image was in the original ruby wrapper (item 2). You can also delete the trailing <cr:><crstr:>; my entire revised snippet file is now just

My edited version of the exported snippet
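The character code in the file can also be generated rather than copied from a second export. Here is a hedged Python sketch, assuming the ASCII export writes each non-ASCII character as a `<0xXXXX>` code built from the four hex digits of its Unicode code point; compare the result against your own export before relying on it. `tagged_text_code` is a hypothetical helper name. The sketch deliberately rejects characters outside the Basic Multilingual Plane, because how tagged text writes those is not covered here.

```python
def tagged_text_code(char: str) -> str:
    """Return the assumed ASCII tagged-text form of a single character."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    code_point = ord(char)
    if code_point < 0x80:
        # Plain ASCII is written as-is. Characters with special meaning in
        # tagged text, such as < and >, are not handled by this sketch.
        return char
    if code_point > 0xFFFF:
        raise ValueError("character is outside the Basic Multilingual Plane")
    # Assumption: four uppercase hex digits inside <0x...>.
    return f"<0x{code_point:04X}>"


print(tagged_text_code("\u4e2d"))
```

The printed code is what would be pasted inside the ruby wrapper in place of the anchored image.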

6. Save the snippet and drag the file back onto your InDesign window. Click-drag to place the snippet.

My new snippet placed in the InDesign document

Note how the ruby annotation sits completely above the top of the text box; it’s welded to that glyph, though, I swear.

7. Copy and paste the new rubyfied character into your text, replacing the old image and un-rubied character.

The new ruby-annotated character in place in the text

Now, how do you say, “Et voilà!” in Chinese?

It looks like a lot of trouble, but I can now round-trip a character in less than a minute, and fortunately there aren’t very many of these. Still, if there’s some easier way to do this, I’m all ears.

3 thoughts on “Manually editing ruby on Chinese characters in InDesign”

  1. If you are familiar with LaTeX, typesetting a Chinese-character book requires no additional skills. The roman font which accompanies the Chinese character font looks very thin. I don’t know what this font is called, but it’s ubiquitous–it seems to accompany different Chinese fonts.

  2. I just stumbled on this post. A good friend of mine, Diane Burns, owns a company that specializes in CJK and other non-Latin language publishing out of InDesign. I’ll send her the URL to this to see if she has any insight. (She also has a title on lynda.com all about multi-lingual publishing from InDesign.)

  3. FYI – It’s easy to type in Chinese characters using a U.S. keyboard. Your screenshots show a Mac operating system. So, for you, it would involve going to [Apple] > [System Preferences] > [Keyboard] > [Input Sources], then press “+” and then go to either Chinese Traditional or Chinese Simplified, and select either Pinyin Traditional or Pinyin Simplified to add it to your input sources. Then check the box “Show Input menu in menu bar.” You will see a flag icon appear in the top bar. By default, it will probably be a U.S. flag (if you’re in the U.S., for example). To enable writing in Pinyin, you’d click the flag, select Pinyin (either simplified or traditional) from the dropdown menu and then type. What will happen is that you will type in the Pinyin (what you refer to as the pronunciation key), and then a list of characters will show up. You can either click or press the number of the character on the list, and then it will enter. Click on the flag again and put it back to the U.S. (or your country’s flag) in order to go back to normal typing.

    I just self-printed a book that featured Chinese characters in it and this made the process a lot easier.

    Also important: Sometimes Chinese characters have more than one pronunciation! Take a look at this dictionary entry for a Chinese character, and pay special attention to the direction of the accent marks on the entries (they’re not all the same): https://www.mdbg.net/chindict/chindict.php?page=worddict&wdrst=1&wdqb=? (Note that the last two slightly different characters are a variant form of the character.) This is something which makes romanization difficult — you really need to know what the meaning of the character is in the specific context, in order to know what the character’s romanization will be.
