Monday, 1 September 2014

Adobe Reader PDF text copy: kanji + furigana copied together


漢字 · かんじ

Aule Kanji Pages · Kanji Recog Pages

The image below shows a PDF open in the latest Adobe Reader. As you can see, selecting text to copy also grabs the furigana, which produces somewhat problematic plain text.

Text copied from web pages tends to place the furigana inline, which is even worse!

A typical result from Adobe Reader follows:

夜の 
ちやう
帳にささめき尽きし星の今を 
げかい
下界 の人の鬢のほつれ

Is the last item furigana or not?
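
One rough way to clean up a copy like the one above is to treat any line made up only of hiragana as a furigana reading and pull it out of the base text. The sketch below is only a heuristic under that assumption, and the function name `split_furigana` is my own invention for illustration. It cannot tell a furigana line from a short line of genuine hiragana text, which is exactly the ambiguity raised above, and it does not know which kanji each reading belongs to.

```python
import re

# Assumption: a copied line consisting solely of hiragana is a furigana reading.
HIRAGANA_ONLY = re.compile(r"^[\u3041-\u309F]+$")


def split_furigana(copied: str):
    """Split copied PDF text into (base_text, readings) using the heuristic above."""
    base_parts = []
    readings = []
    for line in copied.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if HIRAGANA_ONLY.match(stripped):
            readings.append(stripped)    # looks like furigana
        else:
            base_parts.append(stripped)  # contains kanji or other characters
    return "".join(base_parts), readings


# The Adobe Reader sample from this post, with its original line breaks.
sample = "夜の \nちやう\n帳にささめき尽きし星の今を \nげかい\n下界 の人の鬢のほつれ"
base, readings = split_furigana(sample)
print(base)
print(readings)
```

The stray space inside "下界 の" survives, since the heuristic only removes whole lines and trims line ends.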

Compare my Curl markup example in this blog post. The MIT Curl browser plugin is available at the Tokyo SCSK site.



1 comment:

KanjiRecog said...

Furigana selectability is often discussed in terms of "ruby" characters versus superscript/subscript characters.
HTML5 + CSS3 offers more options where fully supported.
SCSK Curl 8.x does not yet have ruby or SVG support.
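
For reference, HTML5 ruby markup pairs a base text with its reading using the `<ruby>`, `<rt>` and `<rp>` elements. The small Python sketch below just builds such a string; the helper name `ruby_html` is hypothetical, and how a given browser or PDF viewer then handles selection and copying of the result is not something this sketch settles.

```python
from html import escape


def ruby_html(base: str, reading: str) -> str:
    """Return an HTML5 ruby element pairing base text with its reading.

    The <rp> elements supply fallback parentheses for renderers
    that do not support ruby annotation.
    """
    return (
        f"<ruby>{escape(base)}"
        f"<rp>(</rp><rt>{escape(reading)}</rt><rp>)</rp>"
        f"</ruby>"
    )


# Pairings taken from the sample text in the post.
print(ruby_html("帳", "ちやう"))
print(ruby_html("下界", "げかい"))
```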