Showing posts with label UTF-16. Show all posts

Monday, 31 December 2012

Henshall Japanese kanji (3 pages, revised)


I have revised the layout of three pages that list the complete Henshall set of 1,945 kanji:
  1. Sorted by UCS with URL-encoded value
  2. Sorted by Henshall book entry number
  3. Sorted by Unicode UTF-16
A typical row from the last would be:

  4F8B   605  example  ·  custom, usage, precedent
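To make the first column of that row concrete, here is a minimal Python sketch (not the Curl script used for the pages) that turns the hex value back into its character and re-derives the UTF-16 hex from it. The helper name `row_key` is made up for this illustration.

```python
def row_key(ch: str) -> str:
    """Return the UTF-16 (big-endian) code units of ch as uppercase hex."""
    return ch.encode("utf-16-be").hex().upper()


# The character that the sample row's hex value stands for.
kanji = chr(int("4F8B", 16))

# For a character in the Basic Multilingual Plane the UTF-16 hex and the
# UCS code point are the same four hex digits.
print(kanji, row_key(kanji), f"U+{ord(kanji):04X}")  # 例 4F8B U+4F8B
```

A character outside the Basic Multilingual Plane would come out of `row_key` as a surrogate pair (eight hex digits), so in that case the UTF-16 value and the code point would no longer match.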




Thursday, 11 October 2012

kanji by UCS

Here are some of the first basic Japanese kanji sorted by their UCS value (the Unicode code point, which for these characters is also the single UTF-16 code unit):


一  丁  七  万  丈  三  上  下  不  与  世  丘  丙  両  並  中  丸  丹  主  久  乏  乗  乙  九  乱  乳  乾  事  二  亜  享  京  亭  人  仁  今
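For anyone who wants to reproduce this ordering, the following Python sketch (an illustration only, not the script behind the pages) sorts the same characters by code point with `ord` and prints the first few with their hex values. The variable names are arbitrary.

```python
import random

# The kanji from the listing above, in the order shown.
listing = ("一 丁 七 万 丈 三 上 下 不 与 世 丘 丙 両 並 中 丸 丹 主 久 乏 乗 "
           "乙 九 乱 乳 乾 事 二 亜 享 京 亭 人 仁 今").split()

# Shuffle a copy (fixed seed, so the run is repeatable), then sort by code point.
shuffled = listing[:]
random.Random(0).shuffle(shuffled)
by_ucs = sorted(shuffled, key=ord)

# Print code point and character for the first few entries.
for k in by_ucs[:4]:
    print(f"{ord(k):04X}  {k}")

print("same order as the listing:", by_ucs == listing)
```

One pattern that does follow from this ordering: the CJK Unified Ideographs block is arranged by radical and then by stroke count, so the listing starts with characters filed under the radical 一.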

Is a particular pattern evident or helpful?

I am in the process of adding a UCS-to-kanji-to-URL-encoded-UTF-8 page over at kanji.aule-browser.com, which uses the Curl web content language (only a few lines of declarative script and a wee bit of procedural script are required).

UPDATE: that page, with the URL encoding for each character, is at http://www.aule-browser.com/kanji/henshall-sorted-urlencoded.html .
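The UCS-to-kanji-to-URL-encoded mapping can be sketched in a few lines of Python. This is a stand-in for the Curl script, not a copy of it, and `ucs_row` is a hypothetical helper name. It relies on `urllib.parse.quote`, which percent-encodes the UTF-8 bytes of a string by default.

```python
from urllib.parse import quote, unquote


def ucs_row(ch: str) -> tuple[str, str, str]:
    """Return (UCS hex, the character itself, URL-encoded UTF-8) for one kanji."""
    return f"{ord(ch):04X}", ch, quote(ch)


for ch in "一例":
    print(*ucs_row(ch))
# 4E00 一 %E4%B8%80
# 4F8B 例 %E4%BE%8B

# The encoding is reversible: unquote gives the kanji back.
print(unquote("%E4%BE%8B"))  # 例
```

Each of these kanji takes three bytes in UTF-8, which is why every URL-encoded value consists of three `%XX` groups.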

A simpler plain HTML page is http://www.aule-browser.com/kanji/henshall-sorted-by-unicode.html .

Another safe, plain HTML page with no scripts, images, ads or other nuisances lists the 1,945 Henshall basic Japanese kanji in the order they appear in the book, that is, by their so-called Henshall number. It is at http://www.aule-browser.com/kanji/henshall-sorted-by-id.html.

By viewing the page source in your browser you can see that there are no script or image elements to worry about, so you can safely copy this HTML text to your local machine and edit it as you see fit.

The HTML text was generated using a Curl applet running off-line and parsing the Kanjidic2 XML.
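The same kind of off-line generation could look roughly like this in Python. This is a sketch under assumptions, not the Curl applet. The XML fragment is hand-written and abridged, modeled on the sample row for 4F8B. The element and attribute names (`cp_value`/`cp_type`, `dic_ref`/`dr_type="henshall"`, `meaning`/`m_lang`) reflect my understanding of the Kanjidic2 layout and should be checked against the actual file. The function names are made up.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hand-written, abridged fragment in the assumed Kanjidic2 layout.
SAMPLE = """<kanjidic2>
  <character>
    <literal>例</literal>
    <codepoint><cp_value cp_type="ucs">4f8b</cp_value></codepoint>
    <dic_number><dic_ref dr_type="henshall">605</dic_ref></dic_number>
    <reading_meaning><rmgroup>
      <meaning>example</meaning>
      <meaning>custom</meaning>
      <meaning>usage</meaning>
      <meaning>precedent</meaning>
      <meaning m_lang="fr">exemple</meaning>
    </rmgroup></reading_meaning>
  </character>
</kanjidic2>"""


def henshall_rows(xml_text: str) -> list:
    """Collect (UCS hex, kanji, Henshall number, English meanings) per character."""
    rows = []
    for ch in ET.fromstring(xml_text).iter("character"):
        ref = ch.find("dic_number/dic_ref[@dr_type='henshall']")
        if ref is None:
            continue  # not in the Henshall set
        ucs = ch.findtext("codepoint/cp_value[@cp_type='ucs']", "").upper()
        # Assumption: meanings without an m_lang attribute are the English ones.
        meanings = [m.text for m in ch.iter("meaning") if "m_lang" not in m.attrib]
        rows.append((ucs, ch.findtext("literal"), int(ref.text), meanings))
    return sorted(rows, key=lambda r: int(r[0], 16))  # sort by code point


def to_html(rows: list) -> str:
    """Render the rows as a plain HTML table with no scripts or images."""
    body = "\n".join(
        f"<tr><td>{u}</td><td>{k}</td><td>{n}</td><td>{escape(', '.join(m))}</td></tr>"
        for u, k, n, m in rows
    )
    return f"<table>\n{body}\n</table>"


print(to_html(henshall_rows(SAMPLE)))
```

Sorting by the Henshall number instead would only need a different sort key (`key=lambda r: r[2]`).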