Sunday, 14 October 2012

bug or feature ?


Three pages of kanji definitions have the first meaning repeated, but the second lost.

三 4E09, 23, three, three ; %e4%b8%89
上 4E0A, 37, above, above, up ; %e4%b8%8a
下 4E0B, 7, below, below, down, descend, give, low, inferior ; %e4%b8%8b
不 4E0D, 572, negative, negative, non-, bad, ugly, clumsy ; %e4%b8%8d
 
Yet somehow the first meaning is more eye-catching for me. An unintended feature ?
 
The cause is an error in the HTML generation script, which runs after the XML parse of Kanjidic2.
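A small guard against this kind of repetition could look like the Python sketch below. It is not the script used for these pages (that one is a Curl applet); the function name `unique_meanings` is made up for illustration. It only drops repeated meanings while keeping their order, and cannot bring back a meaning that was lost.

```python
def unique_meanings(meanings):
    """Drop repeated meanings, keeping the first occurrence of each in order."""
    # dict.fromkeys keeps insertion order, so the first meaning stays first.
    return list(dict.fromkeys(meanings))


# Meanings as they appear in the faulty line for 上 above.
faulty = ["above", "above", "up"]
cleaned = unique_meanings(faulty)
print(", ".join(cleaned))
```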




Thursday, 11 October 2012

kanji by UCS

Here are some of the first basic Japanese kanji, sorted by their UCS value (UTF-16 code point):


一  丁  七  万  丈  三  上  下  不  与  世  丘  丙  両  並  中  丸  丹  主  久  乏  乗  乙  九  乱  乳  乾  事  二  亜  享  京  亭  人  仁  今

Is a particular pattern evident or helpful ?
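For anyone who wants to reproduce the ordering, here is a minimal Python sketch (the pages themselves were made with Curl, not Python). It assumes the kanji are held as single-character strings and sorts them by code point with the built-in `ord`; the short list is just a shuffled handful taken from the row above.

```python
# A shuffled handful of kanji from the list above.
kanji = ["不", "一", "下", "七", "三", "上"]

# ord() gives the Unicode code point, which is the UCS value for these characters.
by_ucs = sorted(kanji, key=ord)

for ch in by_ucs:
    # Print each kanji with its code point as four hex digits, e.g. U+4E09.
    print(f"{ch} U+{ord(ch):04X}")
```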

I am in the process of adding a page that maps UCS to kanji to urlencoded UTF-8 over at kanji.aule-browser.com, which uses the Curl web content language (only a few lines of declarative script and a wee bit of procedural script are required).

UPDATE : that page with the HTML urlencoding for each character is at http://www.aule-browser.com/kanji/henshall-sorted-urlencoded.html .
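The urlencoded form is simply the UTF-8 bytes of the character, each written as a percent sign followed by two hex digits. A minimal Python sketch of that step follows; the helper name `percent_encode` is made up here, and the standard library's `urllib.parse.quote` is used only as a cross-check (it emits upper-case hex, so the result is lower-cased to match the page).

```python
from urllib.parse import quote, unquote


def percent_encode(ch):
    """Return the lower-case percent-encoded UTF-8 form of a character."""
    # Encode to UTF-8 bytes, then write each byte as %xx.
    return "".join(f"%{b:02x}" for b in ch.encode("utf-8"))


encoded = percent_encode("三")
print(encoded)

# Cross-check against the standard library and decode back to the kanji.
same_as_stdlib = quote("三").lower() == encoded
round_trip = unquote(encoded)
```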

A simpler plain HTML page is http://www.aule-browser.com/kanji/henshall-sorted-by-unicode.html .

Another safe, plain HTML page with no scripts, images, ads or other nuisance lists the 1,945 Henshall basic Japanese kanji sorted as they appear in the book, by their so-called Henshall number. It is at http://www.aule-browser.com/kanji/henshall-sorted-by-id.html .

By viewing the page source in your browser you can see that there are no script or image elements to worry about - so you can safely copy this HTML text to your local machine to edit as you see fit.

The HTML text was generated using a Curl applet running off-line and parsing the Kanjidic2 XML.
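The Curl applet is not shown here, but the parsing step can be sketched in Python with the standard `xml.etree.ElementTree` module. The XML below is a hand-written, heavily abbreviated fragment in the shape of a Kanjidic2 entry, not a copy from the real file, and the function name `read_characters` is made up. The sketch assumes that English meanings are the `meaning` elements without an `m_lang` attribute.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample in the shape of one Kanjidic2 entry (an assumption).
SAMPLE = """<kanjidic2>
  <character>
    <literal>三</literal>
    <codepoint><cp_value cp_type="ucs">4e09</cp_value></codepoint>
    <dic_number><dic_ref dr_type="henshall">23</dic_ref></dic_number>
    <reading_meaning><rmgroup>
      <meaning>three</meaning>
      <meaning m_lang="fr">trois</meaning>
    </rmgroup></reading_meaning>
  </character>
</kanjidic2>"""


def read_characters(xml_text):
    """Yield (kanji, ucs hex, Henshall number, English meanings) per entry."""
    root = ET.fromstring(xml_text)
    for ch in root.iter("character"):
        literal = ch.findtext("literal")
        ucs = ch.findtext("codepoint/cp_value[@cp_type='ucs']")
        henshall = ch.findtext("dic_number/dic_ref[@dr_type='henshall']")
        # Keep only meanings with no language attribute (assumed to be English).
        meanings = [m.text for m in ch.iter("meaning") if "m_lang" not in m.attrib]
        yield literal, ucs, henshall, meanings


entries = list(read_characters(SAMPLE))
for literal, ucs, henshall, meanings in entries:
    print(f"{literal} {ucs.upper()}, {henshall}, {', '.join(meanings)}")
```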



Wednesday, 10 October 2012

thread and short thread bushu

Here is a small JPG from a Curl web content page based on results at denshi jisho :


I have web pages with both the Curl and an HTML version - the latter suited to translation plugins such as Perapera for the Firefox browser.

The original BMP at 1280x900 with 24pt font is quite legible. This 'calendar' might carry through the weeks of winter embroidery :