Showing posts with label utf-8. Show all posts

Monday 5 August 2013

Masaoka Shiki New Year haiku 1893-1902


The conversion of the Shiki haiku to UTF-8 is nearly complete with these :


Now to devise a flexible applet which a user can configure to suit their kanji learning needs using the Shiki haiku collections as a resource.


Tuesday 30 July 2013

Shiki spring kigo haiku 1899


Here is a snapshot of the Curl web browser applet with Shiki’s spring 春〔はる〕kigo haiku from 1899 in UTF-8 char-encoding.

Above you see the kigo index to the left and the pop-up menu to copy characters which can then be sought using CTRL-f and the Find menu.




Thursday 25 July 2013

Shiki summer kigo haiku 1897 in utf-8


Here is a screenshot of today's Curl applet : 335 summer kigo haiku from 1897 by Masaoka Shiki, rendered in UTF-8 and indexed by kigo, in Curl web content markup. This instance is running in the Pale Moon browser with the HanaMinA font at 24pt.




Wednesday 24 July 2013

Shiki winter kigo haiku 1896


Here is a screenshot of the Curl applet with the Masaoka Shiki winter kigo haiku from 1896 in UTF-8 character encoding (Unicode).


Some 500 haiku converted from Shift-JIS encoding and indexed in a Curl browser applet.
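For readers who want to try a similar conversion themselves, here is a minimal sketch in Python (not the Curl code used for these applets). It assumes the source bytes are plain Shift-JIS; the function name `sjis_to_utf8` and the sample text are hypothetical. Files produced on Windows may need the `cp932` codec instead, which covers vendor extensions.

```python
def sjis_to_utf8(raw: bytes) -> bytes:
    """Decode Shift-JIS bytes and re-encode them as UTF-8."""
    # Decoding raises UnicodeDecodeError if the bytes are not valid Shift-JIS,
    # which is a useful early warning that the source encoding was misjudged.
    text = raw.decode("shift_jis")
    return text.encode("utf-8")


# Hypothetical sample: a short phrase stored as Shift-JIS bytes.
sample = "春の海".encode("shift_jis")
converted = sjis_to_utf8(sample)
print(converted.decode("utf-8"))
```

The same pattern works for EUC-JP sources by swapping the codec name for `euc_jp`.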




TEI Japanese HTML character encoding


Is there any longer a good reason for funded text initiatives on the web NOT to be UTF-8 Unicode ?

TEI at http://etext.lib.virginia.edu/japanese/hyakunin/frames/index/hyaku3euc.html#euc2 has a Japanese poetry page with a click-driven "swiping frames" metaphor - one of which is a view showing the kanji for haiku, waka or tanka in the char-encoding CHARSET=x-euc-jp 

Oi vey!

CHARSET=x-euc-jp ?!? No, not HTML from a server in Kyoto. The server is in VA. In collaboration with academics in PA.

There is an anti-pattern documented for this IT phenomenon in projects within organizations exempt from accountability, competitive pressure or managerial consequences (and perhaps 2 or more anti-patterns specific to this web design and its survival on this academic web site.)

Ah, the peaceful, never over-loaded servers of e-text initiatives in academe ! Concurrent user load ? Not a worry.




Monday 22 October 2012

kanji urlencoded

In the aule-browser Henshall kanji app, I have added the UTF-8 as you would see it in an urlencoded string :
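As an illustration of what such a string looks like, here is a minimal Python sketch (the app itself is written in Curl, so this is only an equivalent, and `urlencode_kanji` is a hypothetical helper name). Each kanji is encoded to its UTF-8 bytes and every byte is written as `%XX`.

```python
from urllib.parse import quote, unquote


def urlencode_kanji(ch: str) -> str:
    """Percent-encode the UTF-8 bytes of a character or string."""
    # safe="" forces every reserved character to be escaped as well.
    return quote(ch, safe="", encoding="utf-8")


encoded = urlencode_kanji("一")
print(encoded)           # %E4%B8%80
print(unquote(encoded))  # 一
```

The kanji 一 is U+4E00, which UTF-8 stores as the three bytes E4 B8 80, hence the three percent-escapes.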




Thursday 11 October 2012

kanji by UCS

Here are some of the first, basic Japanese kanji sorted by their UCS value (the Unicode code point, which for these kanji is the same as the UTF-16 code unit)


一  丁  七  万  丈  三  上  下  不  与  世  丘  丙  両  並  中  丸  丹  主  久  乏  乗  乙  九  乱  乳  乾  事  二  亜  享  京  亭  人  仁  今

Is a particular pattern evident or helpful ?
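To reproduce this ordering, a minimal Python sketch follows (an illustration only, not the Curl script behind the page). The short input list is a hand-picked subset of the kanji above, deliberately shuffled.

```python
# A shuffled subset of the kanji listed above.
kanji = ["人", "一", "三", "上", "下", "七", "中", "二"]

# ord() returns the Unicode code point of a single character.
by_codepoint = sorted(kanji, key=ord)

for ch in by_codepoint:
    print(f"U+{ord(ch):04X} {ch}")
```

One pattern that may help : the CJK Unified Ideographs block is arranged broadly by radical and then by stroke count, which is why kanji built on the 一 radical come first and those built on 人 follow later.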

I am in the process of adding a UCS-to-kanji-to-urlencoded-UTF-8 page over at kanji.aule-browser.com which uses the Curl web content language (only a few lines of declarative script and a wee bit of procedural script required.)

UPDATE : that page with the HTML urlencoding for each character is at http://www.aule-browser.com/kanji/henshall-sorted-urlencoded.html .

A simpler plain HTML page is http://www.aule-browser.com/kanji/henshall-sorted-by-unicode.html .

Another safe, plain HTML page with no scripts, images, ads or other nuisance has the 1,945 Henshall basic Japanese kanji sorted as they appear in the book, by their so-called Henshall number. It is at http://www.aule-browser.com/kanji/henshall-sorted-by-id.html.

By viewing the page source in your browser you can see that there are no script or image elements to worry about - so you can safely copy this HTML text to your local machine to edit as you see fit.

The HTML text was generated using a Curl applet running off-line and parsing the Kanjidic2 XML.
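The general approach can be sketched in Python, though the actual pages were produced with Curl and this is not that code. The sketch assumes the Kanjidic2 layout of `character` elements containing a `literal` and a `dic_number/dic_ref` with `dr_type="henshall"`. The inline `SAMPLE` is a hypothetical fragment with placeholder numbers rather than real Kanjidic2 data, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical fragment in the assumed Kanjidic2 shape; numbers are placeholders.
SAMPLE = """<kanjidic2>
  <character>
    <literal>三</literal>
    <dic_number><dic_ref dr_type="henshall">3</dic_ref></dic_number>
  </character>
  <character>
    <literal>一</literal>
    <dic_number><dic_ref dr_type="henshall">1</dic_ref></dic_number>
  </character>
  <character>
    <literal>丑</literal>
  </character>
  <character>
    <literal>二</literal>
    <dic_number><dic_ref dr_type="henshall">2</dic_ref></dic_number>
  </character>
</kanjidic2>"""


def henshall_rows(xml_text):
    """Return (henshall_number, kanji) pairs sorted by number."""
    root = ET.fromstring(xml_text)
    rows = []
    for ch in root.iter("character"):
        literal = ch.findtext("literal")
        ref = ch.find("dic_number/dic_ref[@dr_type='henshall']")
        if literal is None or ref is None:
            continue  # skip characters that have no Henshall reference
        rows.append((int(ref.text), literal))
    return sorted(rows)


def to_html(rows):
    """Render the rows as a script-free, image-free UTF-8 HTML page."""
    items = "\n".join(f"<li>{n}: {escape(k)}</li>" for n, k in rows)
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8"></head>\n'
        f"<body><ul>\n{items}\n</ul></body></html>"
    )


rows = henshall_rows(SAMPLE)
print(to_html(rows))
```

Sorting the same rows by `ord(kanji)` instead of by number would give the sorted-by-Unicode variant of the page.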



Tuesday 7 August 2012

岸 河原


From one kanji  to a compound 河原 via a known kanji  and I finally "grok" them both !

see also:
遠島
火山列島
 
at edict2 in utf-8 PLAIN HTML (page may take 30 seconds to load - no scripts)

Be sure your browser view has character encoding set to utf-8 if you have an issue.

Split HTML pages (6 ? 7 ?) in lovely free HanaMinA font later today.