Thursday, 7 August 2014

The 392 unique kanji in recognition exercise Text 3


漢字 · かんじ

Aule Kanji Pages · Kanji Recog Pages

The remaining 392 unique kanji from my last plain text file posted here are as follows (no duplicates here!):

丸丹之乗乞争亘他仮仰件伊休似余依俊保信倉
個傍傑催僧先克全兼冷凝刈判別制割創劣努励
動務勢匠匹協単収受各后向否呉哀員善器坂坊
坪城執培塩増墨壊売夢奇奈奔妙妥委姿宇官宝
実客容寄寅富寝専将尊尾屈屋展属層岬岳峰崇
巧巨帰幅幕幸幹底座庫庵庶庸弟弥張強往待徐
得御徳念急恭恵悲情愛慈慮慶憶我扇打批承招
拝拡指掘推掲換搬摘摩改放故斉易昭晩普智暇
曹曽最朋望朱条松果架査栄栗根桑梨極楼横橋
機欄欧武歴死残段比毛永求沼泰洞津活浜浮添
済渉渋渓渡温測湖湧湯湾満準溶滋漢潮澄激濃
瀬無照熊父状狭玄玉珍瑠甲畔疎登盆盛盟真眺
眼睦知破示福秘究窓竜章竹第筆筋粉精約紅紫
細紹絵継続綱網緑緒線縁職育背能臨舎舞芝芳
芸苦茨荒荘菜華菱落蔵薬藍藤蘇蘭衆衡衷裏複
視覚親訓訪評詩詳誌認譲護谷販貫貯貴賀資質
越趣足跡跳身軍軒軸輩迎追退速遊運遍遣遺郊
郎郡郷里釜釣鉄銀録鎌鎮鏡閣阪阿院隆階随際
隠雀雄雇雑離難雪雲霊青革音頃項領頼願類飛
飲飾養香馬駆駒魂魚鳥鹿鼓

But I, for one, find the ordered set above less useful and even harder to scan. It is hard to get any one kanji to hold my attention, let alone to notice one that eludes my scanning gaze.
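For anyone who wants to build such a duplicate-free set from their own text, here is a minimal Python sketch. It assumes that "kanji" means the CJK Unified Ideographs block (U+4E00 to U+9FFF) and that the set is sorted by Unicode code point, which is what the list above appears to be. The names `unique_kanji` and `wrap` are mine, not part of any tool mentioned here.

```python
def unique_kanji(text: str) -> str:
    """Return every kanji in `text` exactly once, sorted by Unicode code point."""
    # Assumption: "kanji" means the CJK Unified Ideographs block, U+4E00..U+9FFF.
    seen = {ch for ch in text if "\u4e00" <= ch <= "\u9fff"}
    return "".join(sorted(seen))


def wrap(chars: str, width: int = 20) -> str:
    """Break a string into rows of `width` characters (20 matches the rows above)."""
    return "\n".join(chars[i:i + width] for i in range(0, len(chars), width))


# Tiny made-up sample: six kanji, of which four are distinct.
sample = "鳥鳥馬\n丸魚鳥"
print(wrap(unique_kanji(sample)))
```

Sorting by code point is only one possible order; it is easy to compute but says nothing about frequency or difficulty.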


common kanji : a free plain text with 875 characters for KanjiRecog exercises



After removing those not in KLC or the old 2,500 list of common newspaper kanji, we are left with a file of 875 characters with MANY visible duplicates.
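The removal step can be sketched in a few lines of Python. This is only an illustration: the real KLC and newspaper lists are not reproduced here, so `allowed` below is a hypothetical stand-in, and the function names are mine. Order and duplicates are kept; only characters missing from the list are dropped.

```python
from typing import Iterable


def keep_allowed(text: str, allowed: Iterable[str]) -> str:
    """Drop every character that is not in `allowed`, keeping order and duplicates."""
    allowed_set = set(allowed)
    # Linefeeds are dropped too, since they are not in the allowed list.
    return "".join(ch for ch in text if ch in allowed_set)


def rows_of(chars: str, width: int = 25) -> str:
    """Re-break the filtered text into rows of `width` characters."""
    return "\n".join(chars[i:i + width] for i in range(0, len(chars), width))


# Hypothetical stand-in for the real "common kanji" list.
allowed = "弥哀徳尊庶霊沼鳥飛歴魚満跳故恭遊后菜培実質傍華"
before = "弥哀徳尊庶霊沼鳥飛歴魚沼満跳故恭遊后蔬菜培実質允恭"
print(rows_of(keep_allowed(before, allowed)))
```

With this stand-in list, 蔬 and 允 are dropped from the sample row and everything else survives, duplicates included.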

Can you find a row with only ONE pair of duplicates ?

Can you scan slowly for a row whose duplicates are all unknown to you, whether never seen before or just of unknown meaning or reading?

弥哀徳尊庶霊沼鳥飛歴魚沼満跳故恭遊后菜培実質恭傍華
段階妥推弥呉橋斉斉済帰畔弥呉橋根欄橋蘇我馬珍評判録
掘済崇否争崇蘇我飛鳥蘇我済推創浮鳥舞軸線溶鳥隠背岳
雄御身得示能測隆伊賀掘遺跡城之越遺跡遺跡指保護遺跡
属落状護岬屈添改武悲舎残荒掘受継録浜荒幹瀬頃追憶努
展先駆示飛鳥城跡掘得知昭掘査城条坊坪最幅細屈底玉掘
催線複雑湾底玉縁奈濃緑紫層属盆縁層質冷院朱雀院院残
富湧貯巧往姿郊離別荘頃覚離遺貴遺曽跡貴寝普遍寝寝遣
院塩釜松浮条院丹橋線奈受継催情緒換活絵絵漢詩仮院毛
越極拝架橋渡御越詳俊綱残割遣秘得張貫乞求求克遣展先
駆示情緒遣筋身父院藤頼藤頼専武打頼鎌倉受継永福頼尊
毛越無院精舎荘激死弟藤泰衡将鎮魂鎌倉育委員階阿弥薬
藍遺認眼掘査継続約掘果藍徐鎌倉資領荘得第巨富富御臨
釣松訪藤瑠澄比類激増鏡眺望巧将軍足満荘譲受拡荘第層
楼閣舎閣閣望楼閣視橋閣往能竜鏡湖丸極難足劣満洞御迎
幸仰満死鹿閣際破壊放職章承鎌倉僧隆盛墨詩院院院狭凝
院院余刈段橋架横橋浮照夢窓疎輩夢窓疎愛芳傑測知里似
夢窓疎遊苦録残他芸匹最峰夢窓疎芳鎌倉帰僧蘭渓隆徳院
養院座視智院衆湯飲器客湯寄屋湯客玄別専細座向機能趣
待庵訓郡官休庵将軍城屋際遊遊盛屋遊遊照兼栗香松趣熊
熊屋随壊状郎職務余暇録資収退職究励収資余単究伊勢栗
催展身故郷桑松信華究漢漢詩晩詩激争展身横芸阿弥藤紹
衷渋栄庸菱親睦網別慶雲朋無庵湖芝雲荘坂浜慶雲城亘郎
各受継扇湖荘郎座掲匠視展掲眺望視線類趣強飾郎津荘継
昭頃寅郎推雑継急速雑全強類求運搬容易照項雑保鉄狭浮
活改善運動視覚評機能視視打活改善盟綱領項協協究宇精
執筆保奇屋紹究全測昭院院批判打兼雪準慮指摘兼茨城別
照毛越郡院郡駒根乗谷倉福福芳恵指梨甲妙退蔵院曹徳院
鹿閣徳院慈照銀閣徳院乗院奈奈養院庫粉竹知知院粉福芳
徳院玉養福福福福賀青滋賀軒蘭滋賀浜玄滋賀根滋賀津離
院離宝院願院条城丸御院徳詩松根城丸紅渓養音院鳥鳥徳
城徳城御閣徳徳松玄栗条城城丸紅渓音院渉慶雲滋賀浜無
庵雲荘慶阪阪荘依奈奈裏妙庫登録念最登録件温荘潮遊個
最無慶温荘昭福松尾松城根阪根足根根専誌念照欧鼓橋奔
放制職招制幕摩動伊郎雇革真真視覚販売際第遊遊盛屋遊
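An answer to the first question (a row with only ONE pair of duplicates) can be checked by machine once you have tried it by eye. A minimal Python sketch, assuming "one pair" means exactly one character appears exactly twice in the row and nothing else repeats; the function name is mine.

```python
from collections import Counter
from typing import List


def rows_with_one_pair(text: str) -> List[int]:
    """Return 1-based numbers of rows holding exactly one duplicated pair."""
    hits = []
    for number, row in enumerate(text.splitlines(), start=1):
        # Counts above 1 mark the characters that repeat within this row.
        repeated = [n for n in Counter(row).values() if n > 1]
        if repeated == [2]:
            hits.append(number)
    return hits


# Made-up three-row sample: only the first row has a single pair.
print(rows_with_one_pair("鳥鳥馬\n鳥馬魚\n鳥鳥鳥"))
```

Paste the rows above into a string, one row per line, to use it on the real file.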




Wednesday, 6 August 2014

Kanji duplicate reduction script



With 975 characters left in the file (fewer than 40 rows of 25), it is time to consider removing those kanji that are neither in frequent use nor rated 'general use'. But that requires a software script ... or an app!

弥哀徳尊庶霊沼鳥飛歴魚沼満跳故恭遊后蔬菜培実質允恭
傍櫻華段階妥推弥呉橋斉斉済帰畔弥呉橋根欄橋蘇我馬嶋
珍評判録坦掘済崇否争崇蘇我飛鳥蘇我済推創浮鳥舞軸線
溶鳥隠背岳雄御身得示能測澤隆伊賀掘遺跡城之越遺跡遺
跡指保護遺跡属箇涌落涌状護岬屈添改武悲舎残荒橘掘受
継録浜荒幹瀬頃追憶努展先駆示飛鳥城跡掘得知昭掘査城
条坊坪最幅細屈底玉掘催汀線複雑湾底玉縁奈濃緑紫層属
盆縁層質冷院朱雀院淳院残富湧貯巧往姿郊離別荘頃嵯峨
覚嵯峨離遺貴遺曽跡貴寝普遍寝寝遣院塩釜松浮条院丹橋
線奈受継莫催情緒換活絵絵漢詩仮院毛越極拝架橋渡御曼
越詳橘俊綱残割遣秘得張貫乞求求克遣展先駆示情緒遣筋
身父院藤頼藤頼専武打頼鎌倉受継永福頼尊毛越無院精舎
荘激死弟藤泰衡将鎮魂鎌倉育委員階阿弥陀薬伽藍遺認眼
掘査継続約掘果伽藍徐鎌倉卿資領荘得第巨富富御臨釣松
訪藤瑠璃澄比類激増鏡眺望巧将軍足満荘譲受拡荘第層楼
閣舎閣閣望楼閣俯瞰視橋閣往能竜瀑鏡湖丸極難足劣満洞
御迎幸仰満死鹿閣際破壊放職鳳章承鎌倉僧隆盛宋墨詩院
院院狭凝院院余刈段橋架堰横橋浮照夢窓疎輩夢窓疎愛芳
傑測知里似夢窓疎遊苦録残他芸匹最峰夢窓疎芳龍瑞鎌倉
帰僧蘭渓隆徳院龍養院座視智院堺衆湯飲器客湯寄屋湯客
玄別専細座向機能趣待庵府訓郡官休庵将軍城屋際遊廻遊
盛屋遊遊照兼栗香松趣熊熊屋随壊状圭郎職務余暇録資収
退職究励収資余単究伊勢栗催展身故郷桑松信九華究漢詣
漢詩晩詩激争展身横芸阿弥藤紹衷渋栄庸菱親睦網別慶雲
縣朋無鄰庵琵琶湖疏芝碧雲荘坂浜慶雲甥城亘郎各受継扇
湖荘郎座掲匠視展掲眺望視線類趣強飾郎津蘆荘継昭頃寅
郎推雑継急速雑全強類求運搬容易照項雑保鉄狭浮活改善
運動視覚評機能視視打活改善盟綱領項協協究宇精執筆保
也奇屋紹究全測昭院院批判打兼雪準慮指摘兼偕茨城別照
毛越磐郡院磐郡駒根乗谷倉福福芳苔府龍府恵指梨甲妙退
蔵院府龍曹府徳院府鹿閣府徳龍院府慈照銀閣府圓徳院府
府乗院奈奈養院庫粉竹知知院粉福芳徳龍院玉養浩福福福
福敦賀青滋賀軒蘭滋賀浜玄滋賀彦根滋賀津桂離府院離府
醍醐宝院府願院府条城丸御府府院府徳府詩府府松府幡根
城丸紅渓養翠音院鳥鳥徳城徳城御閣徳徳松玄栗条城城丸
紅渓音院渉慶雲滋賀浜無鄰庵府府府碧雲荘府慶阪府阪荘
府依奈奈裏妙庫登録念最登録件温荘潮遊個最無鄰菴慶温
荘昭福府松尾松府城府根阪府堺堺根足根根専誌念照欧鼓
橋篭奔放制職招制幕府薩摩動伊郎雇革真真視覚販売際第

The duplicate kanji are VERY visible now. 

府 is one example, in the row

紅渓音院渉慶雲滋賀浜無鄰庵府府府碧雲荘府慶阪府阪荘

In the almost 40 rows above, it occurs 33 times. Run through the lines again. Do you start to see them? Try separating the lines with a blank line. Try shortening the lines.
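Counts like this need not be made by eye. Here is a minimal Python sketch using `collections.Counter`; the helper name `kanji_counts` is mine, and 府 is simply the character picked for illustration.

```python
from collections import Counter


def kanji_counts(text: str) -> Counter:
    """Count every character in `text`, ignoring linefeeds and other whitespace."""
    return Counter(ch for ch in text if not ch.isspace())


row = "紅渓音院渉慶雲滋賀浜無鄰庵府府府碧雲荘府慶阪府阪荘"
counts = kanji_counts(row)
print(counts["府"])           # occurrences of one chosen character in the row
print(counts.most_common(3))  # the three most repeated characters
```

Feeding it the whole file instead of one row gives the per-character totals for all rows at once.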

You could first remove all '\n' linefeeds, and then use a regexp such as

FIND  expression :   (..........)
REPLACE expr:     \1\n

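which in Notepad++ (with the "Regular expression" search mode selected) gives almost 100 rows of 10 characters for the file above: 97 full rows, plus a last 5 characters that do not fill a row. In words, it says "for every 10 characters, return that selection followed by a linefeed."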

The \1 in the replacement refers to the first expression in parentheses in the FIND expression. If we'd had a second, it would have been \2.
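The same find-and-replace can be done outside an editor. Here is a minimal Python sketch with the standard `re` module; the function name `rewrap` and the `width` parameter are mine. It writes the ten dots as `.{10}`, which means the same thing.

```python
import re


def rewrap(text: str, width: int = 10) -> str:
    """Remove existing line breaks, then add a linefeed after every `width` characters."""
    flat = text.replace("\r", "").replace("\n", "")
    # (.{width}) captures exactly `width` characters as group 1;
    # the replacement \1\n writes that group back, followed by a linefeed.
    return re.sub(r"(.{%d})" % width, r"\1\n", flat)


print(rewrap("弥哀徳尊庶霊沼鳥飛歴魚沼満跳故恭遊后菜培実質恭傍華"))
```

A leftover shorter than `width` is left at the end without a linefeed, and a text whose length is an exact multiple of `width` ends with one.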

What features make complex kanji appear to be the same ? Are some easily confused if not next to each other ?

Funny, but when I read 鳥鳥 side-by-side I just KNOW that is not horse ! But seen alone, I can be unsure ... horse or bird ? Crow ?