Part of Nobumi Iyanaga's website. n-iyanag@ppp.bekkoame.ne.jp. 9/27/00.


Macro Open JPN? text - a thread

To: nisus-hub@egroups.com
From: Kino <quinon....>
Mailing-List: list nisus-hub@egroups.com; contact nisus-hub-owner@egroups.com
Delivered-To: mailing list nisus-hub@egroups.com
List-Unsubscribe: <mailto:nisus-hub-unsubscribe@egroups.com>
Date: Sun, 17 Sep 2000 20:40:54 +0900
Reply-To: nisus-hub@egroups.com
Subject: Re: [nisus-hub] Working on bilingual files
Status:

Hello Rick, Hello everyone

On Sat, 16 Sep 2000 12:02:17 "Rick Davis" wrote:
> [...] Since I get the files from other people, they're plain text with no script information. First I need to see what I've got, so I apply a Japanese font to the whole thing. [...]

Below is a macro which I wrote some time ago and have just modified in order to make it safer for other people's use -- I'm not sure, though. The macro considers the file to be in Japanese script if it contains "A with upper ring" immediately followed by A, B, C or D, which correspond to Japanese periods or commas (2-byte). I think this way of distinguishing will work fine in most cases, though I don't know what these 2-byte codes are in Chinese or Korean script, nor whether these pairs of letters are used as abbreviations in Scandinavian countries.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

// ---------- macro Open JPN? text ----------
// This macro assumes...
// 1. Nisus Text Stationery is NOT in JPN font,
// 2. Catalog is the frontmost window and
// 3. one or more files are already selected.

Show Catalog
MacroCopy
If (clipboard == "") goto warning
key 
// "key" immediately followed by a "space" char.
Show Clipboard
Find All "\r" "A-Sgt"
num_file = NumFound
Close
// Is there a cleverer way to know the number of files
// to be opened?

loop:
Find Next "Å[A-D]" "SA-G-itg"
// This expression should be written in MacRoman font.
// $81[$41-$44] matches Japanese periods and commas
// when they are in MacRoman font.
// Maybe $81[$41$43] or $81[$42$44] will be sufficient.

If (NumFound == 0) goto no_jpn
SetSelect (0, EndCharNum)
{shift}Osaka
// Shift (macro symbol) immediately followed by font name.
// Personally, I don't like Osaka :-)
Times
SetSelect (0, 0)
// Please customize the lines above, between the two SetSelect commands, as you like.
no_jpn:
Send Back
num_file = num_file - 1
if (num_file !=0) goto loop
exit

warning:
clipboard = "Execute the macro again after having selected file(s) to open."
:1
'\CC'
// End of the macro.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Any ideas for improvement are welcome.

Yusuke KINOSHITA ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
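
The test that this first macro performs can be tried outside Nisus Writer. The Python sketch below is not part of the thread. It assumes the files are Shift-JIS text, in which the byte pairs 0x81 0x41 to 0x81 0x44 encode the Japanese commas and periods, and it shows why those bytes appear as "Å" followed by A to D when displayed in a MacRoman font. The function name `looks_japanese` is hypothetical.

```python
import re

# Byte 0x81 followed by 0x41-0x44: Shift-JIS codes for Japanese commas and periods.
JPN_PUNCT = re.compile(rb"\x81[\x41-\x44]")

def looks_japanese(data: bytes) -> bool:
    """Return True if the raw bytes contain a Shift-JIS comma or period."""
    return JPN_PUNCT.search(data) is not None

sample = "これは日本語です。".encode("shift_jis")
print(looks_japanese(sample))                  # Japanese sentence ending in a period
print(looks_japanese(b"Plain English text."))  # contains no 0x81 byte
# The same two bytes, read as MacRoman, give the characters the macro searches for.
print(b"\x81\x42".decode("mac_roman"))
```

Working on raw bytes mirrors what the macro does when it searches a Japanese file that is still displayed in a Roman font.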


...To which Håkan Friberg replied...:


To: nisus-hub@egroups.com
User-Agent: Nisus Email
From: "Håkan Friberg" <hakan.friberg....>
Mailing-List: list nisus-hub@egroups.com; contact nisus-hub-owner@egroups.com
Delivered-To: mailing list nisus-hub@egroups.com
List-Unsubscribe: <mailto:nisus-hub-unsubscribe@egroups.com>
Date: Sun, 17 Sep 2000 16:21:58
Reply-To: nisus-hub@egroups.com
Subject: Re: [nisus-hub] Working on bilingual files
Status:

On Sun, 17 Sep 2000 20:40:54 +0900 Kino wrote:
>
>Hello Rick, Hello everyone
>
>On Sat, 16 Sep 2000 12:02:17 "Rick Davis" wrote:
>> [...] Since I get the files from other people, they're plain text with no script information. First I need to see what I've got, so I apply a Japanese font to the whole thing. [...]
>
>Below is a macro which I had written some time ago and I just modified in order
>to make it safer for other people's use -- I'm not sure, though. The macro
>considers the file to be in Japanese script if it contains "A with upper ring"
>immediately followed by A, B, C or D, which correspond to Japanese periods or
>commas (2 byte). I think this way of distinction will work fine in most of the
>cases, though I don't know what these 2 byte codes are in Chinese or Korean
>script, nor if these couples of letters are used as some abbreviation in
>Scandinavian countries or not.

snip

Kino,

Great macro. The uppercase A-ring might cause problems with Scandinavian languages, at least Swedish, since Åb- and Åd- are the first two letters in a number of Swedish surnames and place names. A lot of people indulge in the bad(?) habit of writing surnames in capital letters, especially in lists, and this might cause some confusion.

I don't think it would be a big problem though (unless your own surname happens to be Åberg or Ådahl, or if you live in Ådalen), since the number of occurrences would be fairly low.

Greetings from chilly and sunny Stockholm,
Håkan

PS Just one tip:
Since all values except zero make num_file TRUE, you don't have to write
if (num_file !=0) goto loop
It is enough to write
if (num_file) goto loop
This trick doesn't save you a lot of time, but it somehow makes you feel like a "macro wizard" of sorts, which might be fun.

--
Håkan Friberg
Center for Pacific Asia Studies
Stockholm University, S-10691 Stockholm, Sweden
Phone: +46 (8) 16 27 22
Cell: +46 (708) 187 184
Fax: +46 (8) 16 88 10
e-mail: hakan.friberg@orient.su.se
(Apple G3/300, 64 MB real RAM, Mac OS S1-8.5.1, Nisus Writer 5.1.3)
---------------------------------------------
"Why walk when you can dance?"
-- Lisa Meehan, age 7; submitted by her father, Greg Meehan
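
Håkan's concern can be illustrated with a short Python sketch (not part of the thread). It encodes Roman-script text as MacRoman and applies the same byte test as the first macro. It assumes the search is case-sensitive, as the byte range $41-$44 suggests; the function name `false_positive` is hypothetical.

```python
import re

# Same byte test as the first macro: 0x81 followed by 0x41-0x44 (A-D).
JPN_PUNCT = re.compile(rb"\x81[\x41-\x44]")

def false_positive(text: str) -> bool:
    """Encode Roman-script text as MacRoman and apply the Japanese test."""
    return JPN_PUNCT.search(text.encode("mac_roman")) is not None

print(false_positive("ÅBERG, Anna"))  # surname in capitals: 0x81 then 'B' (0x42)
print(false_positive("Åberg, Anna"))  # lower-case 'b' is 0x62, outside 0x41-0x44
```

Only the all-capitals spelling trips the test, which fits Håkan's estimate that such cases would be fairly rare.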


...To which Yusuke Kinoshita replied...


To: nisus-hub@egroups.com
From: Kino <quinon....>
Mailing-List: list nisus-hub@egroups.com; contact nisus-hub-owner@egroups.com
Delivered-To: mailing list nisus-hub@egroups.com
List-Unsubscribe: <mailto:nisus-hub-unsubscribe@egroups.com>
Date: Mon, 18 Sep 2000 14:00:45 +0900
Reply-To: nisus-hub@egroups.com
Subject: Re: [nisus-hub] Working on bilingual files
Status:

Hello Håkan, hello everyone

On Sun, 17 Sep 2000 16:21:58, Håkan Friberg wrote:
> great macro. [...]

Many thanks for your encouragement. Though I know well that the macro in question is mainly composed of borrowed elements, I'm happy to hear such kind words from you, especially because it was your detailed commentary on the Punctuation macro, with which I had been struggling a year or so ago on the other list, that introduced me to grep expressions and the macro language. I have read your explanation again and again.

> [...] The uppercase A-ring might cause problems with scandinavian
> languages, at least Swedish, since Åb- and Åd- are the first two letters in
> a number of Swedish surnames and place names. A lot of people indulge in
> the bad(?) habit of writing surnames with capital letters, especially in
> lists, and this might cause some confusion.

So perhaps it would be better to use Find "Å[BDH]\r" "oSA-itg" ($81[$42$44$48] return_code) instead, though I cannot vouch for its effectiveness until I have tested it on a sufficient number of documents. If so, for files with HTML tags it would be necessary to remove the tags before executing the macro, with something like Replace all "<.:+>" "" "SA-itg".

It is hard to believe that there is a *modern* Japanese document without a period ($81$42 or $81$44) or a question mark ($81$48) at the end of at least one of its paragraphs, even if the document is a short email and most of its paragraphs are not paragraphs in the proper sense. (I'm not subscribed to a "Haiku" or "Tanka" mailing list :-)

But I'm not sure. The degeneration of the Japanese language of nowadays... <sigh> I'll be happy if you don't need to read a certain kind of ugly incarnation of the Japanese language.

> Just one tip:
> Since all values except zero make num_file TRUE, you don't have to write
> if (num_file !=0) goto loop
> It is enough to write
> if (num_file) goto loop
> This trick doesn't save you a lot of time, but it somehow makes you feel
> like a "macro wizard" of sorts, which might be fun.

Thank you for the great tip. I'll never miss a chance to feel more elegant than I am :-) Apart from this, I'm used to valuing memory economy highly, probably because I was always troubled by insufficient memory during the 10 years when I was using a Japanese localized PC (limit of 640KB!).

Greetings from Yokohama. (It is not so hot today, though it is still very humid.)

Yusuke KINOSHITA
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
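
The stricter test proposed in this reply, Japanese punctuation directly before a carriage return with HTML tags removed first, can be sketched in Python as follows. This is an illustration under assumptions, not a translation of the Nisus expressions: the tag pattern is a simplified stand-in, the files are assumed to be Shift-JIS with CR line endings, and the function name is hypothetical.

```python
import re

# Shift-JIS period (0x8142 or 0x8144) or question mark (0x8148) directly before a CR.
PARA_END = re.compile(rb"\x81[\x42\x44\x48]\r")
# Simplified tag pattern (an assumption, not the Nisus expression itself).
TAG = re.compile(rb"<[^>]+>")

def looks_japanese(data: bytes) -> bool:
    """Strip HTML tags, then look for Japanese punctuation at a paragraph end."""
    stripped = TAG.sub(b"", data)
    return PARA_END.search(stripped) is not None

html = "<p>元気ですか？</p>\r".encode("shift_jis")
print(looks_japanese(html))                                 # tag removed, then match
print(looks_japanese("ÅBERG, Anna\r".encode("mac_roman")))  # 0x81 0x42 not before CR
```

Without the tag removal, the closing tag would sit between the question mark and the CR and the match would fail, which is why the reply suggests stripping tags first.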


...To which Philip Spaelti replied...


To: nisus-hub@egroups.com
User-Agent: Nisus Email
From: "Philip Spaelti" <spaelti....>
Mailing-List: list nisus-hub@egroups.com; contact nisus-hub-owner@egroups.com
Delivered-To: mailing list nisus-hub@egroups.com
List-Unsubscribe: <mailto:nisus-hub-unsubscribe@egroups.com>
Date: Mon, 18 Sep 2000 15:39:35
Reply-To: nisus-hub@egroups.com
Subject: Re: [nisus-hub] Working on bilingual files
Status:

On Sun, 17 Sep 2000 20:40:54 +0900 Kino coded...
..
>Show Clipboard
>Find All "\r" "A-Sgt"
>num_file = NumFound
>Close
>// Is there a more cleaver way to know the number of files
>// to be opened?

You could replace this with...

num_file = length(clipboard) - length(clipboard/numtochar(13))

This would avoid the "light show" (and better yet, it keeps the list of files selected in the Catalog). The drawback is that for very large numbers of selected files (or files with very long names) it might be inaccurate, since there is a length limit on variables.

Hope this helps

--------------------------------------
Philip Spaelti
Kobe Shoin Women's University
spaelti@shoin.ac.jp
--------------------------------------
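
Philip's formula counts carriage returns by comparing the length of the text before and after removing them. The Python sketch below shows the same arithmetic. It assumes, as the formula implies, that `/` in the macro language removes every occurrence of the right-hand string, and that each file name in the copied list ends with a CR, as the original Find All count suggests. The function name and sample listing are made up for illustration.

```python
def count_files(clipboard: str) -> int:
    """Count CR characters by comparing lengths before and after removing them."""
    cr = chr(13)  # counterpart of numtochar(13) in the macro
    return len(clipboard) - len(clipboard.replace(cr, ""))

listing = "report.txt\rmail 1\rmail 2\r"
print(count_files(listing))  # number of CR-terminated names in the listing
```

No window has to be opened and no search has to run, which is the "light show" Philip's version avoids.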


...At my request to give me his final macro, Yusuke Kinoshita wrote me...:


Date: Mon, 25 Sep 2000 01:47:34 +0900
From: Kino <quinon....>  
Reply-To: <quinon....>
X-Accept-Language: fr,en,ja
To: n-iyanag@ppp.bekkoame.ne.jp
Subject: Re: off list: WIN unicode utilities
Status:   

Hello Nobumi

[snip...]

As for the macro in question, I'm hesitating over which expression should be used. I found a Japanese email which does not contain any punctuation mark. Maybe an expression matching "-desu/-masu/-dearu" would be better. Or rather, it seems that some frequently used hiragana -- e.g. "ga", "ko", "so", "su", "te", "to", "he", "ma", though I don't know their frequency -- will not cause any problem with languages in roman script. What do you think of this?

// ---------- macro Open JPN? text ----------
// This macro assumes...
// 1. Nisus Text Stationery is NOT in JPN font,
// 2. Catalog is the frontmost window and
// 3. one or more files are already selected.
Show Catalog
MacroCopy
num_file = length(clipboard) - length(clipboard/numtochar(13))
// Thank you, Philip.
If (!num_file) goto warning
key 
// "key" immediately followed by a "space" char.

loop:
Find Next "Å[BDH]:(:H:)*\r" "SA-G-itg"
// Thank you, Håkan.
// The expression 'Å(=0x81)[BDH]' matches Japanese periods
// and question mark followed by CR, when they are in MacRoman font.
// In some cases, the expression 'Å(=0x81)[A-D]' might be more efficient,
// which matches Japanese periods and commas.
If (!NumFound) goto no_jpn
SetSelect (0, EndCharNum)
Osaka
// Shift (macro symbol) immediately followed by font name.
Times
SetSelect (0, 0)
// Please customize the lines above, between the two SetSelect commands, as you like.
// I'm using Saimincho instead of Osaka, for bitmap fonts take less time
// to display themselves.
no_jpn:
Send Back
num_file = num_file - 1
if (num_file) goto loop
// Thank you, Håkan.
exit

warning:
clipboard = "Execute the macro again after having selected file(s) to be opened."
:1
'\CC'
// end of the macro.

My suggestion: transform this macro into "Open Japanese? Chinese? text". Is this difficult?

[snip...]

Yusuke KINOSHITA
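
The hiragana idea raised in this message can be explored with a rough Python sketch (not part of the thread). It assumes Shift-JIS text, in which hiragana use the lead byte 0x82 with trail bytes 0x9F to 0xF1, and it counts the whole hiragana range instead of the specific kana listed above. The function name `hiragana_pairs` is hypothetical.

```python
import re

# Shift-JIS hiragana: lead byte 0x82, trail bytes 0x9F-0xF1.
HIRAGANA = re.compile(rb"\x82[\x9f-\xf1]")

def hiragana_pairs(data: bytes) -> int:
    """Count byte pairs that fall in the Shift-JIS hiragana range."""
    return len(HIRAGANA.findall(data))

mail = "明日行きます".encode("shift_jis")  # a sentence with no punctuation at all
print(hiragana_pairs(mail))
print(hiragana_pairs("Ça va, ÅBERG?".encode("mac_roman")))  # Roman-script text
# How the candidate kana would look when displayed in a MacRoman font:
for kana in "がこそすてとへま":
    print(kana, kana.encode("shift_jis").decode("mac_roman"))
```

In MacRoman the lead byte 0x82 is "Ç", and every trail byte in the hiragana range is itself an accented letter or symbol, so ordinary Roman-script words such as "Ça" do not match.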


Now, here is the macro, Open JPN? tex (5K), which you can download.



This page was last built with Frontier on a Macintosh on Wed, Sep 27, 2000 at 11:31:31. Thanks for checking it out! Nobumi Iyanaga