Part of Nobumi Iyanaga's website. n-iyanag@ppp.bekkoame.ne.jp. 11/3/03.

Two Clipboard Utilities for OS X

These are two small clipboard utilities written in AppleScript, meant to be run from the Script menu in OS X. One is called "convertUnicode2STXT" and the other "clip2pure_text".

convertUnicode2STXT

Depending on your system configuration, you may have difficulty copying text from an OS X Unicode-savvy application, like TextEdit, Mail.app or OmniWeb, and pasting it into a Classic application, like Nisus Writer, or even into the OS X Finder. For example, on my machine, when I copy Japanese or Chinese text in OmniWeb, switch to the Finder, and choose "Show Clipboard" in the Edit menu, all I see is a sequence of "???". And of course, if I paste this text into Classic Nisus Writer, all I get is "???..."
This problem seems to be due to several factors that are beyond my understanding, but that involve both the font and the encoding. Looking at the clipboard data with utilities like PasteboardWatcher or Pasteboard Manager (these programs no longer seem to be available online; if you need them, please write to me personally), I see that the contents of the OS X clipboard form a complicated record, consisting of (for example):

NSStringPboardType                            Unicode data (?)
NeXT Rich Text Format v.1.0 pasteboard type   rtf data
NeXT plain ascii pasteboard type              pure ASCII text
CorePasteboardFlavorType 0x52544620           'RTF ' data
CorePasteboardFlavorType 0x75747874           'utxt' data
CorePasteboardFlavorType 0x7573746C           'ustl' data
CorePasteboardFlavorType 0x54455854           'TEXT' data
CorePasteboardFlavorType 0x7374796C           'styl' data
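
The hexadecimal number after "CorePasteboardFlavorType" is simply a four-character type code written as four ASCII bytes. As a small sketch (the handler name fourCharCode is my own, hypothetical, and it assumes uppercase hex digits without the "0x" prefix), the code can be decoded like this:

```applescript
-- Decode a hex string such as "52544620" into its four-character code.
on fourCharCode(hexString)
	set hexDigits to "0123456789ABCDEF"
	set fourChars to ""
	-- Take the hex digits two at a time; each pair is one ASCII byte.
	repeat with i from 1 to (length of hexString) by 2
		set highNibble to (offset of (character i of hexString) in hexDigits) - 1
		set lowNibble to (offset of (character (i + 1) of hexString) in hexDigits) - 1
		set fourChars to fourChars & (ASCII character (highNibble * 16 + lowNibble))
	end repeat
	return fourChars
end fourCharCode

fourCharCode("52544620") -- "RTF " (note the trailing space)
```

In the same way, "75747874" decodes to "utxt" and "7374796C" to "styl".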

We can also examine the contents of the clipboard with the following AppleScript code:

the clipboard as record

In the case of Japanese text copied in OmniWeb, I have "???..." in "NeXT plain ascii pasteboard type" and in "CorePasteboardFlavorType 0x54455854" (TEXT data), while there is readable data in "NSStringPboardType" (which seems to be the Unicode data), in "NeXT Rich Text Format v.1.0 pasteboard type" and in "CorePasteboardFlavorType 0x52544620" (rtf data) [these two are the same; the characters are stored as Unicode values]. The data in "CorePasteboardFlavorType 0x75747874" (utxt data), "CorePasteboardFlavorType 0x7573746C" (ustl data) and "CorePasteboardFlavorType 0x7374796C" (styl data) seem empty in PasteboardWatcher, but I think this is only because PasteboardWatcher's windows are unable to display these kinds of data. [Pasteboard Manager's behaviour is somewhat different, but I think I can avoid talking about this...]

The AppleScript code returns something like this:

{styled Clipboard text:«data styl000...[hex data]», string:"??????? ", uniform styles:«data ustl0000000...[hex data]», Unicode text:"[some Japanese readable text]", «class RTF »:«data RTF 7B5C72746...[hex data]»}
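
If you only want to know which kinds of data are on the clipboard, without dumping their contents, the Standard Additions command "clipboard info" may be enough. This is only a sketch; the variable names are mine, and I assume that the query for a single type returns an empty list when that type is absent:

```applescript
-- Each item of the result is a two-item list: {data type, size in bytes}.
set flavorList to clipboard info

-- Restrict the query to one data type, here the Unicode text item.
set unicodeInfo to clipboard info for Unicode text
```

Run in Script Editor, each line shows its result in the Result window, so you can compare the list with the record shown above.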

The font used for Japanese text in OmniWeb is Hiragino Kaku Gothic W3, although we can specify only two fonts in OmniWeb, one for the Proportional font and another for the Fixed-width font; I use Thryomanes Regular 14 for the Proportional font, and Courier Regular 14 for the Fixed-width font. Hiragino is a Unicode font family (an OpenType font), available only in the OS X environment, and not in Classic applications. If I could use Osaka, for example, which is available in both environments, I would be able to copy text in OmniWeb and paste it into Classic Nisus Writer. Unfortunately, this is not a workable option for me, because I need Unicode fonts in OmniWeb. By the way, PasteboardWatcher, for example, shows the contents of the rtf data in a readable format (not as hex data), but that data is not correct, since the font is specified as Thryomanes and the size as "12" (the font should be "Hiragino Kaku Gothic W3" and the size "14").

From all this, we can presume that even if the font used in an OS X Unicode-savvy application is not available in a Classic application, the Unicode data is always there. So, by extracting this Unicode data from the clipboard record and converting it to styled text data in Classic OS format, it becomes possible to paste the text into a Classic application. This is what my script "convertUnicode2STXT" tries to do. Here is the script:

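try
	-- Get the Unicode text item from the clipboard record if there is one...
	try
		set theText to Unicode text of (the clipboard as record)
	on error
		-- ...otherwise ask for the whole clipboard as Unicode text.
		set theText to (the clipboard) as Unicode text
	end try
	-- Convert it to Classic styled text (this command comes from TEC OSAX).
	set theText to Unicode2StyledText theText fromCode "UNICODE-2-0"
on error errMsg
	-- Nothing usable on the clipboard, or the conversion failed: report and stop.
	display dialog errMsg
	return
end try

-- Put the styled text back, replacing the original clipboard contents.
set the clipboard to theText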

I borrowed the idea of the first lines, in which I try to get the Unicode data from the clipboard record, from John Delacour's web page http://www.bd8.com/eudora/multilingual/, and his script "Paste clip -> utf8" that can be downloaded from that page.

The core of the script is the line:

set theText to Unicode2StyledText theText fromCode "UNICODE-2-0"

This uses TEC OSAX, the scripting addition that serves as a front end for the Text Encoding Converter built into Mac OS. The scripting dictionary for this scripting addition says:

Unicode2StyledText: Convert Unicode text to styled text
	Unicode2StyledText  string  -- Unicode text to convert
		[fromCode  string]  -- Unicode encoding name (default:"UTF-8")
		[preferredScript  small integer]  -- preferred ScriptCode of converted text
	Result:   styled text  -- converted styled text

As you can see, there is an optional parameter "preferredScript". This may be useful especially if the copied text contains Chinese, Japanese or Korean. Because of Han Unification in Unicode, the same Chinese characters may be converted to any of several legacy encodings (Shift_JIS, Big5, GB or EUC-KR). If you don't specify this parameter, the "preferred ScriptCode of converted text" will be the System script in Classic OS (I think...); if you specify, for example:

set theText to Unicode2StyledText theText fromCode "UNICODE-2-0" preferredScript 1

the "preferred ScriptCode" will be set to Japanese. You will find detailed data on script codes at http://developer.apple.com/documentation/Carbon/Reference/Script_Manager/scriptmgr_refchap/enum_group_1.html#//apple_ref/doc/constant_group/Script_Codes; but what is important to know is:

Japanese	1
Traditional Chinese	2
Korean	3
Simplified Chinese	25

So, if you need it, please add the preferredScript parameter to the script.
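
To keep the numbers readable, the script codes from the table above can be given names. The following is only a sketch: the property names are my own invention (they are not part of TEC OSAX), and it assumes that TEC OSAX is installed and that the clipboard contains Unicode text:

```applescript
-- Hypothetical names for the script codes listed in the table above.
property scriptJapanese : 1
property scriptTraditionalChinese : 2
property scriptKorean : 3
property scriptSimplifiedChinese : 25

-- Convert the Unicode text on the clipboard, preferring the Korean script.
set theText to Unicode text of (the clipboard as record)
set theText to Unicode2StyledText theText fromCode "UNICODE-2-0" preferredScript scriptKorean
set the clipboard to theText
```

If you regularly paste text in more than one of these languages, you could save one copy of the script per language, each with a different preferredScript value.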

How to Install and Use...

  1. Please download TEC OSAX from the link above, and put the OSAX in your "ScriptingAdditions" folder, in your Library:
    /Users/[your_account]/Library/ScriptingAdditions/
    If "ScriptingAdditions" folder is not in your Library, please create it [note that there is no space between "Scripting" and "Additions"].
  2. Please create a folder named "ClipboardUtilities", in your "Scripts" folder, in your Library:
    /Users/[your_account]/Library/Scripts/ClipboardUtilities/
    (If "Scripts" folder does not exist in your Library, please create it).
  3. ...And put my script "convertUnicode2STXT" in that folder.
  4. Open your "AppleScript" folder in the "Applications" folder, and double-click the icon of "Script Menu.menu": this makes the "Script" menu available in all OS X applications. In that menu, you will find a menu item "ClipboardUtilities", with a sub-menu item "convertUnicode2STXT".
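
If you prefer, the two folders from steps 1 and 2 can be created from Script Editor instead of the Finder. This is a minimal sketch that only creates the folders; you still have to copy TEC OSAX and the script into them yourself:

```applescript
-- Create both folders in the current user's Library if they are missing.
-- "mkdir -p" creates parent folders as needed and leaves existing folders untouched.
do shell script "mkdir -p ~/Library/ScriptingAdditions ~/Library/Scripts/ClipboardUtilities"
```

Running it a second time is harmless, since existing folders are not modified.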

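When you have installed TEC OSAX and my script "convertUnicode2STXT", you will be able to use it. You will:

  1. Copy the text in an OS X Unicode-savvy application.
  2. Choose "convertUnicode2STXT" from the "ClipboardUtilities" item of the Script menu.
  3. Switch to the Classic application and paste.

The style will not be copied, so if necessary, apply your own style to the pasted text.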


clip2pure_text

This is a simple little "bonus". Running this script removes any style data from your copied text IF THIS IS POSSIBLE (that is, if the clipboard record contains a plain "string" item). The script is:

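try
	-- Keep only the plain "string" item of the clipboard record.
	set theText to string of (the clipboard as record)
on error errMsg
	-- No plain text on the clipboard: report and stop.
	display dialog errMsg
	return
end try

-- Replace the clipboard contents with the unstyled text.
set the clipboard to theText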

To use this script, please put it in your "ClipboardUtilities" folder, in your "Scripts" folder, in your Library:
/Users/[your_account]/Library/Scripts/ClipboardUtilities/


Download

The two scripts can be downloaded from this link (4 KB).

I hope these scripts will be of some use for you.
Please write to me if you have any feedback, bug reports or suggestions.

Thank you in advance!


This page was last built with Frontier on a Macintosh on Mon, Nov 3, 2003 at 10:17:23 PM. Thanks for checking it out! Nobumi Iyanaga