Xml Wonderland

DocBook XML/SGML Processing Using OpenJade

http://www.study-area.org/tips/doctrans/doctrans.html http://www.study-area.org/tips/docw/docwrite.html 老貢生

#Saturday, January 07, 2006 posted by samuel @ 6:35 AM 0 comments

PassiveTex Notes

1. xsltproc -o teiu5.fo ../FO/tei.xsl teiu5.xml
2. pdfxmltex teiu5.fo <==== this produces the PDF file

TeX reads its settings from /etc/texmf/texmf.cnf:

samuel@pika047:Xml$ grep save_size /etc/texmf/texmf.cnf
save_size = 50000 % for saving values outside current group

If the error "TeX capacity exceeded" appears, change the save_size value. If the file contains two settings for it, TeX only loads the first one.
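Because only the first save_size setting takes effect, it helps to know which one that is before editing the file. The following is a minimal Python sketch under that assumption; the function name first_setting and the second line of the sample are made up for illustration, and the parsing is a simplification of the real texmf.cnf syntax.

```python
import re

def first_setting(cnf_text, name):
    """Return the first uncommented value assigned to `name`, or None."""
    pattern = re.compile(r'^\s*' + re.escape(name) + r'\s*=\s*([^%\s]+)')
    for line in cnf_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None

# Hypothetical excerpt: the first line comes from the grep output above,
# the second line is invented to show a duplicate setting.
sample = """\
save_size = 50000 % for saving values outside current group
save_size = 80000
"""
print(first_setting(sample, "save_size"))
```

To check the real file, pass the contents of /etc/texmf/texmf.cnf instead of the sample string.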

#Wednesday, January 04, 2006 posted by samuel @ 12:22 AM 0 comments

PassiveTex

I also found some PassiveTeX documentation through Google Reader and am keeping it here for reference.

#Friday, December 30, 2005 posted by samuel @ 1:16 AM 0 comments

OPML&Google Reader

I only learned about Google Reader through Joseph’s Blog. Searching online afterwards, I found many more XML applications, one of which is OPML.

OPML -- Outline Processor Markup Language. OPML is an XML-based format that allows exchange of outline-structured information between applications running on different operating systems and environments.
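To make the outline structure concrete, here is a small Python sketch that reads a feed list of the kind a reader application might export. The sample document, its feed URLs, and the helper name feed_urls are invented for illustration; only the nested outline elements and the xmlUrl attribute follow the OPML format.

```python
import xml.etree.ElementTree as ET

# A made-up subscription list: OPML nests <outline> elements inside <body>.
opml_text = """<opml version="2.0">
  <head><title>Example feeds</title></head>
  <body>
    <outline text="XML">
      <outline text="Feed A" type="rss" xmlUrl="http://example.org/a.xml"/>
      <outline text="Feed B" type="rss" xmlUrl="http://example.org/b.xml"/>
    </outline>
  </body>
</opml>"""

def feed_urls(text):
    """Collect the xmlUrl attribute of every outline that has one."""
    root = ET.fromstring(text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]

print(feed_urls(opml_text))
```

The grouping outline ("XML") carries no xmlUrl, so it is skipped and only the two feeds are returned.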

There is also an editor for it, written with the Ontopia Knowledge Suite (OKS).

#Tuesday, December 13, 2005 posted by samuel @ 6:14 PM 0 comments

apache2+tomcat+eXist@Ubuntu

First I installed apache2+tomcat on Ubuntu 5.10. It is enough to confirm that the following packages are installed:

apache2 apache2-common apache2-mpm-worker apache2-utils libapache2-mod-jk2

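Edit the configuration file /etc/apache2/sites-available/default and change the system default DocumentRoot from "DocumentRoot /var/www/" to "DocumentRoot /var/www/apache2-default". A browser that connects to http://localhost will then show the default page. Also use iconv to convert the index.html.zh-tw.big5 that Apache ships into UTF-8 text.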
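The Big5 to UTF-8 step does the same job as iconv -f big5 -t utf-8. As a rough Python sketch of what that conversion amounts to (the helper name and the sample text are made up, and the bytes stand in for the contents of index.html.zh-tw.big5):

```python
def big5_to_utf8(data):
    """Decode Big5 bytes and re-encode them as UTF-8 bytes."""
    return data.decode("big5").encode("utf-8")

# Stand-in for the bytes of index.html.zh-tw.big5 (hypothetical content).
big5_bytes = "預設網頁".encode("big5")
utf8_bytes = big5_to_utf8(big5_bytes)
print(utf8_bytes.decode("utf-8"))
```

For the real file, read it in binary mode, pass the bytes through the function, and write the result back in binary mode.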

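Do not use the gcj/gij that ships with Ubuntu 5.10, because running jar files with it causes problems. Download the JDK from java.sun.com and run the self-extracting archive jdk-1_5_0_06-linux-i586.bin, after which JAVA_HOME will be $HOME/jdk1.5.0_06. Then install by following the steps at http://exist.sourceforge.net/quickstart.html.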

wget http://nchc.dl.sourceforge.net/sourceforge/exist/exist-20051203.war
wget http://nchc.dl.sourceforge.net/sourceforge/exist/eXist-snapshot-20051203.jar
java -jar eXist-{version}.jar
mkdir exist;cd exist;jar xfv ../exist.war
bin/startup.sh -- starts eXist

In fact, startup.sh simply runs the following command line:

/home/samuel/jdk1.5.0_06/jre/bin/java -Xms16000k -Xmx128000k -Dfile.encoding=UTF-8 
-Djetty.home=/home/samuel/eXist/tools/jetty -Dexist.home=/home/samuel/eXist 
-Djava.endorsed.dirs=/home/samuel/eXist/lib/endorsed -jar /home/samuel/eXist/start.jar jetty
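Every path in that command line hangs off two roots, the JDK directory and the eXist directory. The Python sketch below only illustrates that relationship by rebuilding the same argument list from the two roots. It is not how startup.sh itself is written, and the function name exist_command is made up.

```python
import posixpath

def exist_command(java_home, exist_home):
    """Rebuild the argument list of the command line shown above."""
    return [
        posixpath.join(java_home, "jre", "bin", "java"),
        "-Xms16000k", "-Xmx128000k",
        "-Dfile.encoding=UTF-8",
        "-Djetty.home=" + posixpath.join(exist_home, "tools", "jetty"),
        "-Dexist.home=" + exist_home,
        "-Djava.endorsed.dirs=" + posixpath.join(exist_home, "lib", "endorsed"),
        "-jar", posixpath.join(exist_home, "start.jar"),
        "jetty",
    ]

cmd = exist_command("/home/samuel/jdk1.5.0_06", "/home/samuel/eXist")
print(" ".join(cmd))
```

Moving the installation then only means changing the two arguments.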

Check the status through the following link:

Now only the Cxdb configuration is left.

# posted by samuel @ 6:12 PM 0 comments

4Suite@Ubuntu-5.10

On Ubuntu 5.10, the following packages get installed:

python-4suite python-4suite-common python2.4-4suite

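However, Amara 1.0 has problems with them, which means the 4Suite version available through sources.list is older than 1.0b1. So I fetched the latest version from CVS, as follows: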

cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot login (no password)
cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot get 4Suite

Then use distutils to install 4Suite into the system with python setup.py install. After that, Amara runs with far fewer problems.

Introductory articles on XML and XPath, by Uche Ogbuji

Schematron tools

# posted by samuel @ 6:11 PM 0 comments

Amara&4Suite in XML

Version 1.1.7 (2005-12-13)
ftp://ftp.4suite.org/pub/cvs-snapshots/Amara-CVS.tar.bz2

* Deprecate xml_text_content property
* Add xml_child_text property that concatenates all immediate child
  text nodes (no recursive descent)
* Change unicode coercion for documents and elements to recurse through
  all descendant text (now analogous to XPath's string() coercion)
* Update allinone bundle to 4Suite 1.0b3
* Packaging fixes

Amara 1.1.7 requires Python 2.4 or more recent.  If you do not have
4Suite XML 1.0b2 or better, grab the Amara-allinone package.  If you
already have 4Suite XML installed, grab the standalone Amara package.
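The difference between the new xml_child_text property and the changed unicode coercion is the difference between immediate child text and all descendant text. The sketch below shows that distinction with the standard library's ElementTree as a stand-in. It does not use Amara's API, and the two function names are made up.

```python
import xml.etree.ElementTree as ET

def child_text(elem):
    """Concatenate only the text nodes that are direct children of elem."""
    parts = [elem.text or ""]
    # In ElementTree, text that follows a child element is stored as its tail.
    parts.extend(child.tail or "" for child in elem)
    return "".join(parts)

def descendant_text(elem):
    """Concatenate all text below elem, like XPath's string()."""
    return "".join(elem.itertext())

para = ET.fromstring("<p>Hello <b>big</b> world</p>")
print(repr(child_text(para)))
print(repr(descendant_text(para)))
```

The first call leaves out the text inside the b element, while the second includes it.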

I wanted to understand XML in Python better, along with XPath and XQuery, and that is how I found what looks like the best solution:

I wrote the following Xml class, but it runs into a few problems:

from optparse import OptionParser
from sys import argv
from os.path import isdir, splitext
from os import system, getuid
from amara import parse
from amara import domtools

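class XmlImpl:
    """Thin wrapper around an Amara-parsed document (Amara 1.x, Python 2)."""
    def __init__(self, XmlFile):
        self.XmlFile = XmlFile
        self.XmlDoc = parse(self.XmlFile)

    def Node(self):
        # Collect each distinct element name once, in document order.
        self.Nodes = []
        for node in self.XmlDoc.xml_xpath(u'//*'):
            XmlNode = node.nodeName
            if XmlNode not in self.Nodes:
                self.Nodes.append(XmlNode)
        #self.Nodes.sort()
        print('The nodes in %s are:\n%s' % (self.XmlFile, self.Nodes))

    def RootNode(self):
        # Store the name under a different attribute so that it does not
        # shadow this method on the instance.
        for root in self.XmlDoc.xml_xpath(u'*'):
            self.RootName = root.nodeName
        print('The RootNode in %s is: %s' % (self.XmlFile, self.RootName))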

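With the xml_xpath method that amara provides, XmlImpl can obtain the RootNode and all the NodeNames. However, amara requires the form doc.ClinicalDocument.legalAuthenticator.xml() to get at the content of XML segments, so that cannot be written directly into the XmlImpl class. One workaround is to add another method to XmlImpl that writes the NodeNames it finds out to a file, then reads that file back in and runs it inside the original program with exec. That approach is rather hard to follow, though, and a bit ugly.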
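One possible alternative to generating code and running it with exec is to follow the attribute path at run time with getattr. The sketch below assumes that amara exposes child elements as ordinary Python attributes, which the doc.ClinicalDocument.legalAuthenticator form suggests. The Stub class and the resolve helper are hypothetical stand-ins, and the example is written for a current Python rather than the Python 2.4 that Amara 1.x targeted.

```python
from functools import reduce

def resolve(obj, dotted_path):
    """Follow a dotted attribute path such as 'a.b.c' starting from obj."""
    return reduce(getattr, dotted_path.split("."), obj)

# Stand-in objects (hypothetical); a real Amara document would take their place.
class Stub(object):
    def __init__(self, **children):
        self.__dict__.update(children)

    def xml(self):
        return "<stub/>"

doc = Stub(ClinicalDocument=Stub(legalAuthenticator=Stub()))
node = resolve(doc, "ClinicalDocument.legalAuthenticator")
print(node.xml())
```

With this, the path can be assembled as a string from the NodeNames found by xml_xpath. Element names that are not valid Python identifiers may need extra handling, which this sketch does not attempt.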