DocBook XML/SGML Processing Using OpenJade

http://www.study-area.org/tips/doctrans/doctrans.html http://www.study-area.org/tips/docw/docwrite.html 老貢生

 

PassiveTex Notes

  1. xsltproc -o teiu5.fo ../FO/tei.xsl teiu5.xml
  2. pdfxmltex teiu5.fo  <==== this produces the PDF file

TeX's configuration file is /etc/texmf/texmf.cnf:

samuel@pika047:Xml$ grep save_size /etc/texmf/texmf.cnf
save_size = 50000 % for saving values outside current group

If a "TeX capacity exceeded" error appears, increase the value of save_size. If the file contains two settings for it, TeX only loads the first one.
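To make the "first setting wins" rule concrete, here is a small Python sketch that scans texmf.cnf-style text and returns the first value it finds for a key. The function name `first_setting` and the sample text are made up for this note, and the parsing is deliberately simplified (only `key = value` lines with `%` comments), so it does not claim to be how TeX itself reads the file.

```python
def first_setting(cnf_text, key):
    """Return the first value assigned to `key`, or None if it is absent."""
    for line in cnf_text.splitlines():
        # Drop '%' comments, as on the save_size line in texmf.cnf.
        line = line.split("%", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


# Hypothetical file content with a duplicate setting.
sample = """\
save_size = 50000 % for saving values outside current group
save_size = 80000 % a later duplicate
"""
print(first_setting(sample, "save_size"))
```

So when raising save_size, it is the first occurrence in the file that has to be edited.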

 

PassiveTex

I also found some PassiveTeX documents through Google Reader; they are listed here for reference.

  1. XmlTo
  2. passiveTex

 

OPML&Google Reader

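I only learned about Google Reader through an introduction on Joseph’s Blog. Searching the web afterwards, I found that there are many more XML applications out there, and one of them is OPML.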

OPML -- Outline Processor Markup Language. OPML is an XML-based format that allows exchange of outline-structured information between applications running on different operating systems and environments.

It also has an editor, which is written with the Ontopia Knowledge Suite (OKS).
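To give an idea of what "outline-structured" means in practice, here is a minimal Python sketch that reads a tiny OPML subscription list with the standard library. The feed titles and URLs are placeholders, and `list_feeds` is a name made up for this example; real exports from a feed reader may carry more attributes.

```python
import xml.etree.ElementTree as ET

# A made-up two-feed subscription list; titles and URLs are placeholders.
OPML_TEXT = """<opml version="1.0">
  <head><title>Example subscriptions</title></head>
  <body>
    <outline text="XML">
      <outline text="Feed A" type="rss" xmlUrl="http://example.org/a.xml"/>
      <outline text="Feed B" type="rss" xmlUrl="http://example.org/b.xml"/>
    </outline>
  </body>
</opml>"""


def list_feeds(opml_text):
    """Return (text, xmlUrl) pairs for every outline that carries a feed URL."""
    root = ET.fromstring(opml_text)
    return [(node.get("text"), node.get("xmlUrl"))
            for node in root.iter("outline")
            if node.get("xmlUrl")]


for title, url in list_feeds(OPML_TEXT):
    print(title, url)
```

The nesting of outline elements is the whole data model: the outer outline acts as a folder and the inner ones are its entries.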


 

apache2+tomcat+eXist@Ubuntu

First I installed apache2 and tomcat on Ubuntu 5.10. Just make sure the following packages are installed:

apache2 apache2-common apache2-mpm-worker apache2-utils libapache2-mod-jk2

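Edit the configuration file /etc/apache2/sites-available/default so that the default document root changes from "DocumentRoot /var/www/" to "DocumentRoot /var/www/apache2-default". A browser that connects to http://localhost will then show the default page. Also use iconv to convert the index.html.zh-tw.big5 page that Apache provides to UTF-8.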
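For reference, the same Big5 to UTF-8 conversion can be sketched in Python. This is only an illustration of what the iconv step does: `big5_to_utf8` and the output file name are made up here, and the demo runs on a throwaway file instead of the real Apache page.

```python
import tempfile
from pathlib import Path


def big5_to_utf8(src, dst):
    """Re-encode a Big5 text file as UTF-8, like `iconv -f BIG5 -t UTF-8`."""
    text = Path(src).read_bytes().decode("big5")
    Path(dst).write_bytes(text.encode("utf-8"))


# Demo on a temporary file; the .utf8 name is a hypothetical choice.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "index.html.zh-tw.big5"
    dst = Path(tmp) / "index.html.zh-tw.utf8"
    src.write_bytes("測試網頁".encode("big5"))
    big5_to_utf8(src, dst)
    converted = dst.read_text(encoding="utf-8")
    print(ascii(converted))
```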

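Do not use the gcj/gij that ships with Ubuntu 5.10, because running jar files with it causes problems. Download the JDK from java.sun.com instead and run the self-extracting archive jdk-1_5_0_06-linux-i586.bin. JAVA_HOME will then be $HOME/jdk1.5.0_06. After that, follow the steps at http://exist.sourceforge.net/quickstart.html to install eXist: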

  1. wget http://nchc.dl.sourceforge.net/sourceforge/exist/exist-20051203.war
  2. wget http://nchc.dl.sourceforge.net/sourceforge/exist/eXist-snapshot-20051203.jar
  3. java -jar eXist-{version}.jar
  4. mkdir exist;cd exist;jar xfv ../exist.war
  5. bin/startup.sh -- starts eXist

In fact, startup.sh simply runs the following command line:

/home/samuel/jdk1.5.0_06/jre/bin/java -Xms16000k -Xmx128000k -Dfile.encoding=UTF-8 \
-Djetty.home=/home/samuel/eXist/tools/jetty -Dexist.home=/home/samuel/eXist \
-Djava.endorsed.dirs=/home/samuel/eXist/lib/endorsed -jar /home/samuel/eXist/start.jar jetty
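Since every path in that command hangs off just two directories, the command can be rebuilt from them. The sketch below assembles the same argument list in Python; `exist_command` is a name made up for this note, and it only reproduces the command shown above rather than everything startup.sh may do.

```python
import posixpath


def exist_command(java_home, exist_home):
    """Assemble the JVM argument list shown above from two base directories."""
    return [
        posixpath.join(java_home, "jre", "bin", "java"),
        "-Xms16000k",
        "-Xmx128000k",
        "-Dfile.encoding=UTF-8",
        "-Djetty.home=" + posixpath.join(exist_home, "tools", "jetty"),
        "-Dexist.home=" + exist_home,
        "-Djava.endorsed.dirs=" + posixpath.join(exist_home, "lib", "endorsed"),
        "-jar",
        posixpath.join(exist_home, "start.jar"),
        "jetty",
    ]


cmd = exist_command("/home/samuel/jdk1.5.0_06", "/home/samuel/eXist")
print(" ".join(cmd))
# To actually launch the server one could pass cmd to subprocess.call().
```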

Check the status through the following links:

  1. Home page
  2. Status

Now only the Cxdb configuration is left.


 

4Suite@Ubuntu-5.10

On Ubuntu 5.10, the following packages get installed:

python-4suite python-4suite-common python2.4-4suite

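However, Amara 1.0 has problems with them, which indicates that the 4Suite version available through sources.list is older than 1.0b1. So I fetched the latest version from CVS, as follows: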

cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot login   # no password
cvs -d:pserver:anonymous@cvs.4suite.org:/var/local/cvsroot get 4Suite

Then install 4Suite into the system with distutils (python setup.py install). After that, Amara runs with far fewer problems.
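What "older than 1.0b1" means for pre-release version strings can be spelled out with a small comparison helper. This is a rough sketch under my own assumptions: `version_key` is made up for this note, it only understands the a/b/rc suffix style, and "1.0a4" is an invented example version. It does not query the installed 4Suite.

```python
import re

# Order of release stages; a final release (no suffix) sorts last.
_STAGE = {"a": 0, "b": 1, "rc": 2, "": 3}


def version_key(version):
    """Turn a string like '1.0b1' into a tuple that sorts in release order."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(a|b|rc)?(\d*)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    numbers = tuple(int(part) for part in match.group(1).split("."))
    stage = _STAGE[match.group(2) or ""]
    serial = int(match.group(3) or 0)
    return (numbers, stage, serial)


for candidate in ["1.0a4", "1.0b1", "1.0b3"]:
    ok = version_key(candidate) >= version_key("1.0b1")
    print(candidate, "is new enough" if ok else "is too old")
```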

Introductory articles on XML and XPath, written by Uche Ogbuji:

  1. Introduction to XML
  2. Get started with XPath

Schematron tools

  1. the Scimitar implementation of ISO Schematron
  2. the Schematron resource

Register first for the tutorial before downloading this schematron-zip file


 

Amara&4Suite in XML

Version 1.1.7 (2005-12-13)
ftp://ftp.4suite.org/pub/cvs-snapshots/Amara-CVS.tar.bz2

* Deprecate xml_text_content property
* Add xml_child_text property that concatenates all immediate child
  text nodes (no recursive descent)
* Change unicode coercion for documents and elements to recurse through
  all descendant text (now analogous to XPath's string() coercion)
* Update allinone bundle to 4Suite 1.0b3
* Packaging fixes

Amara 1.1.7 requires Python 2.4 or more recent.  If you do not have
4Suite XML 1.0b2 or better, grab the Amara-allinone package.  If you
already have 4Suite XML installed, grab the standalone Amara package.
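The difference between "immediate child text" and "all descendant text" in the notes above is easy to see on a small document. The sketch below uses the standard library's ElementTree rather than Amara, so `child_text` and `descendant_text` are only stand-ins written for this note to mimic what the release notes describe, not Amara's own implementation.

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<p>Hello <b>bold <i>deep</i></b> world</p>")


def child_text(element):
    """Concatenate only the text nodes that are direct children."""
    parts = [element.text or ""]
    # In ElementTree, text following a child element is stored as its tail.
    parts.extend(child.tail or "" for child in element)
    return "".join(parts)


def descendant_text(element):
    """Concatenate every text node below the element, like XPath string()."""
    return "".join(element.itertext())


print(repr(child_text(elem)))
print(repr(descendant_text(elem)))
```

The first call skips everything inside the nested elements, while the second walks the whole subtree.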

To get a better understanding of XML in Python, as well as XPath and XQuery, I looked around and found these to be the best resources:

  1. Amara
  2. Chinese XML Now
  3. Uche's blog

I wrote the following Xml class, but it runs into a few problems:

from optparse import OptionParser
from sys import argv
from os.path import isdir, splitext
from os import system, getuid
from amara import parse
from amara import domtools

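class XmlImpl:
    """Wrap an Amara document and report its element names (Python 2)."""

    def __init__(self, XmlFile):
        self.XmlFile = XmlFile
        self.XmlDoc = parse(self.XmlFile)

    def Node(self):
        # Collect each distinct element name once, in document order.
        self.Nodes = []
        for node in self.XmlDoc.xml_xpath(u'//*'):
            XmlNode = node.nodeName
            if XmlNode not in self.Nodes:
                self.Nodes.append(XmlNode)
        #self.Nodes.sort()
        print 'The nodes in %s are:\n%s' % (self.XmlFile, self.Nodes)

    def RootNode(self):
        # Store the name in RootName; assigning to self.RootNode would
        # shadow this method and break any second call.
        for root in self.XmlDoc.xml_xpath(u'*'):
            self.RootName = root.nodeName
        print 'The RootNode in %s is: %s' % (self.XmlFile, self.RootName)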

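With the xml_xpath method that amara provides, XmlImpl can obtain the root node and all of the node names. However, amara requires the form doc.ClinicalDocument.legalAuthenticator.xml() to get at the content of an XML segment, so this cannot be written directly inside the XmlImpl class. A workaround is to add another method to XmlImpl that writes the node names it finds out to a file, then reads that file back in and runs it inside the original program with exec. That approach is rather hard to follow, though, and a bit ugly.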
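One possible alternative to the exec workaround is Python's built-in getattr, which looks up an attribute by name at run time. The sketch below is written under the assumption that amara's bound elements behave like ordinary Python attributes, which I have not verified against amara itself. `Segment` is a stand-in class made up for this note, and `resolve` is a hypothetical helper, not part of amara.

```python
from functools import reduce


class Segment:
    """Stand-in for a bound XML element; NOT the real amara class."""

    def __init__(self, name, **children):
        self.name = name
        for child_name, child in children.items():
            setattr(self, child_name, child)

    def xml(self):
        return "<%s/>" % self.name


def resolve(doc, dotted_path):
    """Follow a path such as 'ClinicalDocument.legalAuthenticator' by name."""
    return reduce(getattr, dotted_path.split("."), doc)


doc = Segment(
    "doc",
    ClinicalDocument=Segment(
        "ClinicalDocument",
        legalAuthenticator=Segment("legalAuthenticator"),
    ),
)
node = resolve(doc, "ClinicalDocument.legalAuthenticator")
print(node.xml())
```

If the assumption holds, the node names collected by XmlImpl could be fed to such a helper as strings, with no generated file and no exec.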

