This file briefly describes the DTD used for saving and loading grammars in the Clark System. It is followed by an example XML document, which is the serialization of one grammar from the system.

<!DOCTYPE Grammars [

<!ELEMENT Grammars (grammar)+>

<!ELEMENT grammar (name,author?,Comment?,xpath?,tokenizer?,filter?,line+)>

<!ATTLIST grammar
    normalize (yes|no) "no"
    match_body (long|short|any_up|any_down) "long"
    match_left (long|short) "long"
    match_right (long|short) "long"
>

<!-- The attributes of the grammar element hold the grammar's normalize option and its match settings (match_body, match_left, match_right). -->

<!ELEMENT name (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT tokenizer (#PCDATA)>
<!ELEMENT filter (#PCDATA)>
<!ELEMENT xpath (#PCDATA)>

<!ELEMENT line (LC?,RE,RC?,RM,Comment?)+>

<!ELEMENT LC (#PCDATA)>
<!ELEMENT RE (#PCDATA)>
<!ELEMENT RC (#PCDATA)>
<!ELEMENT RM (#PCDATA)>
]>
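
The attribute list above gives every grammar option a default value, and the saved example in this document carries no DOCTYPE, so a reader that does not load the DTD may need to supply those defaults itself. The following is a minimal Python sketch of that idea, not part of the Clark System; the names GRAMMAR_DEFAULTS, GRAMMAR_ALLOWED and grammar_options are hypothetical, and the values are simply copied from the ATTLIST declaration.

```python
import xml.etree.ElementTree as ET

# Defaults copied from the ATTLIST declaration of the grammar element.
GRAMMAR_DEFAULTS = {
    "normalize": "no",
    "match_body": "long",
    "match_left": "long",
    "match_right": "long",
}

# Allowed values copied from the same ATTLIST declaration.
GRAMMAR_ALLOWED = {
    "normalize": {"yes", "no"},
    "match_body": {"long", "short", "any_up", "any_down"},
    "match_left": {"long", "short"},
    "match_right": {"long", "short"},
}


def grammar_options(grammar):
    """Return the four grammar attributes, filling in the DTD defaults.

    Hypothetical helper: raises ValueError for a value the DTD does not list.
    """
    options = {}
    for attr, default in GRAMMAR_DEFAULTS.items():
        value = grammar.get(attr, default)
        if value not in GRAMMAR_ALLOWED[attr]:
            raise ValueError(f"{attr}={value!r} is not allowed by the DTD")
        options[attr] = value
    return options


# A grammar element that sets only one attribute explicitly.
sample = ET.fromstring('<grammar match_body="any_up"><name>tag</name></grammar>')
print(grammar_options(sample))
```

A validating parser that reads the DTD would normally make this helper unnecessary; the sketch only covers the case where the file is parsed on its own.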

Example:

<?xml version="1.0"?>
<!-- This Document is created with the Clark System! http://www.bultreebank.org -->
<Grammars>
    <grammar normalize="no" match_left="long" match_body="long" match_right="long">
        <name>tag</name>
        <author>Clark System</author>
        <tokenizer>MixedWord</tokenizer>
        <ElValues></ElValues>
        <line>
            <LC></LC>
            <RE>&quot;quick&quot;|&quot;brown&quot;</RE>
            <RC></RC>
            <RM>&lt;adj&gt;\w&lt;/adj&gt;</RM>
            <Comment></Comment>
        </line>
        <line>
            <LC></LC>
            <RE>&quot;fox&quot;</RE>
            <RC></RC>
            <RM>&lt;n&gt;\w&lt;/n&gt;</RM>
            <Comment></Comment>
        </line>
        <line>
            <LC></LC>
            <RE>&quot;the&quot;</RE>
            <RC></RC>
            <RM>&lt;det&gt;\w&lt;/det&gt;</RM>
            <Comment></Comment>
        </line>
    </grammar>
</Grammars>
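
A file in this format can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the example above and collects the LC, RE, RC, RM and Comment text of each line. It is only an illustration under stated assumptions: read_grammars and LINE_FIELDS are hypothetical names, and the code makes no claim about how the Clark System itself loads grammars.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example document (two of the three lines).
EXAMPLE = r"""<Grammars>
    <grammar normalize="no" match_left="long" match_body="long" match_right="long">
        <name>tag</name>
        <author>Clark System</author>
        <tokenizer>MixedWord</tokenizer>
        <line>
            <LC></LC>
            <RE>&quot;quick&quot;|&quot;brown&quot;</RE>
            <RC></RC>
            <RM>&lt;adj&gt;\w&lt;/adj&gt;</RM>
            <Comment></Comment>
        </line>
        <line>
            <LC></LC>
            <RE>&quot;fox&quot;</RE>
            <RC></RC>
            <RM>&lt;n&gt;\w&lt;/n&gt;</RM>
            <Comment></Comment>
        </line>
    </grammar>
</Grammars>"""

# Child elements of <line>, as named in the DTD.
LINE_FIELDS = ("LC", "RE", "RC", "RM", "Comment")


def read_grammars(xml_text):
    """Return a list of {"name": ..., "lines": [...]} dictionaries.

    Simplification: the DTD lets the (LC?,RE,RC?,RM,Comment?) group repeat
    inside one <line>; this sketch keeps only the first occurrence of each field.
    """
    root = ET.fromstring(xml_text)
    grammars = []
    for grammar in root.findall("grammar"):
        lines = [
            # Missing or empty elements both yield an empty string.
            {field: line.findtext(field, default="") for field in LINE_FIELDS}
            for line in grammar.findall("line")
        ]
        grammars.append({"name": grammar.findtext("name", default=""), "lines": lines})
    return grammars


for g in read_grammars(EXAMPLE):
    print(g["name"])
    for line in g["lines"]:
        print("  ", line["RE"], "->", line["RM"])
```

The parser resolves the entity references while reading, so the RE and RM values come back with literal quotes and angle brackets rather than &quot; and &lt;.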

Figure: example.gif