
Tags

Predefined Output Tags

The following output tags are defined by default in UText/1.2.

Tags.pm module

Basic tags supported by the kernel. These tags are always available; they are registered on instantiation under the module name ”.BASE“.

.pre: shortcut preprocess

This is a pseudotag to preprocess a string. It gets called before the string evaluation begins. It may be called more than once for a single out() call, since every internal out() call triggers it again.

The default preprocessor has a shortcut expansion function that is inactive at startup. It can be activated with this script instruction (see Tags.pm):

set preprocess out to 1

The default .pre processor expands shortcuts marked with the characters _, + and *.

For example, the string

my _new site_ is great

becomes

my [i/]new site[/i] is great

Shortcut preprocessing is disabled inside the tag [code]; that is, within the contents of this tag the characters _, + and * are not translated.

Note: disabling the shortcut preprocessing does not disable the calls to the pseudotags .pre and .post. They always take place and perform other processing that is needed internally.
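The underscore example above can be modelled with a short Python sketch. It is not the UText implementation: the function name and the regular expression are assumptions, and only the _ shortcut is covered because it is the only one whose expansion is shown here.

```python
import re

# Assumption: a shortcut is a run of text between two underscores.
_ITALIC = re.compile(r"_(.+?)_")

def expand_shortcuts(text: str, enabled: bool = True) -> str:
    """Rewrite _text_ as [i/]text[/i] when shortcut expansion is enabled."""
    if not enabled:
        # Mirrors the startup default, where expansion is inactive.
        return text
    return _ITALIC.sub(r"[i/]\1[/i]", text)

print(expand_shortcuts("my _new site_ is great"))
```

With enabled=False the string is returned unchanged, which corresponds to running without "set preprocess out to 1".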

nopreprocess: evaluate without preprocessing

This tag expands by evaluating its arguments with preprocessing disabled. This is useful to evaluate verbatim strings in a script running with set preprocess out to 1.

.post: output postprocess

This is a pseudotag to postprocess the string. It gets called after the string evaluation completes. For a single out() call by a script it is called only once, at the very end; nested out() calls do not trigger it.

Postprocessing can be forced to occur; see out and preprocess_cleanup.

foreach: traverse text

Usage: [foreach/ <selector>]<contents>[/foreach]

The selector gets evaluated and for each unit the contents get expanded.

For example, if you have a text such as:

~website {
 ~webpage =index {
  ~title Joan's Homepage
  [...]
 }
 ~webpage =contact {
  ~title Contact Me
  [...]
 }
 [...]
}

If you now put the following on a particular page:

[foreach/ website][v title][/foreach]

you get the list of page titles: Joan's Homepage Contact Me ...

When the above is output, a cursor with the selector "website" is instantiated, which returns a list containing each "webpage" that is a child of "website". For each child, [v title] is evaluated, yielding the title of the respective page.

To format the list as an HTML list you can use this:

[ul/][foreach/ website][li/][v title][/li][/foreach][/ul]

A foreach can contain an embedded foreach; there is no limit to the nesting depth.

This tag is the counterpart to the script function select.
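The traversal described above can be pictured with a small Python model. This is only a sketch of the idea: the Unit class and the helper names are hypothetical and do not exist in UText.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Unit:
    """Hypothetical stand-in for a text unit: a role, a value and children."""
    role: str
    value: str = ""
    children: List["Unit"] = field(default_factory=list)

def first_value(unit: Unit, role: str) -> str:
    """Model of [v <role>]: value of the first matching child, else ''."""
    for child in unit.children:
        if child.role == role:
            return child.value
    return ""

def foreach_titles(website: Unit) -> List[str]:
    """Model of [foreach/ website][v title][/foreach]."""
    return [first_value(page, "title")
            for page in website.children if page.role == "webpage"]

site = Unit("website", children=[
    Unit("webpage", children=[Unit("title", "Joan's Homepage")]),
    Unit("webpage", children=[Unit("title", "Contact Me")]),
])
print(" ".join(foreach_titles(site)))
```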

v: get value

Usage: [v <selector>] or [v/ <selector>]<default value>[/v]

The selector and the default value are optional.

Evaluating [v] returns the binary data associated with the current unit. If the current unit happens not to be a binary unit, it returns the binary data associated with the default binary child of the current unit. If there is no binary data at all, it returns an empty string.

Evaluating [v <selector>] instantiates the selector and returns [v] for its first child. If the selector is empty, it returns an empty string. After evaluation the current unit remains unchanged.

Evaluating [v/...]<default>[/v] returns the given default value instead of an empty string if no binary data was found. The default value is a string that can contain embedded tags, too. For example with this code:

[v/ fulltitle][v title]. [v subtitle][/v]

you get the value of the child ”fulltitle“ or, if it does not exist, a string "{title}. {subtitle}" composed with the child values for ”title“ and ”subtitle“.

The value is expanded with $ut->out() before returning it. If a value contains tags, all of them are expanded before returning the value.
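The fallback rule for [v/ fulltitle][v title]. [v subtitle][/v] can be sketched in Python as follows. The dictionary model of a unit and the function name are assumptions made for illustration; tag expansion of the returned value is not modelled.

```python
from typing import Callable, Dict, Optional

def value_or_default(unit: Dict[str, str], selector: str,
                     default: Optional[Callable[[Dict[str, str]], str]] = None) -> str:
    """Return the child's value, else the computed default, else ''."""
    value = unit.get(selector, "")
    if value:
        return value
    # The default is only evaluated when no value was found.
    return default(unit) if default else ""

def title_and_subtitle(unit: Dict[str, str]) -> str:
    return f"{unit.get('title', '')}. {unit.get('subtitle', '')}"

page = {"title": "Joan's Homepage", "subtitle": "All about Joan"}
print(value_or_default(page, "fulltitle", title_and_subtitle))
```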

vv: get literal value

Returns the value as a literal: the same as v, but without performing $ut->out().

u: get unit information

This tag retrieves the name of the current unit, or its role or type. If the queried unit has no name, it returns an empty string.

One can set a default value with [u/]<default>[/u]; it is returned if the property value is empty.

It can also be called with a selector, as in [u <selector>], [u.role <selector>] etc., in which case it returns the unit property for the first unit in the selector.

Example: suppose you have a text such as this:

^webpage {
 h : string
 h1 : h
 h2 : h
}
[...]
~webpage {
 ~h1 =first First Things First
 [...]
 ~h1 =second The Main Point
 [...]
 ~h2 =chapter1 Chapter One
 [...]
 ~h2 =chapter2 Chapter Two
 [...]
 ~h1 =last Last but not Least
 [...]
}

Then with the expression

[foreach/ :h]<[u.role] id="[u]">[v]</[u.role]>[lf][/foreach]

you get this:

<h1 id="first">First Things First</h1>
<h1 id="second">The Main Point</h1>
<h2 id="chapter1">Chapter One</h2>
<h2 id="chapter2">Chapter Two</h2>
<h1 id="last">Last but not Least</h1>

cnum: child number

It expands as the ordinal number of the current unit inside a cursor. If called after a cursor is closed, it returns the number of items the cursor had.

sep: separator

Use inside a cursor to separate elements. It expands as the given string except at the last cursor element, where it returns nothing.

Example:

[foreach/ name][v][sep , ][/foreach].

Result: ”Mary, John, Alma.“
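The separator rule can be mimicked in a few lines of Python. This is a sketch of the behavior described above, not UText code; the function name is hypothetical.

```python
from typing import List

def render_with_sep(items: List[str], sep: str = ", ") -> str:
    """Emit each item followed by sep, except after the last item."""
    out = []
    last = len(items) - 1
    for index, item in enumerate(items):
        out.append(item)
        if index != last:
            out.append(sep)
    return "".join(out)

# The trailing period comes from the text after [/foreach].
print(render_with_sep(["Mary", "John", "Alma"]) + ".")
```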

if: conditional values

An if tag takes one value per line, each preceded by its condition in prefix notation as the first word:

[if/]
    :h1     <h1>[v]</h1>
    :p      <p>[v]</p>
    :html   [v]
[/if]

This means: if the current unit has type h1 then return <h1>[v]</h1>, etc. The line order matters: the first matching line is returned and the rest are ignored.

Conditions:

Leading and trailing whitespace is ignored in both the condition and the value. To get spaces at the beginning or the end, enclose the value in ". For example:

:p " <p>[v]</p>"

This returns the value with a single leading space character. Quoting can be used in conditions, too:

="" no name

The above returns the string ”no name“ if the current unit has no name.
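The first-match rule and the quoting rule can be sketched in Python. Only the two condition forms that appear in the examples (:type and ="name") are handled; the function names are hypothetical and the returned value is left unexpanded.

```python
def unquote(text: str) -> str:
    """Strip whitespace, then remove one pair of enclosing double quotes."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text

def matches(condition: str, unit_type: str, unit_name: str) -> bool:
    if condition.startswith(":"):      # type test, e.g. :h1
        return condition[1:] == unit_type
    if condition.startswith("="):      # name test, e.g. =""
        return unquote(condition[1:]) == unit_name
    return False

def if_tag(body: str, unit_type: str, unit_name: str = "") -> str:
    """Return the value of the first line whose condition matches."""
    for line in body.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        value = unquote(parts[1]) if len(parts) > 1 else ""
        if matches(parts[0], unit_type, unit_name):
            return value
    return ""

BODY = """
    :h1     <h1>[v]</h1>
    :p      <p>[v]</p>
    :html   [v]
"""
print(if_tag(BODY, "p"))
```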

load: load module

[load <module> ...] loads the given add-in modules (one or more names separated by spaces). The interpreter looks for a Perl module <module>.pm at the Perl path and under the directories UText and UText/modules. This module must contain a package <module>;. All functions in this package that correspond to hook names are automatically registered.

With the modifier ”bind“, the modules are bound after loading.

bind: set module bindings

A tag [bind <module name>] activates the hook bind for the given add-in module, in order for it to set the output bindings. It fails if no add-in module with this name exists or if it has no method bind.

unbind: remove module bindings

A tag [unbind <module name>] removes all bindings for the given module, if there are any; it does not fail if there are none. If an add-in module with this name exists and it has a function unbind, it is called before removing the bindings.

read: read file

[read.<type>/ <filename>, <filename>, ...]<UTL pattern>[/read]

Feeds the given files into the text repository. The file names can contain the wildcards * and ?. If the UTL pattern is given, the file contents are embedded in it before feeding. If the type is given, the file contents are extracted through the given type driver, for example ”odt“ for word processor documents in OpenDocument Format (see cms add-in). Example:

[read/ smith.utl]
=geneaweb
=smith ~family {
%content
}
[/read]

This tag corresponds to the kernel function getfile, see there for more information about UTL pattern arguments and adding file type drivers.

feed: feed UTL

Parses the given UTL string and enters it into the text repository. Examples:

[feed ^person :string]
[feed/]
^woman : person
^man : person
[/feed]

print: console output

[print <message>] outputs the message to the console at run time. This tag expands as an empty string; its purpose is just to inform the user about something.

lit: literal output

[lit <string>] outputs the string as it was given. If the string contains tags, they are not expanded.

utl: express in UTL

The tag [utl] expands as the UTL expression of the current unit.

The tag [utl <selector>] expresses the units returned by the selector in UTL.

Example: [utl =transformation] returns:

^transformation {
        ^function :string
}

save: save a file

To save strings as OS plain text files:

[foreach/ webpage]
    [save/ [u].html]
        ...
    [/save]
[/foreach]

The contents of the tag are saved under the given name. Both the name and the content expression are expanded with $self->out() before use and can therefore contain embedded tags.

For details about file saving see the description of the function save.

dump: dump text repository

[dump <filename>]

This tag dumps the whole text repository data into a plain text file for debug purposes.

If the file name is missing, the data is saved to the file out.dump.

[dump.time <filename>] or [dump.t <filename>]

With the modifier ”time“ or ”t“ the timestamps of each text unit are also output.

sb: output square brackets

Square brackets are always interpreted as belonging to a tag mark. If you want to output a square bracket, you can use this tag.

lf: newline

A [lf] tag gets expanded as a new line character. This can be useful for example inside a [foreach] tag.

Note: Internally, new lines are converted to [lf] when expanding tags, and at the end these are converted to new lines again. Because of that, if you want to output [lf] itself (such as in this paragraph), you cannot write [sb lf] in the source, because you would get a new line; write [sb LF] instead.

inline: convert a multiline string to a single line

The tag [inline/ NL]content[/inline] expands as the content as a single line after replacing all line breaks with the given NL character or string. NL defaults to [lf].
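As a rough illustration of the replacement rule, the following Python sketch joins a multiline string with a chosen NL string. The function name is hypothetical, and treating "\r\n" as a single line break is an assumption.

```python
def inline(content: str, nl: str = "[lf]") -> str:
    """Replace every line break in content with the given NL string."""
    # Assumption: Windows line endings count as one line break.
    return content.replace("\r\n", "\n").replace("\n", nl)

print(inline("first line\nsecond line", " "))
```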

perl: execute perl code

With this tag you can let the Perl interpreter evaluate an expression. The tag gets expanded as the results from the Perl evaluation.

Examples: [perl 3+4] will expand as 7. [perl somefunc(3,4)] calls the function somefunc with the parameter list (3,4) and returns its result.

The default namespace of a function is UText. If you want to call a function in another namespace, the name must be fully qualified. Example:

[perl main::myfunc('hello')]

At execution time the function has a variable $self in scope that is a reference to the current UText object. So for example if the called function needs to open a cursor it can do $self->foreach(...).

Before invoking the Perl interpreter, the parameter is evaluated. For example when expanding

[perl main::important('my site [v website.title] is great')]

and supposing your website's title is "Joan's Homepage", the function main::important is called with the parameter ”my site Joan's Homepage is great“.

The execution aborts if the Perl expression returns an undefined value.

script module

The module script provides a tag for embedding script instructions in a string. It is always active: the module is loaded at startup and it is automatically bound to each instantiated UText object.

script: UText Script interpreter

Usage: [script <instructions>]

Or: [script/]<instructions>[/script]

With this tag a script can be embedded in a string. Example:

~html ''
[script/]
    print Generating family list
    out <ul>
    select #family . "yes" in-toc begin
        out <li>
        v name
        sp
        v years
        out </li>
    end
    out </ul>
[/script]
''

The output of the script is the result of the [script] tag expansion; the above would expand as an HTML unordered list of family names. The sentence ”Generating family list“ is output to the console when the script runs. It does not belong to the script's output and is therefore not included in the ~html value.

Every script instruction can be used here.

There is also a parser [*script], see Universal-Text Script for more information.

get: return setting value

Usage: [get <setting name>] or [get <provider> <setting name>]

This tag expands to the current value of the setting. The provider name can be omitted if the setting name is unambiguous.

cms add-in module

Tags to generate websites with localization support. This module is inactive by default. It must be activated before performing any out call that uses its tags, either in a UText script:

 load cms 
 bind cms 

or in a Perl script:

$ut->load_cms()

i: html tag <i>

b: html tag <b>

em: html tag <em>

sub: html tag <sub>

sup: html tag <sup>

ul: html tag <ul>

ol: html tag <ol>

li: html tag <li>

br: html tag <br>

hellip: html entity &hellip;

mdash: html entity &mdash;

ab: angle brackets

This tag outputs angle brackets that are not to be interpreted as HTML by the browser. With [ab something] or [ab/]something[/ab] you get <something>. You get just an opening angle bracket < with [ab.o] and a closing angle bracket > with [ab.c].

cb: curly brackets

To get curly brackets. Same syntax as [ab].

nz: expand if non-zero length

The tag [nz/<parameter>]<expression>[/nz] expands to the expression if the parameter has one or more characters and expands to an empty string otherwise. For example:

[nz/[v title]]"[v title]"[/nz]

If the variable ”title“ is empty, this tag expands to an empty string; if not, it returns the title between quotation marks.

To avoid expanding the parameter twice, one can use the pseudo-variable $param. The above could be written as:

[nz/[v title]]"$param"[/nz]

z: expand if zero length

The tag [z/<parameter>]<expression>[/z] expands to the expression if the parameter is an empty string. For example:

[v title][z/[v title]](No Title)[/z]

This expands to the value of the field ”title“ or, if it is empty, to ”(No Title).“

xml: xmlify

Cuts out HTML code to convert a string to valid XML.

The tag [header] expands to its parameters plus contents after all HTML markup has been removed, which makes them suitable for an HTML header tag. For example, if you write

<title>[header [v webpage.title]]</title>

and the page title is "Joan's _unbelievable_ Site", so that [v webpage.title] expands to Joan's <i>unbelievable</i> Site, then you will get

<title>Joan's unbelievable Site</title>
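The markup removal shown in this example can be approximated in Python with a regular expression. This is a deliberately naive sketch under the assumption that markup consists of simple <...> tags; it does not claim to be how UText cleans the string.

```python
import re

# Assumption: anything between < and > is markup to be dropped.
_MARKUP = re.compile(r"<[^>]+>")

def strip_markup(text: str) -> str:
    """Remove HTML tags, keeping only the text between them."""
    return _MARKUP.sub("", text)

print("<title>" + strip_markup("Joan's <i>unbelievable</i> Site") + "</title>")
```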

meta: header meta name line

A tag [meta/ <name>]<content>[/meta] is output as a meta line at the html header section with name and content. For example [meta/ author]Joan Cyber[/meta] becomes:

<meta name="author" content="Joan Cyber" />

The tag content is cleaned up to make it suitable for the HTML header, and any embedded " are converted to '.

A tag [meta <name>] loops through each variable <name> in the current unit and generates a meta tag with its value.

It is equivalent to this:

[foreach/ <name>][meta/ <name>][v][/meta][/foreach]
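A minimal Python sketch of how such a meta line could be assembled is shown below. The function name is hypothetical, and the clean-up step is reduced to two assumed operations: dropping simple <...> tags and turning " into '.

```python
import re

def meta_line(name: str, content: str) -> str:
    """Build a meta line, cleaning the content for use in an attribute."""
    cleaned = re.sub(r"<[^>]+>", "", content)   # assumed markup removal
    cleaned = cleaned.replace('"', "'")         # embedded " become '
    return f'<meta name="{name}" content="{cleaned}" />'

print(meta_line("author", "Joan Cyber"))
```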

url: external web link

To set a link to the World Wide Web.

Usage: [url.<class>/ <absolute url> <title>]<caption>[/url]

Or: [url.<class> <absolute url> <title>]

The class modifier, title and caption are optional. The output looks like:

<a href="{target url}"
   title="{title}"
   class="external {class}"
>{elink mark}{caption}</a>

link: internal link by URL

To set a link by URL to a page at the same website. This can be useful if some parts of the website are not generated by the interpreter (because otherwise you would use [n]).

Usage: [link.<class>/ <relative url> <title>]<caption>[/link]

Or: [link.<class> <relative url> <title>]

The class modifier, title and caption are optional. The output looks like:

<a href="{target url}"
   title="{title}"
   class="{class}"
>{caption}</a>

download: internal download link

To set a special type of internal link by URL that points to a download resource.

Usage: [download.<class>/ <url> <title>]<caption>[/download]

Or: [download.<class> <url> <title>]

The class modifier, title and caption are optional. The output looks like:

<a href="{absolute url}"
   title="{title}"
   class="download {class}"
>{caption}</a>

code: render code

The tag [code] gets expanded by enclosing its parameters and contents between <code>...</code>. The parameters and contents get evaluated but shortcuts are not expanded. Apart from that, the characters < > & are converted to &lt; &gt; &amp; so that HTML code is shown and not interpreted by the browser as HTML markup.
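The character conversion can be reproduced with Python's standard html module. The wrapper function is a hypothetical sketch; tag evaluation of the contents is not modelled.

```python
import html

def code_tag(content: str) -> str:
    """Escape < > & and wrap the result in <code>...</code>."""
    # quote=False leaves " and ' untouched, matching the three
    # characters named in the description.
    return "<code>" + html.escape(content, quote=False) + "</code>"

print(code_tag('<a href="x">Q&A</a>'))
```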

img: embed image

Generates an <img> tag that can optionally link to a URL.

Usage: [img.<class> <relative image file> <url> "<caption>"]

The class modifier, url and caption are optional. The output looks like:

<img src="{target image file}" class="{class}" alt="{caption}"/>

or, if a target url was given, the above is enclosed in this <a> tag:

<a href="{target url}" title="{caption}">...</a>

tab: tabulator

The [tab] tag is expanded as four non-breaking spaces (HTML entity &nbsp;). The expanded characters can be changed with the setting expand tab.

doctype: document type line

This tag expands as a html header line with some legacy standard document type definitions.

Usage: [doctype <type> <language>]

For example [doctype xhtml11 de] gets expanded as:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">

d: standard date

Outputs the date of a particular text unit.

Usage: [d <type> <unit name>]

Possible types are ”cr“ (creation date) and ”up“ (last update date). If the unit name is omitted, it returns the date of the current unit. The date is formatted in the common numeric form for the current language, with a two-digit year. Example: "05/28/09" (English), "28.05.09" (German).

The modifier [d.ts ...] outputs the date and time as a timestamp (for ~timestamp units and [time ...] tags).

For debug purposes there is a modifier [d.dbg ...] that shows date and time.

dl: long date

The same as [d] but with a four-digit year.

ds: string date

The same as [d] but the date is output in words. Example: "Thu May 28, 2009".

time: timestamp

Given a timestamp it returns a formatted date/time string.

Usage: [time <timestamp>]

Modifiers: short or no modifier returns a numeric date (as [d]), long returns a string date (as [ds]), rss returns a date in RFC822 format as used in RSS feeds (e.g. Thu, 15 Jul 2010 05:29:25 +0000).

The timestamp is expected to have the format ”yyyy-mm-dd hh:mm:ss tz“. The time zone is optional. When it is given, the timestamp is interpreted as local time in that time zone. Time zone support comes from the operating system; all installed time zones are recognised. If no time zone is given, the timestamp is assumed to be UTC.

Examples:

[time 2010-06-08 08:01:23]
[time.rss 2010-06-08 08:01:23]
[time 2010-06-08 08:01:23 Europe/Paris]
[time 2010-06-08 08:01:23 CET]
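The timestamp rules can be tried out with a Python sketch built on the standard library. It is an illustration only: the function names are hypothetical, only the short and rss modifiers are modelled, the short form is fixed to the English pattern, and keeping the zone's own offset in the rss output is an assumption.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from zoneinfo import ZoneInfo

def parse_timestamp(stamp: str) -> datetime:
    """Parse 'yyyy-mm-dd hh:mm:ss [tz]'; without tz the time is UTC."""
    parts = stamp.split()
    naive = datetime.strptime(" ".join(parts[:2]), "%Y-%m-%d %H:%M:%S")
    zone = ZoneInfo(parts[2]) if len(parts) > 2 else timezone.utc
    return naive.replace(tzinfo=zone)

def time_tag(stamp: str, modifier: str = "short") -> str:
    moment = parse_timestamp(stamp)
    if modifier == "rss":
        return format_datetime(moment)      # RFC 822 style date
    return moment.strftime("%m/%d/%y")      # English numeric form

print(time_tag("2010-06-08 08:01:23", "rss"))
```

Time zone names such as Europe/Paris are resolved by ZoneInfo from the time zone data installed on the system.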

q: quotation marks

This tag is used to insert quotation marks. Typographic quotes ”“, angle quotes «» and typewriter quotes "" are supported. The quotation marks that are output on expansion depend on the current language of the interpreter and follow its usual typographic conventions. For example, in a German text you get »such quotation marks«.

The additional modifier characters ”s“ and ”d“ allow you to get single or double quotes:

If you need to output just one quotation mark, you can additionally use these modifiers:

You can combine the modifiers of each group, in any order. For example, [q.soa] expands to a single opening angle quotation mark ‹, or in a German text ›.

Quotation marks expand as characters in the UTF-8 character set. To expand as legacy html entities such as &ldquo; instead, call

perl set_html_entity_quotes(0)

more: hidden text

With this tag one gets some contents on a web page that can be dynamically shown or hidden when browsing.

Usage: [more/]<some additional contents here>[/more]

This is expanded as two span elements. The first one is called more<N>short (N being a consecutive number for all more elements inside the current page) and shows just an img/show.jpg icon. When the user clicks on it, the first element is hidden and the second element appears. It is called more<N>long and shows the additional contents and an icon img/hide.jpg to collapse it again.

Requirements: The page must define these JavaScript functions:

// Details show / hide:
function show(name) {
  document.getElementById(name+'short').style.display="none";
  document.getElementById(name+'long').style.display="inline";
}
function hide(name) {
  document.getElementById(name+'short').style.display="inline";
  document.getElementById(name+'long').style.display="none";
}

The element names ”show“, ”hide“, ”more“, ”long“, ”short“ and the JavaScript function names are localised, for example in a German page they are called respectively ”zeigen“, ”verstecken“, ”mehr“, ”lang“, ”kurz“.

The shown icons with the plus and the minus sign are respectively img/show.jpg and img/hide.jpg.

Additionally the style sheet must hide and show the contents accordingly:

span.more.short { display:inline; }
span.more.long { display:none; }

file: resolve local file names

A tag [file <target file name>] gets expanded as the file name, prepending directory names or ../ if the current page and the target file are located in separate directories.

For example you can put this at each html page header:

<link rel="stylesheet" href="[file ressources/style.css]" type="text/css" />

This will output href="ressources/style.css" for all pages at the root directory of the website, and will output href="../ressources/style.css" at pages located in a first level directory, and just href="style.css" on a page that is itself at the directory ”ressources“.
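The three cases above can be reproduced with posixpath.relpath from the Python standard library. This sketch only models the path arithmetic; the function name and the example page names are assumptions.

```python
import posixpath

def resolve_file(target: str, current_page: str) -> str:
    """Return target relative to the directory of current_page.

    Both paths are given relative to the website root.
    """
    start = posixpath.dirname(current_page) or "."
    return posixpath.relpath(target, start)

for page in ("index.html", "blog/post.html", "ressources/about.html"):
    print(page, "->", resolve_file("ressources/style.css", page))
```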

n: internal link by name

This tag sets a link to another page referenced by unit name.

Usage: [n <unit-name><anchor> <title>]

Or: [n/ <unit-name><anchor> <title>]<caption>[/n]

Anchor, title and caption are optional. This gets expanded as:

<a href="{url}{anchor}" title="{title}">{caption}</a>

or, if the linked unit happens to be the current unit:

<a class="inactive" title="{title}">{caption}</a>

Note: This tag is currently unsatisfactory because it is not selector-aware. It should be generalised.

With a modifier you can set the class of the HTML a tag. For example, [n.alt name] generates the HTML anchor <a class="alt">.... In this case, the class for the inactive anchor is <a class="alt_inactive">....
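The active and inactive forms of the anchor can be summarised in a Python sketch. The function and its parameters are hypothetical, and the position of the class attribute in the active form is an assumption, since only the inactive form is spelled out in full.

```python
def n_link(url: str, caption: str, title: str = "", anchor: str = "",
           is_current: bool = False, css_class: str = "") -> str:
    """Build the anchor for an internal link referenced by unit name."""
    if is_current:
        # A link to the current unit carries no href, only an inactive class.
        inactive = f"{css_class}_inactive" if css_class else "inactive"
        return f'<a class="{inactive}" title="{title}">{caption}</a>'
    class_attr = f' class="{css_class}"' if css_class else ""
    return f'<a{class_attr} href="{url}{anchor}" title="{title}">{caption}</a>'

print(n_link("contact.html", "Contact", "Get in touch"))
print(n_link("contact.html", "Contact", "Get in touch", is_current=True))
```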

The names of the selectors and roles to be used by this tag can be changed through the following settings (in parentheses the default value):

newsite: set the url of the website being generated

This tag calls the function new_site.

newpage: set the name of the webpage being generated

This tag calls the function new_page.

nav: navigation elements

This tag renders navigation elements on a web page.

Usage: [nav.<layout> <navigator>]

Note: This tag is experimental and poorly implemented!

Possible layouts are:

The following settings control the formatting of the sequence layout (default value in brackets):

The {navigator} is referenced by name. It is expected to have this structure:

^ navigator {
 ^ caption : string
 ^ item {
  ^ caption : string
  ^ title : string
  ^ page : string
  ^ item : item
 }
}

The generated html code has the css class ”navigation“.

The names of the roles and class above can be changed with the following settings (in parentheses the default value):

Before any of them is used, all navigators must be loaded once with:

$ut->load_navigators('<selector>');

Or in a script with:

navigators <selector>

where <selector> is a selector that returns the navigator units to be processed.

This loads the navigator data to be used by the tag [nav].

cref: item cross-references

This tag will provide cross-references between items.

Note: This tag is not yet implemented. For now it is just a stub for blog items that returns the name of the referenced item without linking to it.

opentag: html open tag

opentag expands as an html tag that corresponds to the current unit. For example, if the interpreter is situated at a unit with role p, it expands as <p>. If the unit has a type different from its role, the type is added as class, for example ~p :note becomes <p class="note">. If the unit has a name, it is used as identifier, for example =note1 ~p becomes <p id="note1">.

opentag type expands using the type instead of the role, for example at ~div :article it becomes <article>.

closetag: html close tag

closetag expands as a closing html tag. For example, if the interpreter is situated at a unit with role p, it expands as </p>.

closetag type expands using the type instead of the role, for example at ~div :article it becomes </article>.
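The opentag and closetag rules can be condensed into a Python sketch. The functions and parameters are hypothetical. Two points are assumptions: the attribute order when a unit has both a name and a differing type, and the omission of the class attribute when the type is used as the tag name.

```python
def opentag(role: str, unit_type: str = "", name: str = "",
            use_type: bool = False) -> str:
    """Build the opening HTML tag for a unit."""
    tag = unit_type if use_type and unit_type else role
    attrs = ""
    if name:
        attrs += f' id="{name}"'            # unit name becomes the id
    if not use_type and unit_type and unit_type != role:
        attrs += f' class="{unit_type}"'    # differing type becomes the class
    return f"<{tag}{attrs}>"

def closetag(role: str, unit_type: str = "", use_type: bool = False) -> str:
    """Build the closing HTML tag for a unit."""
    tag = unit_type if use_type and unit_type else role
    return f"</{tag}>"

print(opentag("p", "note") + "..." + closetag("p", "note"))
```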

UText Shell

The following tag is only available in the interactive shell (see UText Shell).

help: shell usage

Prints out a few lines with an overview on the shell usage.

ut> help
Usage:
<function>
or: <function> <parameters>
or: <function> <parameters> do <inline body>
or: <function> <parameters> begin <multiline body> end
: separate instructions inside a single line
\ instruction continues at next line
. quits the interpreter
.reset resets the UText object and the script interpreter instance
Use 'show' to get information about current settings.
Call the shell with -h to see the command line arguments.