Getting Started with Cora

(Awali version 2.1; notebook version 14/10/2021)

A short, but necessary, presentation of Awali.

Awali is written in C++ and is organised into three layers: the static, the dynamic, and the interface layers.

The core of the platform is the static level, which provides the data structures and a library of algorithms. The type of a weighted automaton -- called the context -- is determined by the type of weights and the type of labels. Every algorithm is implemented via template functions working on template classes, and is compiled for the type of automata that are used. Efficiency is partly based upon this generic programming which is anyway essential when dealing with weighted automata.

The dynamic level allows both the call of functions of the static level and genuine C++ programming with the use of a kind of universal type of automata. This type is an abstraction of template (static) types of automata; it gives access to a complete template-free API for weighted automata. A remarkable feature of the dynamic level is the on-request compilation of modules (data structures and algorithms) from the static level. Even written at the dynamic level, programs are compiled and not interpreted at execution time.

The interface level offers for the time being two kinds of access to the lower layers: a command-line interface, called Cora, which allows the call to most of the algorithms of the static level, and a Python interface to both the dynamic and the static levels.

The easiest way to begin with Awali platform is probably to play with the command-line module Cora and this is what is described here.

What is at hand ?

After Awali installation, the command cora help gives an overview of the usage of Cora and, as important, the way to list all available commands, examples, and documentation.

In [1]:
!cora help
cora is the line command interface to the Awali platform for
computing automata and transducers.

Usage:  cora [options] <command> <argument> [<argument2>]

where, in most cases, <command> is a function applied to <argument>, usually 
an automaton, a transducer, or a rational (regular) expression ('ratexp' in
awali parlance). The [options] bring more information to <command>, from the
type of some parameters in <argument> to the input or output format to the
method to be used for the computation.

There are more than 80 commands. To give an overview 
   cora list commands 
prints (almost) all commands, organised in a table showing 6 'kinds':
'basic-cmds', 'generic-cmds', 'wfa-cmds', 'nfa-cmds', 'ratexp-cmds', and 
'transducer-cmds', and
   cora list <kind> 
prints the list of the commands in the category <kind> together with a short
description for each command. More classically,
   cora help <cmd> 
prints a more detailed description of the command <cmd>, whereas
   cora help <option> 
gives more information on the option <option>.

By default, the result of <command> is output on the 'standard output'. This is
sensible when the result is a number, a word, or a (small) ratexp. Not when
the result is an automaton or a (long) ratexp, in which case the output is to
be redirected to a file, and the command line looks then like:

   cora <command> <argument>  >  <result-file>

Commands can be 'piped' in order to compose the commands:

   cora <command1> <argument> \| <command2> - \| <command3> -  >  <result-file>

In this case, the '-' refer to the result of the previous command. Such
composition of functions avoids the transformation, forth and back, between
the inner representation of automata (or ratexps) and their linearisation
in files.
	  
A number of automata and ratexps are stored to serve as examples in command
lines. Accordingly, a number of 'factories' allow to build automata that depend
on one, or two, parameters. Automata and expressions can take their 'weights'
or 'coefficients' in various semirings or 'weightsets'.
   cora list automata (or ratexps or factories or weightsets)
gives the list of such available examples or weightsets. 
	  
Finally, some elements of the Awali platform have been documented in such a way
that the corresponding information can be called from cora with the command
cora doc <page> in addition to being readable in the Doxygen pages.

In particular, the command cora list commands yields the list of available commands (over 80, without counting the options we'll explain below). The command

$ cora help <command>

gives more precise information on the command <command>, eg:

In [1]:
!cora help determinize
 Usage : cora determinize <aut> 

Compute the (weighted) determinization of <aut>.
	  
The method is that of (weighted) subset construction, that is, the states of
the determinization are weighted subsets of the state set Q of <aut> or, which
amounts to the same thing, vectors of dimension Q with entries in the weightset
of <aut>.	  

The weightset must be a (locally) finite semiring to insure the finiteness
of the computation. The result is a deterministic automaton and the weight
of a word is given by the final function at the state reached by that word
when input in this automaton.

The command cora list automata yields the list of preloaded automata that are part of Awali distribution (11 presently).

In [3]:
!cora list automata
Automata library :
==================

Name:            Description:                                                  
---------------- --------------------------------------------------------------
a1 ............. NFA that accepts words with at least one factor 'ab'
ab ............. NFA with epsilon transition which recognizes words in 'a*b*'
b1 ............. Z-FA that counts the number of 'b' in words of {a,b}*
binary ......... Z-FA that converts words of {0,1}^* into their binary values
binary-int ..... Z-FA that converts words of {0,1}^* into their binary values
c1 ............. Z-FA that converts words of {a,b}^* into their binary values
d1 ............. Z-FA computes the difference between the numbers of 'a' and
                 'b'
e1 ............. Q-FA that converts words of {a,b}^* into their binary values
                 after the radix point
evena .......... DFA which recognizes words with an even number of 'a'
fibotdc-lr ..... Sequential transducer which tries to replace 'abb' by 'baa'
flipper ........ Transducer that mimics a flipper game
gray ........... Gray code increment
heapmodel ...... Z-max-plus automaton, heap model with 2 pieces
lamplighter .... Group automaton which realizes the Lamplighter group
minab .......... Z-min-plus-automaton which computes the minimum among the
                 numbers of 'a' and 'b' in every word
minblocka ...... Z-min-plus-automaton which computes the length of the shortest
                 block of 'a' in every word
oddb ........... DFA which recognizes words with an odd number of 'b'
prob-rabin ..... R-automaton (probabilistic automaton) that accepts w with
                 prob. expansion(w)
rotation ....... C-automaton realizing a rotation
slowcerny ...... Synchronisable NFA which meets Cerny bound
slowgrow ....... Z-min-plus-automaton with a slow growing length function
url-validator .. Automaton with epsilon transition that validates url


The first command

A quite natural way to get used with Cora commands is to apply them on preloaded automata and, to begin with, with preloaded Boolean automata. For instance the command:

In [4]:
!cora info a1
 Automaton : a1

 Weightset : B

 Label type : Letters		Letter type : Character
 Alphabet : { a, b }

 3 states        		 6 transitions
 1 initial state		 1 final state

yields informations (number of states, of transitions, etc.) on the automaton a1.

The two lines displayed before this output explain what is going on during the waiting time. Awali is installed without that any function at the static level be compiled. When the first function is called or, later, when a function that has not been compiled yet is called, Awali recognises it, launches the compilation of a group of functions (a module) for the "type" of the automaton that is used by the command (the context), create the corresponding links, and executes the command.

If the same command is repeated, the answer is immediate (unless the computation time is noticeable) as Awali will have recognised that the necessary function is already compiled. In what follows, we will not show these messages anymore. The "context" will be explained below (or not).

In [5]:
!cora display a1

opens a pdf viewer which display a graphical representation of a1 generated by Graphviz.

Some classical functions

In [6]:
!cora determinize a1 > a1det.json

computes the determinisation of a1 and stores it, under json format, in the file a1det.json. This automaton that is just created (and which is stored in the current directory) can be called as an operand by any other command.

In [7]:
!cora -Na1det display a1det.json

displays a1det in another window. Thanks to option -N, the name of the temporary file is a1det; without it would have been tmp and the old window would have been updated.

In [8]:
!cora min-quotient a1det.json \| display -

computes the minimal quotient of a1det and displays it in the window tmp.

This line shows how it is possible to chain commands in Cora by means of a kind of "fake pipe" : the result of the first command is sent to the second one which receive it by means of the -.

For instance, Brzozowski's method for the computaion of the minimal automaton of the language accepted by an automaton may be written (on one line):

In [9]:
!cora transpose a1 \| determinize - \| transpose - \| determinize - \| display -

Awali of course implements Kleene Theorem and produces a rational (regular) expression from an automaton:

In [10]:
!cora aut-to-exp a1
(a+b)*(ab)(a+b)*

The method used to compute the expression is the state elimination method (is there really another one?) together with an elementary, natural, and efficient heuristic.

Construction of automata

A first method to constructing an automaton is to use the converse part of Kleene Theorem:

In [11]:
!cora exp-to-aut '(a+b)*(ab)(a+b)*' \| display -

The method used to building this automaton is the one of derived terms, that is the automaton obtained is the one often called Antimirov automaton. Using options, one can also build, with the standard method, the standard automaton of the expression, also called Glushkov or position automaton, which has obviously 7 states:

In [12]:
!cora -Mstandard exp-to-aut '(a+b)*(ab)(a+b)*' \| display -

or the Thompson automaton with the thompson method:

In [13]:
!cora -Mthompson exp-to-aut '(a+b)*(ab)(a+b)*' \| display -

The alphabet on which the automaton is built is implicitely the set of letters that occur in the rational expression that is considered. It could be conceived that the alphabet is indeed strictly larger, as in the following example where the option -A allows to specify the alphabet explicitely:

In [14]:
!cora -Aabc exp-to-aut '(a+b)*(ab)(a+b)*' \| display -

The result is not different from the one obtained before. The difference appears when one tries to make the automaton complete:

In [15]:
!cora -Aabc exp-to-aut '(a+b)*(ab)(a+b)*' \| complete - \| display -

Another method for building automata with Cora consists in using the edit command in order to add, or suppress, states, or transitions, to an automaton that already exists:

$ cora edit a1

or the create command to build an automaton from scratch -- with the specification of the alphabet if the default one A={a,b} is not sufficient:

$ cora -Aabcd create

These two commands open an interactive session in the terminal.

Other commands

Cora offers a large number of commands, the name of which are self-explanatory in general. In particular, it accepts all commands present in Grail or in Fado.

Notice the difference between

$ cora minimal-automaton <aut>

which computes the minimal automaton of the language accepted by the (Boolean) automaton <aut> and

$ cora min-quotient <aut>

which computes le minimal quotient of <aut> (also called minimal bisimulation).

Input--Output

The automata created with Cora (or Awali), and those which are preloaded, are stored under json format (hence the extension in the above examples). Remark that no extension is used for calling preloaded automata, it is the way they are distinguished from others.

One can write an automaton under that json format with the cat command:

In [16]:
!cora cat a1
{"format": {"name":"fsm-json", "version":"1"},
 "kind":"Automaton",
 "metadata":
   {"name":"a1",
    "caption":"NFA that accepts words with at least one factor 'ab'",
    "creator": {"programName":"awali", "version":"2.0.0-rc2"},
    "timestamp": {"date":"2021-06-22", "time":"06:35:52Z"}},
 "context":
   {"labels":
      {"labelKind":"Letters", "letterType":"Char", "alphabet":["a","b"]},
    "weights": {"semiring":"B"}},
 "data":
   {"states":
      [{"id":0, "name":"p", "initial":true},
       {"id":1, "name":"q"},
       {"id":2, "name":"r", "final":true}],
    "transitions":
      [{"source":0, "destination":0, "label":"a"},
       {"source":0, "destination":0, "label":"b"},
       {"source":2, "destination":2, "label":"a"},
       {"source":2, "destination":2, "label":"b"},
       {"source":0, "destination":1, "label":"a"},
       {"source":1, "destination":2, "label":"b"}]}}

Cora, and Awali, output, and read automata written under other description formats, namely those used by the Grail and Fado platforms (for Boolean automata only):

In [17]:
!cora -Ograil cat a1
(START) |- p
p a p
p a q
p b p
q b r
r a r
r b r
r -| (FINAL)

Rational expressions can be read, and output, under a text or a jsonformat. Under the text format, they are written between quotes (for input). In order to be input from a file, they should necessarily be stored under the json format.

In [18]:
!cora cat 'a+b'
a+b
In [19]:
!cora -Ojson cat 'a+b' > e.json
!cora -Ijson cat  e.json       
a+b

Further features of Cora, in particular the computations with weighted automata and with transducers, will be treated in other notebooks and pieces of documentation.