Rational expressions


In [1]:
# We disable autosave for technical reasons.
# Replace 0 by 120 in next line to restore default.
%autosave 0
Autosave disabled
In [2]:
import awalipy # If import fails, check that 
               # Python version used as Jupyter
               # kernel matches the one
               # Awalipy was compiled with.
[Warning] The python module awalipy relies on compilation executed "on-the-fly" depending on the context (type of weights, of labels, etc.). As a result, the very first call to a given function in a given context may take up to 10 seconds. 

Creating a RatExp

When a rational expression is parsed, operator precedence is: star > concatenation > union. In other words,

  • a+(b*) = a+b* != (a+b)*
  • a(b*) = ab* != (ab)*
  • a+(bc) = a+bc != (a+b)c
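
The precedence rules above can be illustrated with a small recursive-descent parser. This is only a sketch in plain Python, not Awali's parser: the function name `parenthesize` is hypothetical, and it handles only single-character letters, `+`, `*` and parentheses. It returns the input with every operator application made explicit.

```python
def parenthesize(expr: str) -> str:
    """Return expr with the grouping implied by star > concatenation > union."""
    pos = 0

    def peek():
        return expr[pos] if pos < len(expr) else None

    def parse_union():  # lowest precedence
        nonlocal pos
        left = parse_concat()
        while peek() == "+":
            pos += 1
            right = parse_concat()
            left = f"({left}+{right})"
        return left

    def parse_concat():  # juxtaposition binds tighter than '+'
        left = parse_star()
        while peek() is not None and peek() not in "+)":
            right = parse_star()
            left = f"({left}{right})"
        return left

    def parse_star():  # highest precedence
        nonlocal pos
        node = parse_atom()
        while peek() == "*":
            pos += 1
            node = f"({node}*)"
        return node

    def parse_atom():
        nonlocal pos
        ch = peek()
        if ch == "(":
            pos += 1
            node = parse_union()
            if peek() != ")":
                raise ValueError("missing ')'")
            pos += 1
            return node
        if ch is None or not ch.isalnum():
            raise ValueError(f"unexpected input at position {pos}")
        pos += 1
        return ch

    result = parse_union()
    if pos != len(expr):
        raise ValueError(f"unexpected input at position {pos}")
    return result


print(parenthesize("a+b*"))   # (a+(b*))
print(parenthesize("ab*"))    # (a(b*))
print(parenthesize("a+bc"))   # (a+(bc))
```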
In [3]:
e = awalipy.RatExp("(a+bc)c*(ab)*")
e
Out[3]:
(a+bc)c*(ab)*

By default, the alphabet of a rational expression is the set of all letters appearing in it. However, the alphabet may be enlarged artificially as follows.

In [4]:
f = awalipy.RatExp("(a+b)(c*+a)*", alphabet="abcd")
f
Out[4]:
(a+b)(c*+a)*

Displaying a rational expression as a tree.

In [5]:
e.display()
[Figure: syntax tree of e. The root is a concatenation whose left child is the concatenation of a+bc and c*, and whose right child is (ab)*.]

Union

In [6]:
e+f
Out[6]:
(a+bc)c*(ab)*+(a+b)(c*+a)*
In [7]:
e+=e
e
Out[7]:
(a+bc)c*(ab)*+(a+bc)c*(ab)*

Concatenation

In [8]:
e^f
Out[8]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)((a+b)(c*+a)*)
In [9]:
e^="abc*"
e
Out[9]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*)

Star

In [10]:
e.star()
Out[10]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
In [11]:
e.star_here()
e
Out[11]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*

Star normal form and star height

In [12]:
e.star_height()
Out[12]:
2
In [13]:
e.star_normal_form()
Out[13]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
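
The value 2 returned by star_height() is consistent with the usual definition of star height as the maximal nesting depth of star operators. The following sketch computes it on a hand-built syntax tree; the tuple encoding and the helper names are assumptions made for illustration and are not part of awalipy.

```python
# Hypothetical tuple encoding of a syntax tree (not awalipy's representation).
def atom(letter):
    return ("atom", letter)

def union(left, right):
    return ("union", left, right)

def concat(left, right):
    return ("concat", left, right)

def star(child):
    return ("star", child)

def star_height(node):
    """Maximal number of nested stars in the tree."""
    kind = node[0]
    if kind == "atom":
        return 0
    if kind == "star":
        return 1 + star_height(node[1])
    return max(star_height(child) for child in node[1:])

a, b, c = atom("a"), atom("b"), atom("c")
base = concat(concat(union(a, concat(b, c)), star(c)), star(concat(a, b)))
# (((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
e_tree = star(concat(union(base, base), concat(concat(a, b), star(c))))
height = star_height(e_tree)
print(height)  # 2
```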

Expand

The method expand distributes concatenation over union as much as possible.

In [14]:
awalipy.RatExp("(a+bc)(d+e)(f+g)*").expand()
Out[14]:
ad(f+g)*+ae(f+g)*+bcd(f+g)*+bce(f+g)*
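
The distribution step can be sketched in plain Python. The encoding below is hypothetical and is not how Awali stores expressions; in particular, it assumes that a starred sub-expression is kept as an opaque leaf, which matches the output above but may not reflect what expand does in general.

```python
from itertools import product

# Hypothetical encoding: a leaf is ("leaf", text); starred sub-expressions
# are stored as opaque leaves such as ("leaf", "(f+g)*").
def leaf(text):
    return ("leaf", text)

def union(left, right):
    return ("union", left, right)

def concat(left, right):
    return ("concat", left, right)

def expand_terms(node):
    """Return the expanded expression as a list of terms (lists of factors)."""
    kind = node[0]
    if kind == "leaf":
        return [[node[1]]]
    if kind == "union":
        return expand_terms(node[1]) + expand_terms(node[2])
    # Concatenation: pair every term on the left with every term on the right.
    return [l + r for l, r in product(expand_terms(node[1]), expand_terms(node[2]))]

tree = concat(
    concat(union(leaf("a"), concat(leaf("b"), leaf("c"))),
           union(leaf("d"), leaf("e"))),
    leaf("(f+g)*"),
)
expanded = "+".join("".join(term) for term in expand_terms(tree))
print(expanded)  # ad(f+g)*+ae(f+g)*+bcd(f+g)*+bce(f+g)*
```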

Expressions to automata

By default, awali uses the derived term algorithm.

In [15]:
A = e.exp_to_aut()
A.display()
[Figure: derived-term automaton of e, with seven states s0 to s6; s0 is the initial state, and s0 and s6 are final.]

The states of A are indeed all the derived expressions of e. They may be displayed by setting the optional argument history to True.

In [16]:
A.display(horizontal=False,history=True)
[Figure: the same automaton, drawn vertically, with each state labelled by its derived expression; the initial state is labelled (((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*.]
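
To give an idea of how derived expressions become states, here is a minimal sketch of a derived-term construction for unweighted expressions, based on partial derivatives. It is an illustration under simplifying assumptions, not Awali's implementation: the tuple encoding and all function names are hypothetical, and the only simplification applied is removing a neutral epsilon factor from concatenations.

```python
ONE = ("one",)  # the empty word

def atom(letter):
    return ("atom", letter)

def union(left, right):
    return ("union", left, right)

def concat(left, right):
    # Drop neutral epsilon factors so that equal derived terms compare equal.
    if left == ONE:
        return right
    if right == ONE:
        return left
    return ("concat", left, right)

def star(child):
    return ("star", child)

def nullable(e):
    """True if the expression accepts the empty word."""
    kind = e[0]
    if kind in ("one", "star"):
        return True
    if kind == "atom":
        return False
    if kind == "union":
        return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])

def derive(e, letter):
    """Set of partial derivatives of e with respect to one letter."""
    kind = e[0]
    if kind == "one":
        return set()
    if kind == "atom":
        return {ONE} if e[1] == letter else set()
    if kind == "union":
        return derive(e[1], letter) | derive(e[2], letter)
    if kind == "star":
        return {concat(d, e) for d in derive(e[1], letter)}
    result = {concat(d, e[2]) for d in derive(e[1], letter)}
    if nullable(e[1]):
        result |= derive(e[2], letter)
    return result

def derived_term_automaton(e, alphabet):
    """Explore all derived terms reachable from e; states are numbered in discovery order."""
    states, todo, transitions = [e], [e], set()
    while todo:
        src = todo.pop()
        for letter in alphabet:
            for dst in derive(src, letter):
                if dst not in states:
                    states.append(dst)
                    todo.append(dst)
                transitions.add((states.index(src), letter, states.index(dst)))
    finals = {i for i, s in enumerate(states) if nullable(s)}
    return states, transitions, finals

# The expression 01*0*
expr = concat(concat(atom("0"), star(atom("1"))), star(atom("0")))
states, transitions, finals = derived_term_automaton(expr, "01")
print(len(states), sorted(transitions), sorted(finals))
```

On the expression 01*0*, this sketch finds three states, corresponding to the derived expressions 01*0*, 1*0* and 0*.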

For convenience, one may give an expression to the constructor of an automaton. In that case, the derived-term algorithm is used.

In [17]:
A = awalipy.Automaton(awalipy.RatExp("01*0*"))
A.display()
[Figure: automaton with three states: s0 (initial), s1 (final) and s2 (final); transitions s0 -0-> s1, s1 -1-> s1, s1 -0-> s2, s2 -0-> s2.]

Awali implements other algorithms for transforming expressions into automata, such as thompson or standard.

In [18]:
g = awalipy.RatExp("1*0")
g.thompson().display()
[Figure: Thompson automaton of 1*0, with six states s0 to s5; s2 is initial and s5 is final. The only letter transitions are s0 -1-> s1 and s4 -0-> s5; all other transitions are labelled \e.]
In [19]:
g.standard().display()
[Figure: standard automaton of 1*0, with three states: s0 (initial), s1 and s3 (final); transitions s0 -1-> s1, s0 -0-> s3, s1 -1-> s1, s1 -0-> s3.]

Weighted rational expression

Weights must be put between "<>", and weights take precedence over the other operators:

  • <-1>a* = (<-1>a)* != <-1>(a*)
  • <-1>ab = (<-1>a)b != <-1>(ab)
  • <-1>a+b = (<-1>a)+b != <-1>(a+b)

The weightset must be given as a second argument at creation.

In [20]:
h = awalipy.RatExp("(<1>a*+<-1>(b*))","Z")
h
Out[20]:
a*+<-1>(b*)
In [21]:
h.display()
[Figure: syntax tree of h. The root is a union whose left child is a star over a and whose right child is a star over b carrying the weight <-1>.]


For the sake of convenience, a weight alone (e.g. "<-1>") is considered a valid representation of the empty word with the given weight (i.e. "<-1>\e").

In [22]:
awalipy.RatExp("<-2>","Z")
Out[22]:
<-2>\e


Union, concatenation and star work in the same way for weighted rational expressions.

In [23]:
i = h ^ h + ("<-1>" ^ h).star()
i
Out[23]:
(a*+<-1>(b*))(a*+<-1>(b*)+<-1>(a*+<-1>(b*))*)
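
Note that the grouping in this output follows Python's operator precedence, not that of rational expressions: in Python, + binds tighter than ^, so h ^ h + x is read as h ^ (h + x). The toy class below (a hypothetical stand-in, unrelated to awalipy) makes the grouping visible.

```python
class Expr:
    """Toy operand that records how Python groups + and ^."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Expr(f"({self.text}+{other.text})")

    def __xor__(self, other):
        return Expr(f"({self.text}.{other.text})")

x, y, z = Expr("x"), Expr("y"), Expr("z")
grouped = (x ^ y + z).text
print(grouped)  # (x.(y+z))
```

Explicit parentheses, as in (h ^ h) + x, are needed to obtain the other grouping.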

Weighted expression to weighted automaton

For exp_to_aut or standard to work, the rational expression needs to be valid. An expression is valid if, in every sub-expression, the weight of $\epsilon$ is well defined. For instance, the expression (<2>\e)* is not valid (with weightset $(\mathbb{Z},+,\times)$).
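
The validity condition can be sketched as a recursive computation of the weight of the empty word. The code below is an illustration only: the tuple encoding and function names are hypothetical, and it assumes that, in $\mathbb{Z}$, the star of a weight is defined only when that weight is 0 (in which case it equals 1).

```python
ONE = ("one",)  # the empty word with weight 1

def atom(letter):
    return ("atom", letter)

def weight(k, child):
    return ("weight", k, child)

def union(left, right):
    return ("union", left, right)

def concat(left, right):
    return ("concat", left, right)

def star(child):
    return ("star", child)

def constant_term_z(e):
    """Weight of the empty word over Z, or None if it is undefined somewhere."""
    kind = e[0]
    if kind == "one":
        return 1
    if kind == "atom":
        return 0
    if kind == "weight":
        inner = constant_term_z(e[2])
        return None if inner is None else e[1] * inner
    if kind == "star":
        # Assumption: over Z, k* is defined only for k == 0.
        return 1 if constant_term_z(e[1]) == 0 else None
    left, right = constant_term_z(e[1]), constant_term_z(e[2])
    if left is None or right is None:
        return None
    return left + right if kind == "union" else left * right

def is_valid_z(e):
    return constant_term_z(e) is not None

invalid = star(weight(2, ONE))                       # (<2>\e)*
valid = star(union(weight(1, star(atom("a"))),
                   weight(-1, star(atom("b")))))     # (<1>(a*)+<-1>(b*))*
print(is_valid_z(invalid), is_valid_z(valid))  # False True
```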

In [24]:
i.is_valid()
Out[24]:
True
In [25]:
i.exp_to_aut().display()
[Figure: weighted automaton of i, with seven states s0 to s6; s0 is initial and s1 to s6 are final. Some transitions carry weights, such as <-1>a, <-1>b and <2>b.]

The method thompson() is not suitable for weighted expressions.

Indeed, let us consider the following valid expression g:

In [26]:
g = awalipy.RatExp("(<1>(a*)+<-1>(b*))*","Z")
g.is_valid()
Out[26]:
True
In [27]:
G = g.thompson()
G.display(horizontal=False)
[Figure: Thompson automaton of g, drawn vertically, with twelve states s0 to s11; s10 is initial and s11 is final. Most transitions are labelled \e, two of them with weight <-1>.]

In this case, thompson produces an automaton that is not valid.

In [28]:
G.is_valid()
Out[28]:
False

Other functions

The method constant_term gives the weight of the empty word (epsilon) in the expression.

In [29]:
j = awalipy.RatExp("(<1/4>(a*)+<1/4>(b*))*","Q")
j
Out[29]:
(<1/4>(a*)+<1/4>(b*))*
In [30]:
j.constant_term()
Out[30]:
'2'
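
The value 2 can be checked by hand. Writing $c(\cdot)$ for the constant term, and assuming that the star of a rational $k$ with $|k|<1$ is the sum of the geometric series $\sum_{n\ge 0} k^n$, one gets:

$$c(j) = \left(\tfrac{1}{4}\,c(a^*) + \tfrac{1}{4}\,c(b^*)\right)^* = \left(\tfrac{1}{4} + \tfrac{1}{4}\right)^* = \sum_{n\ge 0} \left(\tfrac{1}{2}\right)^n = \frac{1}{1-\tfrac{1}{2}} = 2$$

Here $c(a^*) = c(a)^* = 0^* = 1$, and likewise for $b$.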

In [31]:
j.get_weightset()
Out[31]:
Q