Rational expressions


In [1]:
# We disable autosave for technical reasons.
# Replace 0 by 120 in next line to restore default.
%autosave 0
Autosave disabled
In [2]:
import awalipy # If import fails, check that 
               # Python version used as Jupyter
               # kernel matches the one
               # Awalipy was compiled with.
[Warning] The python module awalipy relies on compilation executed "on-the-fly" depending on the context (type of weights, of labels, etc.). As a result, the very first call to a given function in a given context may take up to 10 seconds. 

Creating a RatExp

When parsing a rational expression, operator precedence is: star > concatenation > union. In other words,

  • a+(b*) = a+b* != (a+b)*
  • a(b*) = ab* != (ab)*
  • a+(bc) = a+bc != (a+b)c
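
The same precedence rules govern Python's re syntax, where union is written | instead of +. As a quick sanity check, and only as an analogy (this sketch does not use awalipy; the helper accepts is illustrative), the three inequalities above can be confirmed on sample words:

```python
import re

def accepts(pattern, word):
    """Return True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None

# a+b* is a+(b*): it accepts "bbb" but not "ab", unlike (a+b)*.
assert accepts(r"a|b*", "bbb") and not accepts(r"a|b*", "ab")
assert accepts(r"(a|b)*", "ab")

# ab* is a(b*): it accepts "abb" but not "abab", unlike (ab)*.
assert accepts(r"ab*", "abb") and not accepts(r"ab*", "abab")
assert accepts(r"(ab)*", "abab")

# a+bc is a+(bc): it accepts "a" but not "ac", unlike (a+b)c.
assert accepts(r"a|bc", "a") and not accepts(r"a|bc", "ac")
assert accepts(r"(a|b)c", "ac")
```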
In [3]:
e = awalipy.RatExp("(a+bc)c*(ab)*")
e
Out[3]:
(a+bc)c*(ab)*

By default, the alphabet of a rational expression is the set of all letters appearing in it. However, the alphabet may be enlarged artificially as follows.

In [4]:
f = awalipy.RatExp("(a+b)(c*+a)*", alphabet="abcd")
f
Out[4]:
(a+b)(c*+a)*

Displaying a rational expression as a tree.

In [5]:
e.display()
[Figure: syntax tree of e. The root is a product node whose first child is the product of (a+bc) and c*, and whose second child is the star of ab.]

Union

In [6]:
e+f
Out[6]:
(a+bc)c*(ab)*+(a+b)(c*+a)*
In [7]:
e+=e
e
Out[7]:
(a+bc)c*(ab)*+(a+bc)c*(ab)*

Concatenation

In [8]:
e^f
Out[8]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)((a+b)(c*+a)*)
In [9]:
e^="abc*"
e
Out[9]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*)

Star

In [10]:
e.star()
Out[10]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
In [11]:
e.star_here()
e
Out[11]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*

Star normal form and star height

In [12]:
e.star_height()
Out[12]:
2
In [13]:
e.star_normal_form()
Out[13]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
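
Star height is the maximal nesting depth of stars in an expression. The following is a minimal sketch of that computation on a hypothetical tuple-based syntax tree; the encoding (tags "atom", "sum", "prod", "star") is an assumption made for illustration and is not awalipy's internal representation.

```python
def star_height(node):
    """Maximal number of nested stars in a tuple-encoded expression."""
    kind, *children = node
    if kind == "atom":
        return 0
    depth = max(star_height(child) for child in children)
    return depth + 1 if kind == "star" else depth

a, b, c = ("atom", "a"), ("atom", "b"), ("atom", "c")
# (a+bc)c*(ab)*
left = ("prod", ("sum", a, ("prod", b, c)), ("star", c), ("star", ("prod", a, b)))
# ((left+left)(abc*))*, the shape of e at this point of the notebook
e_tree = ("star", ("prod", ("sum", left, left), ("prod", a, b, ("star", c))))

print(star_height(left))    # 1
print(star_height(e_tree))  # 2
```

The outer star wraps an expression that already contains c*, which is why the height is 2.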

Expand

The method expand distributes concatenation over union as much as possible.

In [14]:
awalipy.RatExp("(a+bc)(d+e)(f+g)*").expand()
Out[14]:
ad(f+g)*+ae(f+g)*+bcd(f+g)*+bce(f+g)*
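
The distribution can be sketched in a few lines of plain Python. This is an illustration under assumptions, not awalipy's implementation: the expression is encoded as hypothetical nested tuples, and the starred factor (f+g)* is kept as an opaque atom, which is consistent with the output above where it is left untouched.

```python
def expand(node):
    """Return a list of terms; each term is a list of atomic factors."""
    kind, *children = node
    if kind == "atom":
        return [[children[0]]]
    if kind == "sum":
        # A sum contributes the terms of each of its summands.
        return [term for child in children for term in expand(child)]
    # kind == "prod": combine every term of each factor, left to right.
    terms = [[]]
    for child in children:
        terms = [t + u for t in terms for u in expand(child)]
    return terms

def render(terms):
    """Write a list of terms back as a sum of products."""
    return "+".join("".join(term) for term in terms)

def atom(label):
    return ("atom", label)

expr = ("prod",
        ("sum", atom("a"), ("prod", atom("b"), atom("c"))),
        ("sum", atom("d"), atom("e")),
        atom("(f+g)*"))  # starred factor kept opaque
print(render(expand(expr)))  # ad(f+g)*+ae(f+g)*+bcd(f+g)*+bce(f+g)*
```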

Expressions to automata

By default, awali uses the derived term algorithm.

In [15]:
A = e.exp_to_aut()
A.display()
[Figure: derived-term automaton of e, with 7 states labelled $0, $1, $3, $6, $8, $9 and $16; $0 is initial, $0 and $16 are final.]

The states of A are indeed all the derived expressions of e. The history of each state may be displayed by setting the optional argument history to True.

In [16]:
A.display(horizontal=False,history=True)
[Figure: the same automaton drawn vertically, each state labelled with its history: {$0}, {$3, $10}, {$1, $4, $5, $7, $11, $12}, {$8, $15}, {$6, $13}, {$9}, {$16, $17}.]

For convenience, one may give an expression to the constructor of an automaton.

In [17]:
A = awalipy.Automaton(awalipy.RatExp("01*0*"))
A.display()
[Figure: automaton for 01*0*, with 3 states labelled $0, $1 and $4; $0 is initial, $1 and $4 are final.]
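
Reading the automaton displayed for 01*0* as a transition table makes it easy to check which words it accepts. The sketch below hard-codes its three states ($0, $1, $4) and four transitions in a plain dictionary; it is an illustration that does not use awalipy, and the names TRANSITIONS, INITIAL, FINAL and accepts are hypothetical.

```python
# Transitions of the automaton for 01*0*: (state, letter) -> next state.
TRANSITIONS = {
    ("$0", "0"): "$1",
    ("$1", "1"): "$1",
    ("$1", "0"): "$4",
    ("$4", "0"): "$4",
}
INITIAL = "$0"
FINAL = {"$1", "$4"}

def accepts(word):
    """Run the deterministic automaton on word, letter by letter."""
    state = INITIAL
    for letter in word:
        state = TRANSITIONS.get((state, letter))
        if state is None:  # no transition: the word is rejected
            return False
    return state in FINAL

print([w for w in ["0", "0110", "00", "001", "1", ""] if accepts(w)])
# ['0', '0110', '00']
```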

Awali implements multiple algorithms for transforming expressions to automata, such as thompson or standard.

In [18]:
g = awalipy.RatExp("1*0")
g.exp_to_aut("thompson").display()
[Figure: Thompson automaton of 1*0, with 6 states (s0, t0, s1, t1, s2, t2) and epsilon-transitions; s1 is initial and t2 is final.]
In [19]:
g.exp_to_aut("standard").display()
[Figure: standard automaton of 1*0, with 3 states labelled $0, $1 and $3; $0 is initial and $3 is final.]

Weighted rational expression

Weights must be put between "<" and ">", and weights take precedence over other operators:

  • <-1>a* = (<-1>a)* != <-1>(a*)
  • <-1>ab = (<-1>a)b != <-1>(ab)
  • <-1>a+b = (<-1>a)+b != <-1>(a+b)

The weightset must be given as a second argument at construction.

In [20]:
h = awalipy.RatExp("(<1>a*+<-1>(b*))","Z")
h
Out[20]:
a*+<-1>(b*)
In [21]:
h.display()
[Figure: syntax tree of h. The root is a sum node whose children are the star of a and the star of b with left weight -1.]


For the sake of convenience, a weight alone (e.g. "<-1>") is considered a valid representation of the word epsilon with the given weight (i.e. "<-1>\e").

In [22]:
awalipy.RatExp("<-2>","Z")
Out[22]:
<-2>\e


Union, concatenation and star work in the same way for weighted rational expressions.

In [23]:
i = h ^ h + ("<-1>" ^ h).star()
i
Out[23]:
(a*+<-1>(b*))(a*+<-1>(b*)+<-1>(a*+<-1>(b*))*)

Weighted expression to weighted automaton

For exp_to_aut or standard to work, the rational expression needs to be valid. An expression is valid if, in every sub-expression, the weight of $\epsilon$ is well defined. For instance, the expression (<2>\e)* is not valid (with weightset $(\mathbb{Z},+,\times)$).
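
To see why a star can make the weight of $\epsilon$ undefined, recall that, under the usual definition of the star of a series, the constant term $c(E^*)$ of a starred expression is a geometric sum over the constant term $c(E)$ (the notation $c(\cdot)$ is introduced here for illustration):

$$c(E^*) = \sum_{n \ge 0} c(E)^n, \qquad c\big((\langle 2\rangle\epsilon)^*\big) = \sum_{n \ge 0} 2^n = 1 + 2 + 4 + \cdots$$

The partial sums never stabilise, so this sum has no value in $\mathbb{Z}$. By contrast, when $c(E) = 0$ the sum reduces to its first term, $1$.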

In [24]:
i.is_valid()
Out[24]:
True
In [25]:
i.exp_to_aut().display()
[Figure: weighted derived-term automaton of i, with 7 states; $0 is initial and every other state is final.]

The thompson method is not suitable for weighted expressions.

Indeed, let us consider the following valid expression g:

In [26]:
g = awalipy.RatExp("(<1>(a*)+<-1>(b*))*","Z")
g.is_valid()
Out[26]:
True
In [27]:
G = g.exp_to_aut(method="thompson")
G.display(horizontal=False)
[Figure: Thompson automaton of g, drawn vertically, with 12 states (s0 to s5, t0 to t5) and epsilon-transitions, two of which carry weight -1; s5 is initial and t5 is final.]

In this case, thompson produces an automaton that is not valid.

In [28]:
G.is_valid()
Out[28]:
False

Other functions

The method constant_term gives the weight of epsilon.

In [29]:
j = awalipy.RatExp("<3>((<1/4>(a*)+<1/4>(b*))*)<2>","Q")
j
Out[29]:
<3>((<1/4>(a*)+<1/4>(b*))*)<2>
In [30]:
j.constant_term()
Out[30]:
'12'
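
The value 12 can be recomputed by hand: each starred atom has constant term 1, so the inner sum has constant term 1/4 + 1/4 = 1/2, its star is 1/(1 - 1/2) = 2, and the outer weights give 3 × 2 × 2 = 12. The sketch below performs the same computation with exact fractions on a hypothetical tuple encoding of the expression; it is not awalipy code, and the rule used for the star (the geometric series, when the constant term has absolute value below 1) is a mathematical assumption rather than a statement about the library.

```python
from fractions import Fraction

def constant_term(node):
    """Weight of the empty word in a tuple-encoded weighted expression over Q."""
    kind, *args = node
    if kind == "atom":
        return Fraction(0)
    if kind == "sum":
        return sum((constant_term(x) for x in args), Fraction(0))
    if kind == "prod":
        result = Fraction(1)
        for x in args:
            result *= constant_term(x)
        return result
    if kind == "lweight":            # ("lweight", k, E) stands for <k>E
        return Fraction(args[0]) * constant_term(args[1])
    if kind == "rweight":            # ("rweight", E, k) stands for E<k>
        return constant_term(args[0]) * Fraction(args[1])
    if kind == "star":
        c = constant_term(args[0])
        if abs(c) >= 1:
            raise ValueError("geometric series does not converge")
        return 1 / (1 - c)           # sum of c**n for n >= 0
    raise ValueError(f"unknown node kind: {kind}")

quarter = Fraction(1, 4)
inner = ("sum",
         ("lweight", quarter, ("star", ("atom", "a"))),
         ("lweight", quarter, ("star", ("atom", "b"))))
j_tree = ("lweight", 3, ("rweight", ("star", inner), 2))
print(constant_term(j_tree))  # 12
```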

In [31]:
j.get_weightset()
Out[31]:
Q

Decomposing RatExp's

Kind of RatExp's

The method exp.get_kind() gives the top-level kind of a RatExp exp. (For instance, below, the top-level operator of expression j is the Kleene star.)

In [32]:
j.get_kind()
Out[32]:
STAR~5

The method get_kind() returns an object of type RatExpKind, which is a sort of enum. The different instances are accessible as follows.

In [33]:
awalipy.RatExp.ZERO
Out[33]:
ZERO~0
In [34]:
awalipy.RatExp.ONE
Out[34]:
ONE~1
In [35]:
awalipy.RatExp.ATOM
Out[35]:
ATOM~2
In [36]:
awalipy.RatExp.SUM
Out[36]:
SUM~3
In [37]:
awalipy.RatExp.PROD
Out[37]:
PROD~4
In [38]:
awalipy.RatExp.STAR
Out[38]:
STAR~5

The list of all possible instances of RatExpKind can be accessed via RatExpKind.instances.

In [39]:
awalipy.RatExpKind.instances
Out[39]:
[ZERO~0, ONE~1, ATOM~2, SUM~3, PROD~4, STAR~5]

Objects of type RatExpKind can be converted to or built from their integer value or their string value as follows.

In [40]:
awalipy.RatExpKind.of["STAR"]
Out[40]:
STAR~5
In [41]:
str(awalipy.RatExp.PROD)
Out[41]:
'PROD'
In [42]:
awalipy.RatExpKind.of[2]
Out[42]:
ATOM~2
In [43]:
int(awalipy.RatExp.ATOM)
Out[43]:
2
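
For readers more familiar with the standard library, these conversions resemble those of Python's IntEnum. The class Kind below is a hypothetical stand-in written for comparison only; it is not how awalipy defines RatExpKind. In this analogy, the .name attribute plays the role of str() on a RatExpKind.

```python
from enum import IntEnum

class Kind(IntEnum):
    """Hypothetical stand-in mirroring the six kinds listed above."""
    ZERO = 0
    ONE = 1
    ATOM = 2
    SUM = 3
    PROD = 4
    STAR = 5

print(Kind["STAR"].value)      # 5     (lookup by name)
print(Kind(2).name)            # ATOM  (lookup by integer value)
print(int(Kind.ATOM))          # 2
print([k.name for k in Kind])  # ['ZERO', 'ONE', 'ATOM', 'SUM', 'PROD', 'STAR']
```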

Sub-expression

The method ratexp.children() gives the sub-expressions of a RatExp ratexp.

In [44]:
j.children()
Out[44]:
[<1/4>(a*)+<1/4>(b*)]

If the expression is a RatExp.ATOM, then children() gives a list holding its label as a string.

In [45]:
k = awalipy.RatExp('a')
k.get_kind()
Out[45]:
ATOM~2
In [46]:
k.children()
Out[46]:
['a']
In [47]:
l = awalipy.RatExp('<2>a','Z')
l.get_kind()
Out[47]:
ATOM~2
In [48]:
l.children()
Out[48]:
['a']

Weights

The method ratexp.weight() gives the left and right weights of a RatExp ratexp.

In [49]:
j.display()
[Figure: syntax tree of j. The root is a star node with left weight 3 and right weight 2; its child is a sum node over two star nodes, each with left weight 1/4, above the atoms a and b.]
In [50]:
j.weight()
Out[50]:
['3', '2']
In [51]:
k.display()
[Figure: syntax tree of k, a single node a.]
In [52]:
k.weight()
Out[52]:
['1', '1']
In [53]:
l.display()
[Figure: syntax tree of l, a single node <2>a.]
In [54]:
l.weight()
Out[54]:
['2', '1']

Unpack a RatExp

The method ratexp.content() gives the full top-level content of a RatExp ratexp.

In [55]:
j.display()
j
[Figure: syntax tree of j, identical to the one displayed in the Weights section.]
Out[55]:
<3>((<1/4>(a*)+<1/4>(b*))*)<2>
In [56]:
j.content()
Out[56]:
[STAR~5, '3', <1/4>(a*)+<1/4>(b*), '2']
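
Combining get_kind() and children() allows a recursive walk over an expression. The following sketch counts nodes by kind. It relies only on behaviour shown above: str() of a kind gives its name, and the children of an ATOM are label strings rather than sub-expressions. So that it runs without awalipy, it is exercised on StubExp, a hypothetical stand-in class that is not part of the library; that children() returns an empty list for ZERO and ONE is likewise an assumption.

```python
from collections import Counter

def count_kinds(exp):
    """Count the nodes of each kind in an expression tree."""
    kind = str(exp.get_kind())
    counts = Counter([kind])
    if kind != "ATOM":  # an ATOM's children are labels, not sub-expressions
        for child in exp.children():
            counts += count_kinds(child)
    return counts

class StubExp:
    """Hypothetical stand-in exposing get_kind() and children() like a RatExp."""
    def __init__(self, kind, *children):
        self._kind = kind
        self._children = list(children)

    def get_kind(self):
        return self._kind

    def children(self):
        return self._children

# Shape of j: a star over a sum of two starred atoms.
j_stub = StubExp("STAR",
                 StubExp("SUM",
                         StubExp("STAR", StubExp("ATOM", "a")),
                         StubExp("STAR", StubExp("ATOM", "b"))))
print(dict(count_kinds(j_stub)))  # {'STAR': 3, 'SUM': 1, 'ATOM': 2}
```

With awalipy available, count_kinds(j) should be applicable in the same way, provided the assumptions above hold.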