Rational expressions¶


In [1]:
# We disable autosave for technical reasons.
# Replace 0 by 120 in next line to restore default.
%autosave 0
Autosave disabled
In [2]:
import awalipy # If import fails, check that 
               # Python version used as Jupyter
               # kernel matches the one
               # Awalipy was compiled with.
[Warning] The python module awalipy relies on compilation executed "on-the-fly" depending on the context (type of weights, of labels, etc.). As a result, the very first call to a given function in a given context may take up to one minute. 

Creating a RatExp¶

When parsing a rational expression operator precedence is : star > concatenation > union . In other words,

  • a+(b*) = a+b* != (a+b)*
  • a(b*) = ab* != (ab)*
  • a+(bc) = a+bc != (a+b)c
In [3]:
e = awalipy.RatExp("(a+bc)c*(ab)*")
e
Out[3]:
(a+bc)c*(ab)*

By default, the alphabet of a rational expression is the set of all letters appearing in it. However the alphabet may be increased artifically as follows.

In [4]:
f = awalipy.RatExp("(a+b)(c*+a)*", alphabet="abcd")
f
Out[4]:
(a+b)(c*+a)*

Displaying a rational expression as a tree.

In [5]:
e.display()
%3 I2 2 [.] I2->2 3 [.] 2->3 . 14 [*] 2->14 .. 4 [+] 3->4 . 10 [*] 3->10 .. 5 a 4->5 . 6 [.] 4->6 .. 7 b 6->7 . 8 c 6->8 .. 9 c 10->9 11 [.] 12 a 11->12 . 13 b 11->13 .. 14->11

Union¶

In [6]:
e+f
Out[6]:
(a+bc)c*(ab)*+(a+b)(c*+a)*
In [7]:
e+=e
e
Out[7]:
(a+bc)c*(ab)*+(a+bc)c*(ab)*

Concatenation¶

In [8]:
e^f
Out[8]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)((a+b)(c*+a)*)
In [9]:
e^="abc*"
e
Out[9]:
((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*)

Star¶

In [10]:
e.star()
Out[10]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*
In [11]:
e.star_here()
e
Out[11]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*

Star normal form and star height¶

In [12]:
e.star_height()
Out[12]:
2
In [13]:
e.star_normal_form()
Out[13]:
(((a+bc)c*(ab)*+(a+bc)c*(ab)*)(abc*))*

Expand¶

The method expand distribute union and concatenation as much as possible.

In [14]:
awalipy.RatExp("(a+bc)(d+e)(f+g)*").expand()
Out[14]:
ad(f+g)*+ae(f+g)*+bcd(f+g)*+bce(f+g)*

Expressions to automata¶

By default, awali uses the derived term algorithm.

In [15]:
A = e.exp_to_aut()
A.display()
%3 I6 6 $0 I6->6 F6 F7 2 $3 5 $1 2->5 c 3 $8 4 $6 3->4 a 8 $9 3->8 a 4->3 b 5->4 a 5->5 c 5->8 a 6->F6 6->2 b 6->5 a 7 $16 7->F7 7->2 b 7->5 a 7->7 c 8->7 b

The states of A are indeed all the derived expressions of e. It may be displayed by setting to True the optional argument history.

In [16]:
A.display(horizontal=False)
%3 I6 6 $0 I6->6 F6 F7 2 $3 5 $1 2->5 c 3 $8 4 $6 3->4 a 8 $9 3->8 a 4->3 b 5->4 a 5->5 c 5->8 a 6->F6 6->2 b 6->5 a 7 $16 7->F7 7->2 b 7->5 a 7->7 c 8->7 b

For convenience, one may give an expression to the constructor of an automaton.

In [17]:
A = awalipy.Automaton(awalipy.RatExp("01*0*"))
A.display()
%3 I2 2 $0 I2->2 F3 F4 4 $1 2->4 0 3 $4 3->F3 3->3 0 4->F4 4->3 0 4->4 1

Awali implements multiple algorithms for transforming expressions to automata, such as thompson or standard

In [18]:
g = awalipy.RatExp("1*0")
g.exp_to_aut("thompson").display()
%3 I4 4 s1 I4->4 F7 2 s0 3 t0 2->3 1 3->2 \e 5 t1 3->5 \e 4->2 \e 4->5 \e 6 s2 5->6 \e 7 t2 6->7 0 7->F7
In [19]:
g.exp_to_aut("standard").display()
%3 I2 2 $0 I2->2 F5 3 $1 2->3 1 5 $3 2->5 0 3->3 1 3->5 0 5->F5
In [ ]:
 

Automata to expressions¶

In [20]:
SC = awalipy.load("slowcerny") # Lets us load some automaton
SC.display()
%3 I2 2 p I2->2 I3 3 q I3->3 I4 4 r I4->4 I5 5 s I5->5 I6 6 t I6->6 I7 7 u I7->7 F4 2->2 b 2->3 a 3->3 b 3->4 a 4->F4 4->2 a 4->5 b 5->4 b 5->6 a 6->6 b 6->7 a 7->4 b 7->5 a

Awalipy uses the classical state-elimination algorithm to compute an expression from an automaton.

To choose which state to eliminate first we use the min_inout_degree heuristics that is: at each step, the next eliminated state is the one with the minimal product of its in-degree by its out-degree.

In [21]:
SC_exp1 = SC.aut_to_exp()
SC_exp1
Out[21]:
(\e+(\e+b*a)b*a+(\e+b*a)b+(\e+(\e+b*a)a)(ab*aa)*(b+ab*ab))(ab*ab*a+b(ab*aa)*(b+ab*ab))*

To disable this heuristics, use the method "id_order": states are eliminated in increasing order of identifier.

In [22]:
SC_exp2 = SC.aut_to_exp(method="id_order")
SC_exp2
Out[22]:
(\e+b+b*(ab)+(\e+a+b*(aa))(ab*(aa))*(b+ab*(ab)))(b(ab*(aa))*(b+ab*(ab)))*+b*(a(b(ab*(aa))*(b+ab*(ab)))*)+(\e+(\e+b+b*(ab)+(\e+a+b*(aa))(ab*(aa))*(b+ab*(ab)))(b(ab*(aa))*(b+ab*(ab)))*a+b*(a(b(ab*(aa))*(b+ab*(ab)))*a))(b+ab*(a(b(ab*(aa))*(b+ab*(ab)))*a))*(ab*(a(b(ab*(aa))*(b+ab*(ab)))*))
In [ ]:
 
In [ ]:
 

Weighted rational expression¶

Weights must be put between "<>" and weights takes precedence over other operators:

  • <-1>a* = (<-1>a)* != <-1>(a*)
  • <-1>ab = (<-1>a)b != <-1>(ab)
  • <-1>a+b = (<-1>a)+b != <-1>(a+b)

The weighset must be given as a second argument at construction.

In [23]:
h = awalipy.RatExp("(<1>a*+<-1>(b*))","Z")
h
Out[23]:
a*+<-1>(b*)
In [24]:
h.display()
%3 I2 2 [+] I2->2 4 [*] 2->4 . 6 <-1>[*] 2->6 .. 3 a 4->3 5 b 6->5


For the sake of convenience, a weight alone (ie. "< -1>") is considered as a valid representation of the word epsilon with the given weight (ie. "< -1>\e").

In [25]:
awalipy.RatExp("<-2>","Z")
Out[25]:
<-2>\e


Union, concatenation and star works in the same way for weighted rational expressions.

In [26]:
i = h ^ h + ("<-1>" ^ h).star()
i
Out[26]:
(a*+<-1>(b*))(a*+<-1>(b*)+<-1>(a*+<-1>(b*))*)

Weighted expression to weighted automaton¶

For aut_to_exp or standard to work, the rational expression needs to be valid. An expression is valid if, in every sub-expression, the weight of $\epsilon$ is well defined. For instance the expression *(< 2 >\e)** is not valid (with weightset $(\mathbb{Z},+,\times$))

In [27]:
i.is_valid()
Out[27]:
True
In [28]:
i.exp_to_aut().display()
%3 I2 2 $0 I2->2 F3 F4 F5 F6 F7 F8 5 $1 2->5 a 6 $3 2->6 <-1>b 3 $6 3->F3 3->3 b 4 $4 4->F4 4->4 a 5->F5 5->3 <-1>b 5->4 a 5->5 a 7 $9 5->7 b 8 $7 5->8 <-1>a 6->F6 6->3 <-1>b 6->4 a 6->6 b 6->7 b 6->8 <-1>a 7->F7 7->7 <2>b 7->8 <-1>a 8->F8 8->7 b

The method thompson() is not suitable for weighted expressions.

Indeed, let us consider the following valid expression g:

In [29]:
g = awalipy.RatExp("(<1>(a*)+<-1>(b*))*","Z")
g.is_valid()
Out[29]:
True
In [30]:
G = g.exp_to_aut(method="thompson")
G.display(horizontal=False)
%3 I14 14 s5 I14->14 F15 2 s4 6 s1 2->6 \e 12 $10 2->12 \e 3 t4 3->2 \e 15 t5 3->15 \e 4 s0 5 t0 4->5 a 5->4 \e 7 t1 5->7 \e 6->4 \e 6->7 \e 7->3 \e 8 s2 9 t2 8->9 b 9->8 \e 11 t3 9->11 \e 10 s3 10->8 \e 10->11 \e 13 $11 11->13 \e 12->10 <-1>\e 13->3 \e 14->2 \e 14->15 \e 15->F15

In this case, thompson produces an automaton that is not valid.

In [31]:
G.is_valid()
Out[31]:
False

Other functions¶

The method constant_term gives the weight of epsilon

In [32]:
j = awalipy.RatExp("<3>((<1/4>(a*)+<1/4>(b*))*)<2>","Q")
j
Out[32]:
<3>((<1/4>(a*)+<1/4>(b*))*)<2>
In [33]:
j.constant_term()
Out[33]:
'12'

In [34]:
j.get_weightset()
Out[34]:
Q

Decomposing RatExp's¶

Kind of RatExp's¶

The method exp.get_kind() gives the top level kind of a RatExp exp. (For instance, below, the top level operator of expression j is the Kleene star.)

In [35]:
j.get_kind()
Out[35]:
STAR~5

The method get_kind() returns an object of RatExpKind, which is a sort of enum. The different instances are accessible as follows.

In [36]:
awalipy.RatExp.ZERO
Out[36]:
ZERO~0
In [37]:
awalipy.RatExp.ONE
Out[37]:
ONE~1
In [38]:
awalipy.RatExp.ATOM
Out[38]:
ATOM~2
In [39]:
awalipy.RatExp.SUM
Out[39]:
SUM~3
In [40]:
awalipy.RatExp.PROD
Out[40]:
PROD~4
In [41]:
awalipy.RatExp.STAR
Out[41]:
STAR~5

The list of all possible instances of RatExpKind can be accessed via RatExpKind.instance.

In [42]:
awalipy.RatExpKind.instances
Out[42]:
[ZERO~0, ONE~1, ATOM~2, SUM~3, PROD~4, STAR~5]

Object of type RatExpKind can be converted to or built from their integer value or their string value as follows.

In [43]:
awalipy.RatExpKind.of["STAR"]
Out[43]:
STAR~5
In [44]:
str(awalipy.RatExp.PROD)
Out[44]:
'PROD'
In [45]:
awalipy.RatExpKind.of[2]
Out[45]:
ATOM~2
In [46]:
int(awalipy.RatExp.ATOM)
Out[46]:
2

Sub-expression¶

The method ratexp.children() gives the sub-expressions of a RatExp ratexp.

In [47]:
j.children()
Out[47]:
[<1/4>(a*)+<1/4>(b*)]

In the case where the expression is a RatExp.ATOM, then children() gives the held label as string.

In [48]:
k = awalipy.RatExp('a')
k.get_kind()
Out[48]:
ATOM~2
In [49]:
k.children()
Out[49]:
['a']
In [50]:
l = awalipy.RatExp('<2>a','Z')
l.get_kind()
Out[50]:
ATOM~2
In [51]:
l.children()
Out[51]:
['a']

Weights¶

The method ratexp.weight() gives the left and right weights of a RatExp ratexp.

In [52]:
j.display()
%3 I7 7 <3>[*]<2> I7->7 2 [+] 4 <1/4>[*] 2->4 . 6 <1/4>[*] 2->6 .. 3 a 4->3 5 b 6->5 7->2
In [53]:
j.weight()
Out[53]:
['3', '2']
In [54]:
k.display()
%3 I2 2 a I2->2
In [55]:
k.weight()
Out[55]:
['true', 'true']
In [56]:
l.display()
%3 I2 2 <2>a I2->2
In [57]:
l.weight()
Out[57]:
['2', '1']

Unpack a RatExp¶

The method ratexp.content() gives the full top-level content of a RatExp exp.

In [58]:
j.display()
j
%3 I7 7 <3>[*]<2> I7->7 2 [+] 4 <1/4>[*] 2->4 . 6 <1/4>[*] 2->6 .. 3 a 4->3 5 b 6->5 7->2
Out[58]:
<3>((<1/4>(a*)+<1/4>(b*))*)<2>
In [59]:
j.content()
Out[59]:
[STAR~5, '3', <1/4>(a*)+<1/4>(b*), '2']
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]: