Content area

Abstract

In this paper, one-wheel and two-wheel concatenations of circular words and their languages are investigated. One-wheel concatenation is an operation that is commutative but not associative, while two-wheel concatenation is associative but not commutative. Moreover, two-wheel concatenation may produce languages that are not languages of circular words. We define two classes of regular languages of circular words based on finite automata: in a weakly accepted circular word language, at least one conjugate of each word is accepted by the automaton; in contrast, a strongly accepted language consists of words for which all conjugates are accepted. Weakly accepted circular word languages REGw, in fact, are regular languages that are the same as their cyclic permutations. Strongly accepted circular word languages, REGs, having words with the property that all their conjugates are also in the language, are also regular. We prove that REGw and REGs coincide. We also provide regular-like expressions for these languages. Closure properties of this class are also investigated.

Full text

Turn on search term navigation

1. Introduction

On the one hand, a relatively new area of computer science called combinatorics on words focuses on the structure of finite and infinite sequences of symbols or words. It is closely related to many different branches of mathematics, including algebra, number theory, game theory and others. The first book on the topic was produced by a number of authors using the pen name M. Lothaire [1], but the majority of researchers believe that A. Thue’s writings are among the earliest works to be published in this field [2,3]. Since then, numerous computer science applications have been found, including those for string matching, data compression, bioinformatics, etc., such as the De Bruijn sequences [4], just to name an important example.

On the other hand, circular words differ from linear words and open up some intriguing new perspectives. Circular words are often referred to as necklaces or cyclic words [5]. Similar sequences can be seen in nature; for instance, some bacteria’s DNA sequences resemble a necklace. Circular words, in a sense, can be seen as strongly periodic discrete functions [6,7]. From a mathematical point of view, in many places, the conjugate class of a word is considered as its cyclic or circular version [8] by highlighting its main difference from usual words, i.e., that there are no well-defined starting and ending points of the sequence. Thus, they can be used to model various objects and phenomena from formal, mathematical and computing points of view.

While the majority of research focuses on “normal” (linear) words, there are also some studies about circular words. The reader can refer to [9] for information on Weinbaum factorizations. D. Nowotka’s dissertation [10] addresses unbordered conjugates of words in Chapter 4. Moreover, there are several applications to integer sequences in [11,12]. Mathematicians are particularly interested in cyclic (or circular) codes [13], which are connected to the concept of circular words. This demonstrates that circular words play a significant role in various places in computer science and mathematics.

Finite automata and regular expressions are well known and well applicable tools in various parts of science and engineering. They are important from both theoretical (mathematical and computer science) and practical points of view. One of our aims is to connect these concepts to circular words to enable the use of these tools for circular words and their languages. Another important aim is to consider the concatenation operation and to find a variant that may fit better for circular words than the usual operation fitting well to traditional, linear words.

In Section 2, we recall the fundamental concepts and notions of formal languages, combinatorics on words (both linear and circular) and finite automata. In Section 3, we provide the definitions of our new concepts, the regular languages of weakly accepted and strongly accepted circular words, REGw and REGs. We prove the equality of these two classes, and their closure properties under various operations are also studied. Moreover, we give a regular-expression-style description of these languages. Conclusions and open questions are provided at the end of the paper.

2. Preliminaries

In this section, we recall some basic concepts and present our notations. All concepts that are not detailed here are from standard textbooks, e.g., [5,14,15]. In this paper, the set of positive integers is denoted by N, and N0 denotes the set of non-negative integers. A nonempty set of symbols (or letters) is called an alphabet and denoted by Σ. A (linear) word or string is a finite sequence of the elements of Σ (we may use the word linear when we contrast these words with the topic of this paper, i.e., cyclic or circular words, which are defined later in this section). We denote with Σ* the set of all words over Σ. The length of a word w is the number of symbols in w and is denoted by w. An empty word, denoted by λ, is a unique word with a length of zero. The concatenation of two words u and v is u·v (we can omit the dot between u and v). A word vΣ* is a factor of wΣ* if there exist two words x, y Σ*, such that w=xvy. However, if x=λ (resp. y=λ), then v becomes a prefix (resp. suffix) of w. The ith letter of wΣ* is denoted by ai for i1, , w. The reversal of a word w=a1a2an of length n is denoted by wR, and it is the word written backwards, that is, wR=anan1a1 (note that λR=λ). The powers of a word w are defined as w0=λ and wk+1=wwk for k0. The period of a word w=a1a2an is a positive p, such that ai=ai+p for all i=1,,wp. Two words x and y are conjugates if there exist words u, vΣ* such that x=uv and y=vu.

A language, denoted by L, over an alphabet Σ, is a finite or infinite set of words, i.e., LΣ*. Set theoretical operations and concatenation are applied for languages. We can also use the Kleene star and Kleene plus closure operations to define the languages L*=k=0Lk and L+=k=1Lk, where L0={λ}, L1=L, and Lk=Lk1·L. We also use L to denote the cardinality of L, which can also be infinite.

A circular word w, from a graphical point of view, is created by connecting the first letter of a linear word w to the last. Obviously, a circular word has neither a beginning nor an end, as shown in Figure 1. Thus, a circular word of w can be seen as, and will be represented by, the set of all conjugates of w or, equivalently, all the cyclic shifts (cyclic permutations) of w; we may call this representation the linearization of w.

(1)w=vvis a conjugate of w

As a special case, the empty word can also be seen as the empty circular word (to emphasize its usage in this manner, we may also write it as λ). Notice that even if a circular word is a set, the length of each of its words is the same; thus, the notation w is correct for the length of them. On the other hand, w as a set has cardinality w, which is at most w if w is not the empty word. For the empty word, λ=1 and λ=0. In [6,7], the periodic properties of circular words were investigated. A period is called a strong period if it is a period of each element of the conjugate class w of the word. On the other hand, weak periods are periods that belong to at least one of the conjugates. In this paper, we also use the weak and strong adjectives but for acceptance. Thus, let us briefly recall the concept of finite automata.

A finite automaton is a five-tuple A=(Q,Σ,q0,δ,F), where Q is the finite set of states, Σ is the input alphabet, q0Q is the initial state, FQ is the set of final (or accepting) states and δ :Q× Σλ 2Q is the transition function for non-deterministic finite automata (NFA) with allowed λ-transitions, while δ:Q×ΣQ is the transition function for deterministic finite automata (DFA). The accepted language of A is denoted by L(A). A language is called regular if it can be recognized by a non-deterministic or deterministic finite automaton. Regular languages can also be described by regular expressions with the three regular operations, union +, concatenation · and Kleene star *.

In this paper, we work with the linearization of languages of circular words, i.e., with languages with the property that wL implies that wL. If this property holds, then we say that L is a language of circular words.

3. Regular Languages of Circular Words

We start this section by the definitions of regular classes of circular word languages, and we define them based on finite automata. Thus, let us assume that a finite automaton A is given. Now, for a circular word w, if at least one of its elements is accepted, then we say w is weakly accepted by A. On the other hand, if all elements of w are accepted, then it is strongly accepted. Based on these, we define our language classes in a formal way.

Definition 1.

Let A be a finite automaton. A weakly accepts the language of circular words, where the language has every circular word for which at least one of its conjugates belongs to the language L(A). It is LwA=uzzLA.

Definition 2.

Let A be a finite automaton. A strongly accepts the language of circular words if the language has the circular words for which all their conjugates are in L(A). It is denoted by LsA=uzzLA.

Example 1.

Let A accept the language LA=ww ends with a 1 over the alphabet Σ=0,1. The word w=110 is not in L(A). Thus, wLsA. But wLwA since one of its conjugates is 101, which is in L(A). In this example, LsA=1+ (the language of words containing only 1’s), while LwA=(0+1)*1(0+1)* is the set of words having at least one 1.

Example 2.

Let automaton A accept LA=ww0*+1* over the alphabet Σ=0,1. Here, LsA=LwA=L(A).

Proposition 1.

For any given finite automaton A, the relation LsALwA holds.

Proof. 

For all words, uLsA implies uz, where zL(A). This means all the conjugates of u are in L(A). Therefore, uz where zL(A), and thus uLwA. □

Theorem 1.

Let A be a finite automaton. Then, LwA is a regular language.

Proof. 

As the class of regular languages is closed under cyclic shift (see e.g., [15,16,17]), one can create a finite automaton A such that LA=LwA by, e.g., the construction due to Maslov [18]. □

As the cyclic closure operation for languages plays a crucial role here, we define the following special notation:

Definition 3.

Let L denote the unary operation taking the cyclic closure of the language L in the argument, i.e., L=uvvuLwith somev,uΣ*.

Theorem 2.

For any regular language L, Ls is also a regular language.

Proof. 

Let a regular language L be given. Then, L denotes the closure of L under cyclic shift. Further, applying this closure operator for the complement L¯ of the language L, we obtain L¯, the language that contains the cyclic closures of all words that are not in L. In fact, L¯ contains each cyclic word with the property that at least one of its conjugates is not in L. Now, the complement of this language is L¯¯, the circular word language containing all circular words for which all of their conjugates are in L. However, on the one hand, by the definition of Ls, we have Ls=L¯¯. Finally, as the class of regular languages is closed under complement and also under cyclic shift, the resulted language is still regular. Thus, for each regular language L, Ls is also regular. □

In fact, we have a (De Morgan-style) duality relation between the languages Lw and Ls for each language L. The classes of regular languages of weakly accepted circular words and strongly accepted circular words are denoted by REGw and REGs, respectively. Thus, the previous results can be written as REGwREG and REGsREG. As it is obvious from the definitions, from our point of view, circular word languages are, in fact, languages that are closed under cyclic shift operation, i.e., the language L is the same as its cyclic closure.

4. Concatenations of Circular Words and Circular Word Languages

In this section we investigate various concatenation operations for circular words and for their languages. As circular words can be seen as sets (of words), we can use the usual concatenation of these sets; however, the result may not be a set (or a language) of circular words but a kind of sequence of them.

Definition 4.

Let x and y be two circular words; their two-wheel concatenation is the sequence of the circular words obtained by:

(2)xy=x·y=zz=uvwhereuxandvy.

Indeed, the output in this case is usually not a circular word. We refer to this operation as a “two-wheel concatenation” (see Figure 2), while the sequence of circular words, in general, may be obtained by some subsequent applications of the operation, i.e., with n applications, one can obtain a sequence of n+1 circular words. We shortly refer to such objects as sequences of circular words based on the picture of how one can imagine them: in Figure 2, a sequence of two circular words is shown, and such a sequence can be generalized to longer sequences, e.g., containing n circular words for any natural number n. Thus, actually, we may use this type of concatenation in general for sequences of circular words, when the sequences with one circular word are actually the circular words. Moreover, the empty word can be seen as a sequence of zero circular words or, alternatively, as a sequence of any number of empty circular words (i.e., by applying the operation on itself, the result is also the empty circular word, but it can also be seen as a longer sequence made by this word).

As we are interested in circular word languages, we also define a concatenation to obtain circular words.

Definition 5.

For any two circular words x and y, their one-wheel concatenation is:

(3) x y = w z z = u v   where   u x   and   v y

See also Figure 3. Note here that Figure 2 and Figure 3 show only particular cases. The result of the operation usually gives a set with all the possibilities, as we will also explain in Example 3 below. However, first, let us show a very important property of the one-wheel concatenation.

Proposition 2.

The one-wheel concatenation of two circular words is a circular word language.

Proof. 

As a circular word is the set of a word and all its conjugates, for any two circular words x and y, their one-wheel concatenation is the set of words formed by concatenating each conjugate of x with each conjugate of y, along with the conjugates of these concatenations. □

Example 3.

Let x=abc and y=01 be two linear words over the alphabet Σ={a,b,c,0,1}. Then, x·y={abc01,abc10,bca01,bca10,cab01,cab10} and

x y = { a b c 01 , b c 01 a , c 01 a b , 01 a b c , 1 a b c 0 , a b c 10 , b c 10 a , c 10 a b , 10 a b c , 0 a b c 1 , b c a 01 , c a 01 b , a 01 b c , 01 b c a , 1 b c a 0 , b c a 10 , c a 10 b , a 10 b c , 10 b c a , 0 b c a 1 , c a b 01 , a b 01 c , b 01 c a , 01 c a b , 1 c a b 0 ,   c a b 10 , a b 10 c , b 10 c a , 10 c a b , 0 c a b 1 }  

As the previous example shows, the one-wheel concatenation of two circular words may not be a circular word, but it is always a circular word language, as we have proven. We investigate later what are the necessary and sufficient conditions to have the same result for the two types of concatenations (in this case, two-wheel concatenation of the circular words also results in a circular word language).

Proposition 3.

The one-wheel concatenation of circular words is commutative but not associative.

Proof. 

In terms of commutativity, for any two circular words x and y,

xy=wzz=uv whereuxandvy=wz=zz=vu,uxandvy=yx

Consider the following example to prove the non-associativity: Let x=ab, y=1 and z=c be three words over the alphabet Σ={a,b,c,1}. We have:

(xy)z=ab1c,b1ca,1cab,cab1b1ac,1acb,acb1,cb1a1abc,abc1,bc1a,c1abba1c,a1cb,1cba,cba1a1bc,1bca,bca1,ca1b1bac,bac1,ac1b,c1ba andx(yz)=ab1c,b1ca,1cab,cab1abc1,bc1a,c1ab,1abcba1c,a1cb,1cba,cba1bac1,ac1b,c1ba,1bac

Therefore, xyzxyz. □

Proposition 4.

The unit element for the one-wheel concatenation of circular words is the empty circular word λ.

Proof. 

Obviously, for any circular word x, λx=x=xλ. □

One can also see that these two types of concatenations are not independent. Thus, let us investigate their relation first on circular words.

Lemma 1.

For x and y being any two circular words, x·yxy holds.

Proof. 

Let wx·y. Then, there exists two words u and v such that w=uv, where ux and vy. Eventually, w(uv) and, thus, wxy. □

Lemma 2.

For any two nonempty circular words x and y, the following holds:

(4) x · y x y x · y · x y

Proof. 

By Lemma 1, since x·yxy, the first part is obvious. On the other hand, by the definitions of concatenations, xy has all the words of x·y and their conjugates. As the number of conjugates is at most the length of the word, the second part follows. □

Theorem 3.

For any two circular words x and y, let n=max{x,1}, m=max{y,1} and p=max{xy,1}. Then, xynm, and xynmp.

Proof. 

The proof goes by cases.

  • If both circular words x and y are the empty word, then xy=xy=1, as each contains only the empty word.

  • If only one of the two circular words x and y is the empty word, then xy=xymax{x,y}.

  • If x,y{a}* for an aΣ, then xy=xy=1.

  • If xλ and yλ, then 1xyx·y=nm, and by Lemma 2, 1xyxy·xynmp. □

Theorem 4.

For any two circular words x and y, xy=xy if and only if one of the following conditions holds:

  1. x,y{ai}* for any aiΣ, or

  2. one of the circular words is λ, and the other can be any circular word.

Proof 

(Sufficient condition). For any aiΣ and p,qN0, if x={aip}andy={aiq}, then xy=xy={aip+q} (this equality can be λ for p=q=0). On the other hand, xλ=xλ=x and λy=λy=y.

(Necessary condition) Assume that xy=xy, i.e., xy is a circular word language as the one-wheel concatenation is commutative. Thus, the cases are as follows:

  • xy=xy=1 implies x,y{ai}* (x and y can be any or both λ ).

  • If xy=xy2, then there are at least two different letters ai,ajΣ occurring in xy. Let us assume, to the contrary, that none of x and y is λ. Then, there are the following three subcases:

    • . If there exists a letter aiΣ that is in x but not in y, then ai cannot be a suffix of any word in xy. Thus, xy is not a circular word language, which contradicts the hypothesis.

    • . If there exists a letter aiΣ that is in y but not in x, then ai cannot be a prefix of any word in xy, another contradiction.

    • . Finally, if there exist at least two different letters ai,ajΣ in both x and y, then let n be the largest number such that ain is a factor of an element of x, and similarly, let m be the largest number such that that aim is a factor of an element of y. In this case, both n and m are positive. Then, the longest prefix of the form ai* in xy has a length n, but in xy, it has a length n+m, contradicting the equality of these two sets. □

Based on the defined concatenation operations on circular words (and on their sequences), we may also define the extensions of these operations to (circular word) languages. Obviously, the two-wheel concatenation of circular word languages is the same as the usual concatenation of them; however, the result may not be a circular word language. Further, we have the following:

Definition 6.

Let L1 and L2 be two languages of circular words. Their one-wheel concatenation, denoted by L1L2, is the union of all one-wheel concatenations of circular words stemming from the two languages:

(5) L 1 L 2 = x L 1 y L 2 x y

For the sake of completeness, we also formally define the two-wheel concatenation of two languages of circular words.

Definition 7.

Let L1 and L2 be two languages of circular words. Their two-wheel concatenation, denoted by L1·L2, is the union of all two-wheel concatenations of circular words stemming from the two languages:

(6) L 1 · L 2 = x L 1 y L 2 x · y

Example 4.

Let La={001,011,101,111} be the language of words ending with 1 with length 3, and let Lb be the language of all words ending with 1, both over the alphabet Σ=0,1. Then, the circular word languages associated to these languages, i.e., for automata A and B accepting La and Lb, respectively, are LwA={001,010,100,011,110,101,111}, which is the language of words having at least one 1 with length 3, and LwB, which is the language of all words having at least one 1.

The word u=000011LwALwB, since one of the conjugates of u is u=001100, which is the concatenation of two words x=001 and y=100, where xLwA and yLwB. However, uLwA·LwB, since u cannot be written in form of pq where px=(000) and qy=(011) as xLwA.

Considering another example where the word is v=010110. Here, vLwALwB, since v is the concatenation of two words x=010 and y=110, where xLwA and yLwB. Also, vLwA·LwB, as it is a result of a two-wheel concatenation: vx·y, where x=(010) and y=(110).

On the other hand, the word z=000001 exists neither in LwA·LwB nor in LwALwB, since none of its conjugates can be written as a concatenation of two words x and y, where xLwA and yLwB simultaneously.

We can notice from these examples that there are some words in LwALwB that are not in LwA·LwB. Thus, the equality of these two concatenations does not hold in general.

Corollary 1.

The one-wheel concatenation of circular word languages is commutative, but the two-wheel concatenation is generally not.

Proof. 

For one-wheel concatenation, it comes directly from Proposition 3, i.e., the fact that this operation is commutative on circular words. On the other hand, let L1={0,1}* and L2={a,b}*. Indeed, these languages are circular word languages, and both are in REGw. Let xL1 and yL2, such that none of them is the empty word; then, xyL1L2, but xyL2L1. Therefore, the two-wheel concatenation of these circular word languages is not commutative. □

Proposition 5.

The one-wheel concatenation of languages of circular words is distributive over the union and intersection.

Proof. 

Let L1, L2 and L3 be languages of circular words.

L1L2L3=xyxL1L2 and yL3=xy(xL1 or xL2) and yL3=xy(xL1 and yL3) or (xL2 and yL3)=xyxL1 and yL3xyxL2 and yL3=(L1L3)(L2L3)

On the other hand,

L1L2L3=xyxL1L2 and yL3=xy(xL1 and xL2) and yL3=xy(xL1 and yL3) and (xL2 and yL3)=xyxL1 and yL3xyxL2 and yL3=(L1L3)(L2L3)

These two properties are known as the left-distributive property of one-wheel concatenation over the union and intersection, respectively. As the one-wheel concatenation is commutative (Corollary 1), the right-distributive properties follow immediately:

L1L2L3=(L1L2)(L1L3) andL1L2L3=(L1L2)(L1L3).  

Corollary 2.

Let L1 and L2 be circular word languages; then,  L1·L2L1L2.

Proof. 

This is a direct consequence of Lemma 1. □

Proposition 6.

Let L1 and L2 be two languages of circular words such that x·y=xy for all xL1 and yL2. Then, L1·L2=L1L2.

Proof. 

Let uL1L2; then, there exists a word z such that uz=xy, where xL1 and yL2. If x·y=xy, then uxy. Therefore, uL1·L2 and L1L2L1·L2. The other direction comes from Corollary 2. Thus, the equality holds. □

In fact, the previous result can be applied for unary languages (languages over a one-letter alphabet) or languages where one of the languages could contain only the empty word.

Proposition 7.

The two-wheel concatenation of circular word languages is associative, but the one-wheel concatenation is generally not.

Proof. 

Let L1, L2 and L3 be three circular word languages. On the one hand, trivially, we have (L1·L2)·L3=L1·(L2·L3). On the other hand, in the proof of Proposition 3, taking, e.g., the three circular words as three circular word languages, the non-associativity result is inherited to languages L1L2L3L1L2L3. □

We recall here that in [17], a third type of concatenation was defined for circular words and circular word languages; the shuffle ◊ of two circular words x and y, denoted by xy, is:

(7)xy=uxvyuv

based on the shuffle operator of words. This operation was shown to be both commutative and associative on circular words. However, this would lead to more restricted classes of languages than the classes we intend to work with here.

Proposition 8.

The shuffle of circular words is the same as their one-wheel concatenation if one of the words has a length of at most 1.

Proof. 

If one of the words is the empty circular word, then trivially both operations result in the other circular word. Now, on the other hand, if one of the words has length 1, then both operations allow it to be inserted anywhere in the other circular word (i.e., into anywhere in any of its conjugates). □

The previous proposition may allow us to build circular word languages by one-wheel concatenation in such a way that the shuffle operation can also be simulated. However, in this paper, we use another approach.

5. Closure Properties and o-Regular Expressions

While some of the closure properties seem to be trivial, in many cases, one needs some extra care. For this reason, we start the section with some examples.

Example 5.

Consider the regular languages L=01,100,101, L=10,11 and L=11,100,110 over the alphabet Σ=0,1.

The corresponding circular word languages that are weakly accepted by the automata recognizing L, L and L, respectively, are Lw=01,10,100,001,010,101,011,110, Lw=10,01,11 and Lw={11,100,010,001,110,101,011}. Notice that the intersection of L and L is the empty set. However, LwLw={01,10}. Similarly, LL=100, LL=11,  but LwLw=100,010,001,110,101,011 and LwLw={11}; thus, LwLw=LLw holds only at the latter case, and it generally does not. Furthermore, the word 001 is in the complement of L while it is not in the complement of Lw; thus, the complement of L and Lw differs, and moreover, the complement of Lw is not the cyclic closure of the complement of L.

Theorem 5.

REGw is closed under union, intersection, complementation and reversal.

Proof. 

First of all, it is obvious that for any languages of circular words, their union, intersection, complement and reversal are also circular word languages, as if they contain a word, they must contain all of its conjugates. The rest of the proof is constructive for each operation. Let A and B be two finite automata that weakly accept LwA and LwB, respectively. The union proof is trivial, as we can construct an NFA A in the same way as generally carried out for the union of two regular languages, such that A weakly accepts the regular language of circular words LwALwB.

For the intersection and complement, however, based on Theorem 1, we first need to construct A1=(QA,Σ,q0A,δA,FA) and B1=(QB,Σ,q0B,δB,FB), two completely defined deterministic finite automata that accept the languages LwA and LwB, respectively, in the traditional manner. Now, for the intersection, we can construct A=(QA×QB,Σ, (q0A,q0B),δA,FA×FB), which accepts the language LwALwB in the traditional manner, and so, in a weak manner as well, where, for aΣ and q=(qA,qB)QA×QB, δAq,a=δAqA,a,δBqB,a. Thus, A accepts exactly those words and, thus, all circular words that are accepted by both automata A1 and B1 while simulating the work of these two automata in a parallel manner.

For the complement of the language LwA, we again use A1; as LA1=LwA, their complement is also matching, i.e., LA1¯=LwA¯. The regular language of weakly accepted circular words not accepted by A1 is accepted (and also weakly accepted) by A=(QA,Σ,q0A,δA,QA\FA).

For the reversal of the regular language LwB=LB1, the same construction works as for the reversal of an arbitrary regular language. E.g., based on B1, let B=(QBs,Σ, s,δB,q0B), which accepts the language LwR(B); this is simply achieved by reversing all the transitions of B, making the start state q0B of B1 to be the only accepting state of B and creating a new initial state sQB from which the accepting states of B1 can be reached by λ -transitions. Notice that in this case, a similar construction starting from B also suffices. □

Theorem 6.

REGw is closed under one-wheel concatenation.

Proof. 

Based on Theorem 1, one may assume that there are two completely defined deterministic finite automata A=(QA,Σ,q0A,δA,FA) and B=(QB,Σ,q0B,δB,FB) that accept the languages LwA and LwB, respectively, in the traditional manner, i.e., LwA=LA and LwB=LB. Then, by the usual method, connecting the accepting states of the automaton A to the initial state of the automaton B by λ-transitions and keeping only the accepting states of B, an NFA A=(QAQB,Σ,q0A,δA,FB) can be obtained that recognizes the language LA·LB, where for aΣλ and qQAQB,

δAq,a=δAq,a,  if qQAq0B,  if qFA and a=λδBq,a,  if qQB

Now, A accepts the two-wheel concatenation of LwA and LwB. On the other hand, the cyclic closure of this language, i.e., the circular word language weakly accepted by A, is exactly the one-wheel concatenation of these two languages, i.e., LwA=LwALwB, completing the proof. □

Theorem 7.

REGw is not closed under two-wheel concatenation, Kleene star and Kleene plus.

Proof. 

Each of these proofs goes by a counterexample. Consider L={aaa,bbb} over the binary alphabet. Clearly, this language is in REGw. Now, on the other hand, all the languages L·L, L* and L+ contain the word aaabbb, but none of them contain its conjugate aabbba, contradicting the fact that the resulting language is a circular word language. This implies that none of L·L, L* and L+ is in REGw. □

Theorem 8.

REGw=REGs.

Proof. 

For any language LwA based on a finite automaton A, we can construct a finite automaton A (based on Theorem 1), such that LwA=LwA, and all the conjugates of any circular word of LwA are in L(A), i.e., LwA=L(A). Thus, LwA=LsA, proving that any language in REGw is also in REGs.

On the other hand, for any language LsA in REGs, as LsA is also regular by Theorem 2, we can also construct a finite automaton A such that LsA=L(A), where LsA=LwA=L(A). Thus, LsA=LwA, proving that any language in REGs is also in REGw. Therefore, the equality of REGw and REGs holds. □

Obviously, the closure properties proven for REGw in Theorems 5–7 apply for REGs as well. In the sequel, we still use this class of languages by REGw, but one may keep in his/her mind that, in fact, this is the same set of languages as REGs.

Finally, we turn to the last part of our results: describing circular word languages by regular-like expressions. If the shuffle of cyclic words is used for concatenation and we define its closure analogously to the Kleene star for traditional concatenation [17], we may start to build regular-like expressions based on the circular words λ and a for aΣ. With shuffle, union and the closure of shuffle, we can actually obtain languages that are closed not only under cyclic permutation but also under commutative closure. In this way, actually, the commutative closures of regular languages are obtained, and thus, they may not be regular or context-free. On the other hand, as these languages are closed under commutative closure, the order of the letters is not really playing any role; thus, instead of normal (linearization) of languages, we may work with them as multiset languages. The class of these languages is very restricted, as even some of our finite circular word languages are not included there; consider that, e.g., {a,aa,aabb,abba,bbaa,baab} is not in this class as, e.g., the word abab is missing if one considers only commutative languages.

Remember that for any language L of circular words, the operation L leaves the language unchanged, i.e., L=L. (Actually, this is an if and only if condition for our circular word languages.)

Now, we may build regular-like expressions; we call them o-regular expressions (circle-regular expressions), using the traditional regular operations and this cyclic closure operation.

Definition 8.

Let an alphabet Σ be given. The symbol λ and each letter of Σ are o -regular expressions. They represent the singleton languages, as in the case of regular expressions. Let g and h be two o -regular expressions; then, (g+h), (g·h), g* and g are also o -regular expressions. They represent the regular languages obtained by the used regular operation and by the cyclic closure, respectively.

Theorem 9.

If an o-regular expression has the cyclic closure as the main operation, then the represented language is a language of circular words; moreover, it is in REGw . On the other hand, if LREGw, then there is a o-regular expression with the above property that represents L.

Proof. 

By the definition of weakly accepted regular circular word languages, it is clear that for any regular language (accepted by an automaton, let us say, A ), its cyclic closure is exactly LwA. In the other direction, if a language LREGw, then based on the automaton A weakly accepting L, a regular expression can be given for L(A), and putting the cyclic closure to this expression as the main operation, an o -regular expression is obtained describing L. □

Further, based on the union normal form for regular expressions, see, e.g., [19,20], we have a more specific form of the o-regular expressions to describe languages of LREGw.

Theorem 10.

For any nonempty language of REGw there is a o-regular expression that, in its tree form, contains the operation in the root; then, the operation union + with some (maybe more than two) children, and only concatenation · and Kleene-star * in the lower levels, has the letters of the alphabet as leaves.

Proof. 

Let LREGw. By transforming the regular expression g to union normal form, where g describes the language L, the solution is provided. □

On the other hand, as all languages in REGw are regular, they can also be described by (ordinary) regular expressions. However, o-regular expressions may be used to write these descriptions in a more concise way, and thus, one can benefit from o-regular expressions from a descriptive point of view.

6. Discussion and Concluding Remarks

Circular words play important roles describing natural circular DNA structures, various mathematical and combinatorial phenomena, etc. In this paper, we have defined two classes of regular languages of circular words with very different approaches. The definitions are based on finite automata and particularly if any or all conjugates of a word are accepted. We have also explored various types of concatenations, namely, one-wheel and two-wheel concatenations of circular words. On the one hand, the regular languages of weakly and strongly accepted circular words were proven to be the same class; moreover, it contains only regular languages, i.e., the linearizations of these circular word languages are regular in the traditional sense. We have also proven some interesting closure properties of this class. Our work gives two different ways that finite automata can be used to accept circular word languages, thus giving a well-known tool for researchers working with circular words (or objects that can be naturally represented by circular words). Further, we have defined o-regular expressions and characterized this class with these expressions. From a complexity point of view, it is an interesting question to see if/when o-regular expressions can give a more compact description of languages in REGw than traditional regular expressions.

Author Contributions

Conceptualization, B.N. and B.A.; methodology, B.N. and B.A.; validation, B.A. and B.N.; formal analysis, B.A. and B.N.; investigation, B.A. and B.N.; writing—original draft preparation, B.A.; writing—review and editing, B.N.; visualization, B.A.; supervision, B.N. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Data are contained within the article.

Acknowledgments

A preliminary version of the paper was presented in the conference MCU 2024. The authors gratefully acknowledge the comments of the audience, especially the comments and ideas by Sergey Verlan and Jérôme Durand-Lose.

Conflicts of Interest

The authors declare no conflicts of interest.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures
View Image - Figure 1. The circular word [Forumla omitted. See PDF.] obtained from the linear word [Forumla omitted. See PDF.].

Figure 1. The circular word [Forumla omitted. See PDF.] obtained from the linear word [Forumla omitted. See PDF.].

View Image - Figure 2. A two-wheel concatenation of two circular words obtaining a sequence of circular words.

Figure 2. A two-wheel concatenation of two circular words obtaining a sequence of circular words.

View Image - Figure 3. A one-wheel concatenation of two circular words.

Figure 3. A one-wheel concatenation of two circular words.

References

1. Lothaire, M. Combinatorics on Words; 2nd ed. Cambridge University Press: Cambridge, UK, 1997.

2. Thue, A. Über unendliche Zeichenreihen. Kra. Vidensk. Selsk. Skr. I. Mat. Nat. Kl.; 1906; 7, pp. 1-22.

3. Thue, A. Über Die Gegenseitige Lage Gleicher Teile Gewisser Zeichenreihen; Kra. Vidensk. Christiana Videnskabs-Selskabs Skrifter, I. Math.-Naturv. Klasse Jacob Dybwad: Oslo, Norway, 1912; Volume 46, pp. 1-67.

4. Sawada, J.; Williams, A.; Wong, D. Generalizing the Classic Greedy and Necklace Constructions of de Bruijn Sequences and Universal Cycles. Electron. J. Comb.; 2016; 23, pp. 1-24. [DOI: https://dx.doi.org/10.37236/5517]

5. Smyth, B. Computing Patterns in Strings; Pearson Addison-Wesley: Boston, MA, USA, 2003.

6. Hegedüs, L.; Nagy, B. On Periodic Properties of Circular Words. Discret. Math.; 2016; 339, pp. 1189-1197. [DOI: https://dx.doi.org/10.1016/j.disc.2015.10.043]

7. Hegedüs, L.; Nagy, B. Periodicity of Circular Words. WORDS 2013; TUCS Lecture Notes; TUCS: Turku, Finland, 2013; Volume 20, pp. 45-56.

8. Hegedüs, L.; Nagy, B. Representations of Circular Words. Proceedings of the AFL 2014: Automata and Formal Languages; Szeged, Hungary, 27–29 May 2014; Volume 151, pp. 261-270.

9. Diekert, V.; Harju, T.; Nowotka, D. Factorizations of Cyclic Words. Proceedings of the Workshop on Words and Automata at CSR; Saint Petersburg, Russia, 7 June 2006.

10. Nowotka, D. Periodicity and Unbordered Factors of Words; TUCS Dissertations; TUCS: Turku, Finland, 2004; 50.

11. Rittaud, B.; Vivier, L. Circular Words and Applications. Proceedings of the 8th International Conference WORDS 2011; Prague, Czech Republic, 12–16 September 2011; Volume 63, pp. 31-36.

12. Rittaud, B.; Vivier, L. Circular Words and Three Applications: Factors of the Fibonacci Word, F-Adic Numbers, and the Sequence 1, 5, 16, 45, 121, 320, …. Funct. Approx. Comment. Math.; 2012; 47, pp. 207-231. [DOI: https://dx.doi.org/10.7169/facm/2012.47.2.6]

13. Van Lint, J.H. Introduction to Coding Theory. Graduate Texts in Mathematics; Springer: Berlin/Heidelberg, Germany, 1998; Volume 86.

14. Erickson, J. Algorithms; 1st ed. University of Illinois at Urbana-Champaign: Champaign, IL, USA, 2019.

15. Hopcroft, J.E.; Ullman, J.D. Introduction to Automata Theory, Languages and Computation; Addison-Wesley: Boston, MA, USA, 1979.

16. Jirásková, G.; Okhotin, A. State Complexity of Cyclic Shift. RAIRO Theor. Inform. Appl.; 2008; 42, pp. 335-360. [DOI: https://dx.doi.org/10.1051/ita:2007038]

17. Kudlek, M. On Languages of Cyclic Words. Aspects of Molecular Computing; Jonoska, N.; Păun, G.; Rozenberg, G. Lecture Notes in Computer Science Springer: Berlin/Heidelberg, Germany, 2003; Volume 2950, pp. 278-288.

18. Maslov, A.N. Estimates of the number of states of finite automata. Sov. Math. Dokl.; 1970; 11, pp. 1373-1375.

19. Nagy, B. A Normal Form for Regular Expressions. Proceedings of the Eighth International Conference on Developments in Language Theory, CDMTCS Research Reports CDMTCS-252, Supplemental Papers for DLT’04; Auckland, New Zealand, 13–17 December 2004.

20. Nagy, B. Union-Freeness, Deterministic Union-Freeness and Union-Complexity (invited paper). Proceedings of the DCFS 2019: Descriptional Complexity of Formal Systems; Hospodár, M.; Jirásková, G.; Konstantinidis, S. Lecture Notes in Computer Science Springer: Cham, Switzerland, 2019; Volume 11612, pp. 46-56.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.