1. Introduction
The process by which one-dimensional sequences of nucleotides or amino-acids acquire their complex three-dimensional geometries, which are key to their function, is a major puzzle of biology today. Understanding molecular folding will not only shed light on the origin and functions of the molecules existing in nature, it will also enable us to control the process more finely, and engineer artificial molecules with a wide range of uses, from performing missing functions inside living organisms, to producing precisely targeted drugs.
Biomolecular nano-engineering includes DNA self-assembly, which gave rise to an impressive number of successful experimental realizations, from arbitrary 2D shapes [1] to molecule cyclic machines [2], or counters [3]. First pioneered by Seeman [4], DNA nanotechnologies only really started to take off once a computer science model was devised by Winfree [5] to program molecular self assembly in a computer science way.
Since then, many models have been designed to refine different features of experiments: hierarchical self-assembly [6,7], modeling the absence of a seed, kinetic tile assembly [5,8], 3D and probabilistic tile assembly [9], among others.
However, the potential applications of this type of DNA technology inside cells are limited by the high thermal stability of the DNA duplex, where consequently DNA nanostructures are typically formed through the process of annealing: Namely, by heating the molecules up to high temperatures in a precisely controlled environment and then cooling them down at a precisely controlled rate. This assembly paradigm allows researchers to design DNA structures with amazing precision by taking advantage of computational techniques to optimize the thermodynamics of strand-strand interactions, see for instance [10].
However at the same time the method requires precise control over temperature and other variables that are not always possible to control in every environment, such as for example within living cells. Other biopolymers such as RNA and proteins do not share this limitation, as nature uses both mediums to produce exquisite structures on a continuous basis. Researchers have successfully assembled functional nanoparticles and scaffolds out of these alternatives [11,12,13,14]. However, their assembly process is harder for us to program: while the shape that folds depends on both the sequence and the environment, the sequence is read linearly and expressed regardless of the environment. This contrasts with more classical programming models such as Turing machines, or even tile assembly, which are able to jump to different parts of the program depending on the input. Another important difficulty is that even predicting the final folded shape of a biopolymer given the sequence is still the center of active research, especially for proteins [15,16,17,18,19].
In particular, a large body of computer science literature is focused on energy optimization, one of the main drivers of folding. For example, in different variants of the hydrophobic-hydrophilic (HP) model [20], it has been shown that the problem of predicting the most likely geometry (or configuration) of a sequence is NP-complete [21,22,23,24,25,26], both in two and three dimensions, and in different variants of the model.
The kinetics of folding, which is the step-by-step dynamics of the reaction, has been demonstrated by biochemists to play a major role in the final shape of molecules [27], and even play a prevalent role at the heart of the mechanism of RNA switches in biology [28]. In recent experimental results, researchers have even been able to control this mechanism to engineer various things out of RNA represented by the RNA origami architecture for single-stranded rectangular tiles [29], and followed by more complex structural motifs [30] and RNA-origami-based FRET system [31].
This paper introduces a new model of computation and molecular folding inspired by RNA folding, intended to capture the kinetics of folding and model the experiments in [29]. In particular, it focuses on the co-transcriptional nature of RNA folding, whereby molecules fold concurrently while being transcribed (see Figure 1): in computer science terms, the folding process is a local energy optimization, or otherwise put, a greedy algorithm. A key limitation of this model is that the strand folds irreversibly once beads are locked into place, a simplification that makes working with the model tractable, but that also eliminates an important aspect of natural RNA folding—the possibility for structural rearrangement.
The first experimental results have used a standard benchmark: making simple shapes, such as squares (as shown for instance on Figure 2). With the new model introduced in this paper, our goal is twofold: first, explore the engineering possibilities of this mechanism, in order to make arbitrary shapes and structures. Then, the other aim of our study is to understand the complexity of sequence operations, to understand the computational processes which led to the creation of complex molecular networks.
1.1. Main Contributions
In our model, called Oritatami, we consider a sequence of “beads”, which are abstract basic components, standing for nucleotides or even sequences of nucleotides (also called domains). In oritatami, only the latest produced beads of the molecules are allowed to move in order to adopt a more favorable configuration. The folding is driven by the respective attraction between the beads.
Our main construction is a binary counter. Counters are an essential component of many sophisticated constructions in biological computing, in particular in tile assembly [32,33]. Counters are also an important benchmark in experiments [3].
Theorem 1.
There is a fixed periodic sequence 0,1,…,59,0,1,… of period 60 whose rule is given in Figure 3, which, when started from a seed encoding an integer x in binary with at most 2k+1 bits for some k, folds into a structure encoding x+1 , x+2 , …, 22k+1−1 , on the successive rows of the triangular grid.
We prove the correctness of this construction by designing an abstract module system to handle the complexity of the base mechanism of the model, which is about as low-level as assembly code in more standard computing models.
We then show a generic construction method in this model, which we applied to automate parts of the design of the counter. Moreover, this result helps understanding the computational complexity of sequence programming. Precisely, we prove two results in this direction:
Theorem 2.
Designing a single sequence that folds into different target shapes in a set of surrounding environments, is NP-complete in the number of environments.
More surprisingly, it turns out that there is an algorithm to solve this problem in time linear in the length of the sequence. This algorithm is also practical, as we were able to use it to find sequences for our main construction:
Theorem 3.
The sequence design problem is FPT with respect to the length ℓ of the sequence: there is an algorithm linear in ℓ (but exponential in the number of environments) to design a single sequence that folds into the target shapes in the given environments.
As Winfree’s thesis [5] on the very abstract tile assembly model did inspire many successful experimental works (see for instance [3,10,34,35,36]), we hope that the numerous structural strategies employed in Oritatami enabling computation will inspire new architectures for computing in RNA that take advantage of the rapid kinetic-folding of RNA.
1.2. Related Work
The oritatami system has received considerable attention since its proposal in the conference abstract of this paper [37]. As mentioned above, counters serve as a key component in tile self-assembly, especially for assembling shapes. Masuda et al. implemented a small oritatami system that serves as a 4-state automaton with output and integrated it with our binary counter (Theorem 1) into a larger-scale oritatami system towards the self-assembly of an arbitrary finite portion of Heighway dragon fractal [38]. Note that our binary counter works under inertial dynamics and they proposed a modified implementation working under oblivious dynamics with a different ratio of transcription speed to folding (technically speaking, ours works with delay 4 while theirs does with delay 3; delay is formally defined in Section 2). This illustrates the flexibility of oritatami designs. The transcript of their system is periodic as our binary counter; the period is linearly proportional to the size of the target portion. Demaine et al. [39] and Han and Kim [40] independently conducted more comprehensive studies of the shape self-assembly by oritatami systems recently. In particular, Demaine et al. proved that there is a universal set of 114 bead types with a rule set with which an arbitrary finite shape can be folded by an oritatami system, as long as the shape is scaled-up by a small factor.
The FPT algorithm (Theorem 3) has brought about useful modules not only for the binary counter but also for subsequent oritatami systems including the above-mentioned Heighway dragon fractal assembler, Satisfiability (SAT) tautology checker [41], and the time-efficient universal Turing machine simulation in [42]. The construction in [42] is the most intricate oritatami system implemented so far, developing much further the paradigms for co-transcriptional molecular programming introduced here.
Other variants of the rule design problem have been investigated by [43,44]. An alternative heuristic to the FPT algorithm to optimize the rule set has been proposed by Han and Kim [45].
2. Model and Main Results
Given two words a,b∈B* , we denote by ab their concatenation.
2.1. Model
2.1.1. Oritatami System
Oritatami is about the folding of finite sequences of beads, each from a finite set B of bead types, using an attraction rule
A configuration c of a sequence w∈B* is a self-avoiding path of length n=|w| labelled by w in T , i.e., a path whose vertices c1,…,cn are pairwise distinct and labelled by the letters of w. We denote by bt(ci)=wi the bead-type of the i-th bead of c. A partial configuration of a sequence w is a configuration of a prefix of w. For any partial configuration c of some sequence w, an elongation of c by k beads is a partial configuration of w of length |c|+k . We denote by Cw the set of all partial configurations of w (the index w will be omitted when the context is clear). We denote by c▹k the set of all elongations by k beads of a partial configuration c of a sequence w and by c◃k the singleton containing the prefix of length |c|−k of c.
An oritatami system O=(p,
,δ) is composed of (1) a (possibly infinite) transcriptp, which is a sequence of beads, of a type chosen from a finite set B, (2) an attraction rule, which is a symmetric relation
Given an attraction rule
and a configuration c of a sequence w, we say that there is a bond between two adjacent positions ci and cj of c in T if wi
wj . The number of bonds in a configuration c of w, written E(c) , is the negation of the number of bonds within c: formally,
E(c) = −|{(i,j) : ci ∼ cj, j > i + 1, and wi
2.1.2. Oritatami Dynamics
A dynamics for a sequence w is a function Dw:2Cw →2Cw such that for all subset S of partial configurations of length ℓ of w, D(S) is a subset of the elongations by one bead of the partial configurations in S (thus, partial configurations of length ℓ+1 ).
Given an oritatami system O=(p,
We explore greedy folding dynamics where only the most recently transcribed beads can move, all other beads remain in place. These still unsettled beads are said nascent and their number is controlled by the integer parameter δ (in most of this article, δ⩽4 ). We consider two different dynamics to model the “greedy” nature of the process:
The inertial dynamics
was also called hasty dynamics in a preliminary version of this work [37]. It does not question previous choices but chooses the energy-minimal positions for the δ nascent beads among all elongations of the previously adopted partial configurations. It lets the δ−1 already placed nascent beads where they are and abandons the extension of a configuration if no extension with the newly transcribed bead allows to reach a lowest energy configuration available for the δ nascent beads.
Formally, given a set of currently favored nascent configurations, I elongates each of them by one bead, and keeps the elongated configurations that have minimum energy among those who share the same prefix of length |σ|+t :
I(S)=⋃γ∈S◃(δ−1)argminc∈γ▹γ⋂S▹1E(c)
The oblivious dynamics
is oblivious in the sense that it consists of always choosing the best available positions for the nascent δ beads regardless of the previously preferred choices, as opposed to I . Formally, O takes a set of currently favored nascent configurations, removes the last δ−1 positions from all of them, and selects the minimal energy configurations among all of their elongations by δ beads. Precisely:
O(S)=⋃γ∈S◃(δ−1)argminc∈γ▹γE(c)
Oritatami systems seem less governable under the oblivious dynamics than under the inertial dynamics, as a larger number of configurations are to be considered for energy minimization under the oblivious dynamics and also as previously discarded configurations may be reconsidered later on.
With the exception of Section 5, in this article, we choose the inertial dynamics. Our binary counter is described for the inertial dynamics in Section 3. Many subsequent articles [38,39,40,41,42,43,44,45,46] chose the oblivious dynamics. Masuda et al. [38] adapted our binary counter to the oblivious dynamics and used it as a component of their oritatami system folding arbitrary finite portion of Heighway dragon fractal.
Determinism, Halt and Resulting Folding. An oritatami system O=(p,
We say that O stops at time t with seed σ and dynamics D , if Dspt(σ▹(δ−1))=∅ and Dspz({σ})≠∅ for z<t . If O stops at time t, the set Dspt(σ▹(δ−1)) is the result of the folding process; if it consists of a single configuration c, we say that c is the resulting folding and that the oritatami system O folds deterministically into configuration c. If the transcript p is finite, the folding process will stop at time t=|p|−δ+1 or before in case of a geometric obstruction (no more elongation is possible because the configuration gets trapped in a closed area). If the transcript p is infinite, the folding process may only stop because of a geometric obstruction.
Notation for configurations. We will use a↖NWb , a↗NEb , aE→b , aSE↘b , aSW↙b , aW←b to denote a configuration consisting of two beads with types a and b where b is placed respectively at the NW, NE, E, SE, SW or W of a. As an example, the configuration in Figure 4 is described as 30↖NW31↖NW32E→33SE↘34↗NE35E→36SE↘37W←38SW↙39E→40E→41 .
3. Folding a Binary Counter 3.1. General Idea of The Construction
Our construction works with δ=4 . The counter is implemented by folding the periodic sequence of bead types 0,1,…,58,59,0,1,… with period 60. Our construction proceeds in zig-zags as the classic implementation of a counter with a sweeping Turing machine whose head goes back and forth between the two ends of the coding part of the tape. Each pass is 3-rows thick and folds each part of the molecule into a parallelogram of size 4×3 or 6×3 except for the last and the first parts of each pass which are folded into parallelograms of size 3×6 to accomplish the U-turn downwards and start the next pass. The zig pass, folding three rows at a time from right to left, computes the carry propagation in the current value of the counter. The zag pass, folding three rows at a time from left to right, writes down the bits of the newly incremented value, and gets the folding to resume at the right-hand side of the configuration.
The molecule is semantically divided into 4 parts, called modules:
1. Module A (beads 0–11, in blue in all figures): the First Half-Adder
2. Module B (beads 12–29, in red in all figures): the Left-Turn module
3. Module C (beads 30–41, in blue in all figures): the Second Half-Adder
4. Module D (beads 42–59, in red in all figures): the Right-Turn module
Encoding. The current value of the counter is encoded in standard binary with the most significant bit to the left. Each bit is encoded into a specific folding of the modules A and C of the molecule in the rows corresponding to a zag pass: namely folding A0 and C0 for 0, and A1 and C1 for 1. During the zig pass, the value of the carry is encoded by the position of the molecule when it starts to fold Module A or C: carry=0 if Module A or C starts to fold in the top row; carry=1 if Module A or C starts to fold from the bottom row.
In the zig pass (←), modules A and C “read” from the row above the value encoded into the folding in the row above during the previous zag-phase (or in the seed configuration for the first zig pass), and fold into a shape (called a brick, see Section 4.1) A00, A10, A01, A11 or C00, C10, C01, C11 accordingly where A xc is the brick corresponding to the case where x is the bit read in the row above and c is the carry. In the zig pass, modules B and D just propagate the carry value (0 or 1, i.e., start from top or bottom row) output by the preceding module A or C to the next.
When the zig pass reaches the leftmost part of the row on top, the beads there forces the module B to adopt the Left-turn shape which reverses the folding direction and starts the next zag pass.
In the zag pass (→), modules A and C “read” the bricks above A xc or C xc and folds into the bricks that encodes the corresponding bits, namely Ay or Cy where y=(x+c)mod2 . There are no carry propagation and all the modules B and D fold into the same brick B2 or D2 in this pass.
When the molecule reaches the rightmost part of the row on top of it, the special beads there force the module D to fold into the Right-turn brick which reverses the folding direction and starts the next zig pass.
3.2. The First Two Passes of The Folding
Let’s run the first passes of the 3 bits counter to get acquainted with the process.
The seed configuration is shown in Figure 5. The seed configuration for the (2k+1) -bit counter is composed of 4k+3 parts:
1. The first part 20SE↘21SE↘26SE↘27E→28E→29 , made of beads from Module B, encodes a sequence that will trigger the carriage return at the end of the next zig pass.
2. The central part consists in k repetitions of the same sequence of 4 patterns, plus an extra repetition of the first pattern at the end (the central part consists thus in 4k+1 parts in total):
-
30E→39E→40E→41 encoding a bit 0 using beads from Module C,
-
followed by 42E→47E→48E→53E→54E→59 encoding nothing but padding using beads from Module B,
-
followed by 0E→9E→10E→11 encoding a bit 0 using beads from Module A,
-
followed by 12E→17E→18E→23E→24E→29 encoding nothing but padding using beads from Module D.
Note the symmetry by a shift of 30 of the beads values in the patterns involving Modules A and C, and Modules B and D.
3. The last part 42E→48E→50SW↙51W←52SE↘53E→54↗NE55SE↘56SE↘57W←58W←59 , made of beads from Module D, encodes a sequence that will first initiate the next zig pass and later trigger the carriage return at the end of the next zag pass.
Note that the seed configuration ends at the bottom row of the upcoming zig pass, which encodes thus that initially the carry is 1.
The first zig pass (←). Each zig pass starts with a carry equal to 1, i.e., starts folding from the bottom row. In the first zig pass, the first module A (see Figure 6) folds into the brick A01, encoding the bit 1=0+1 with no carry propagation, as a consequence of the carry being 1 and of reading the first bit, 0, from the seed above. Note that the folding A01 ends at the top row, encoding that the carry is now 0. The reading of the bit from the seed is accompl details in Section 3.3.
Then, as illustrated in Figure 7, the next modules B, C, D, and A fold into shapes B0, C00, D0 and A00 respectively: B0 and D0 meaning that no carry is propagated (they start and end on the top row); and C00 and A00 meaning that the (input) carry is 0 and the bit read from the seed is 0.
Finally, as illustrated in Figure 8, the last module, B, of the zig pass binds to the 3-beads long carriage-return pattern at the leftmost part of the seed and folds into the shape BT conducing the molecule to go down by 6 rows, reverse direction and start the following zag pass. Note that the bottom of the shape BT contains the exact same carriage-return pattern.
The first zag pass (→). The zag pass is mostly a normalization pass as illustrated in Figure 9 and Figure 10. It progresses from left to right and computes the new value of each bit by rewriting each shape A00 and A11 as C0, C00 and C11 as A0, A10 and A01 as C1, and C10 and C10 as A0. Shapes A0 and C0 encode 0, and Shapes A1 and C1 encode 1, both to be read during the upcoming zig pass. Modules B and D just fold into the shapes B2 and D2 respectively, encoding nothing but padding.
Finally, as illustrated in Figure 11, the last module, D, of the zag pass binds to the 3-beads long carriage-return pattern in the rightmost part of the seed and folds into the shape DT conducing the molecule to go down by 6 rows, reverse direction and start the next zig pass. Note that, as for the shape BT, the bottom of the shape DT contains the exact same carriage-return pattern.
Figure 12 shows the 3-bits counter folded upto the value 3=011 in binary. One can observe the shape A11 in the second zig pass. A11 corresponds to reading a 1 with a carry 1 which propagates the carry: indeed, the folding ends at the bottom row which propagates the carry to the next module C which folds into C01 as it reads a 0 from above with carry 1. Note that shape A11 is then rewritten as C0 in the following zag pass below.
3.3. How Does Computation Take Place: Modules, Functions, States and Environment
Each module A, B, C or D implements various “functions” that are “called” when the molecule is folded. Which function is called depends on two things:
the current “state” of the molecule:
here, the state is whether the carry is 0 or 1. As mentionned earlier this is encoded in the position of the molecule when the module starts to fold: it starts in the top row if the carry is 0; in the bottom row if the carry is 1.
the local environment of the molecule:
the environment, i.e., the beads already placed around the current area where the folding takes place, acts as the memory in the computation.
The position where the folding of a module starts, determines which beads of a given module will be exposed to and interact with the environment. Then, by creating bonds (or not) with the environment, each module will adopt a specific shape. Therefore, the possible binding schemes will be different depending on this initial position. Similarly, depending on the beads already placed in the environment, the part of the module exposed to it will adopt one form or another depending on how many bonds it can create with the environment. Adopting the language of computer science: the position at which a module starts to fold, determines which function of the module is called; the function then reads the input encoded by the beads already placed in the environment.
Figure 13 provides a precise description on how the function of the Half-Adder Module C are implemented in the zig pass. As the zig pass goes from right to left, the figure is meant to be read from right to left. In the zig pass, Module C implements two functions: (1) Add 1 to the bit above and propagate the carry if needed, or (2) Copy the bit above unchanged. Add is called if the carry is 1 at the beginning of the folding and Copy is called if the carry is 0. The following step-by-step description of the folding explains how:
Beads 30–33 (rightmost column in Figure 13):
if the carry is 0 at start, then bead 30 is able to bind with beads 11 and 12 from the environment and depending on whether the input encodes a bit 0 or 1, bead 32 will be able to bind either to 28 or to 5 and 6 respectively. Whereas if the carry is 1, then bead 30 cannot reach the beads 11 and 12. Thus, these are beads 31 and 32 that will bind with beads 10 and 12 from the environment, giving to the molecule a completely different shape.
Beads 34–37 with carry = 0 at start:
as bead 34 is attracted by both beads 30 and 31, the molecule folds upon itself similarly but with a different rotation depending on whether it has read a 0 or a 1 in the environment above: vertically if it has read a 0, horizontally if it has read a 1. Bead 36 attracted by beads 9 and 27 fixes the end of the tip in place leaving bead 37 free to move for now.
Beads 34–37 with carry = 1 at start:
Bead 34 is attracted by beads 9 and 10 encoding a bit 0 above which allows beads 36 and 37 to bind with 31 as well, but prefers to bind with 31 together with 35 otherwise. This induces two different shapes: the beginning of a “wave” pattern (
) if the bit read above is 0; or the beginning of a “switchback” pattern (
Beads 38–41, without carry propagation (carry = 0, or carry = 1 and bit read = 0):
in these three cases the folding of the beads 38–41 starts from the same position. As the environment is different for each of them, we could design the rule so that this part of the module prefers to adopt the same shape, climing along the part already folded to the top row to start the next module with a carry =0 .
Beads 38-41, with carry propagation (carry = 1 and bit read = 1):
because the switchback pattern is upside down in this case, bead 37 stays low and bead 38 can firmly attach to the top with beads 5, 6 and 33, and the tip of the module folds downwards as 40 and 41 are attracted by 37. This ensures that the folding of the module ends at the bottom row, propagating the carry =1 to the next module.
Note that the bottom rows of the four resulting foldings differ significantly: 39–38–37–30 for C01, 39–38–33–32 for C00, 39–38–37–36 for C10, and 41–36–35–30 for C11. This will allow to distinguish between them in the following zag pass to write the correct bit for the new value of the counter.
4. Proof of Correctness (Theorem 1) 4.1. Overview
As mentioned earlier, each 60-period of the molecule is semantically divided into four modules:
1. Module A (beads 0–11): the First Half-Adder
2. Module B (beads 12–29): the Left-Turn module
3. Module C (beads 30–41): the Second Half-Adder
4. Module D (beads 42–59): the Right-Turn module
Depending on the environment in which they fold upon themselves, each of these modules may adopt the following different configurations. We call bricks the various configurations each module will adopt in the final folding, as they are the bricks upon which the whole folding is built.
There are six bricks for Modules A and C:
1. A xc and C xc in the zig pass where x,c∈{0,1} are the bit read from the brick above (Ax or Cx) and the (input) carry from the preceding module (Bc or Dc);
2. Ay and Cy in the zag pass where y is the bit written: namely y=x+cmod2 if the brick above is A xc or C xc .
There are four bricks for Modules B and D:
1. B c′ and D c′ in the zig pass where c′∈{0,1} is the carry output by the preceding brick C xc or A xc , namely c′=x∧c ;
2. and B2 and D2 in the zag pass.
The proof works in three stages: (1) we describe the targeted final folding and show that this target folding implements correctly a binary counter: i.e., that the bricks implement correctly the relationships between the y, x, c and c′ described above. Then, we show that the molecule folds indeed as prescribed by the target folding. We proceed in two more stages: (2) we enumerate all the possible surroundings for each module in the target folding; and (3) we show by recurrence that each module folds into the desired brick in each of the possible surroudings. For stage 3, our program produces automatically a human-readable folding certificate, which shows the correctness of each step of the folding in each possible environment. It consists in displaying in a compact but readable way, all the possible extensions of the current configuration, and displaying for each of them the number of bonds created (the number in the north-east corner); the maximum bonding configurations are displayed in bold. For readability, we group together extensions when they share a common path up to their last bond (the number of paths grouped is then displayed in the south-east corner for cross-checking). The resulting enumeration is displayed as a tree where only the maximum bond configuration (in bold) gives birth to configurations at the next level. Following the bold path in the tree certificate shows the folding adopted by the molecule. See Figure 14 and Figure 15 for two examples of folding certificates for Brick A01 in the surrounding consisting of the bricks B2, C0, D2 and D1, and for Brick A11 in the surrounding consisting of the bricks B2, C1, D2 and D1. Figure 14 corresponds to the half-adder A reading a 0 from above (Brick C0) and a carry from the right (Brick D1); whereas Figure 15 corresponds to the half-adder A reading a 1 from above (Brick C1) and a carry from the right (Brick D1). Note that the output configuration is exiting the region from the top for Brick A01 (no carry is propagated) and from the bottom for Brick A11 (a carry is propagated).
In the next subsections, we use our new tools to prove the correctness of the Oritatami System presented in Section 3. Precisely, we show that, starting from a proper seed of length 21+20k , the 60-periodic molecule (0,…,59)ω folds upon itself into 2·22k+1−1 rows of height 3 implementing a (2k+1) -bits counter counting from 0 to 22k+1 .
4.2. Description of the Final Configuration (I.e., the Resulting Folding)
Let us first describe each of the possible bricks for each module:
1. Module A, First Half-Adder (beads 0–11):
2.
Module B, Left-Turn module (beads 12–29)
3.
Module C, Second Half-Adder (beads 30–41)
4.
Module D, Right-Turn module (beads 42–59)
Note the similarities between the two half-adders A and C in their folding. Using two similar but different modules A and C allow to avoid interference and simplify the design.
The final configuration. In order to program the molecule, we have partitioned the triangular grid into regions that will be populated each by a module adopting one of the brick configurations above. The regions and the bricks populating them are displayed in Figure 16. With the exception of the seed region on top, the regions consist in parallelograms of size 4×3 , 6×3 or 3×6 organized into 2×22k+1 rows of height 3 and 4k+3 columns of widths 3 or 6. The regions are to be populated as follows to implement a (2k+1) -bits counter, as illustrated on Figure 16:
1. The rightmost and leftmost columns have width 3:
-
The leftmost column consists in 22k+1 (3×6) -regions all populated with the brick BT (Left Turn).
-
The rightmost column consists in 22k+1 (3×6) -regions all populated with the brick DT (Right Turn).
2. The 4k+1 inner columns consist in 4×3 -parallelogram regions if odd and 6×3 -parallelogram regions if even. The rows consist of an alternation of Zig- and Zag-rows to be read from right to left and from left to right, respectively. The rows 2i+1 and 2i+2 take care of reading i in binary from the row 2i above, incrementing it in row 2i+1 , and writing i+1 in row 2i+2 . In order to describe precisely the folding, let us denote by ij the jth lowest weight bits of i∈N when written in binary and by ρi the position of the lowest-weight 0-bit of i: ρi=min{j:ij=0} . ρi is the position up to which the carry propagates when one increments i: ρ=(1,2,1,4,1,2,1,4,1,2,1,8,1,2,…) . Precisely, the region in the pth inner row, 1⩽p⩽2·22k+1−1 and the qth inner column, 1⩽q⩽4k+1 is:
-
if p=2i+1 is odd and q=2r is even: a 3×6 -parallelogram populated with Brick Kc , where:
*
K=B if r is even, and K=D if r is odd;
*
c=1 if r/2<ρi (there still is carry to propagate), and c=0 if r/2≥ρi (there is no more carry to propagate).
-
if p=2i and q=2r are even: a 3×6 -parallelogram populated with Brick B2 if r is even, or D2 if r is odd.
-
if p=2i is even and q=2r+1 is odd: a 4×3 -parallelogram populated with Brick A ir if r is odd and C ir if r is even.
-
if p=2i+1 and q=2r+1 are odd: a 4×3 -parallelogram populated with Brick Kirc where:
*
K=A if r is even, and K=C if r is odd;
*
c=1 if r/2⩽ρi (there still is carry to propagate), and c=0 if r/2>ρi (there is no more carry to propagate).
3. The seed row on top consists in the bottom row of the brick sequence BT,C0,(D2,A0,B2,C0 )k ,DT.
4. The leftmost region of the last row is where the folding stops (counter capacity exceeded).
4.3. Input and Output Nascent Configurations
Recall that the inertial dynamics extends only the favored nascent configurations from the previous time step (i.e., the one that had the maximum number of bonds). We call this set of configurations, the output nascent configurations of the previous time step, or the input nascent configurations of the present time step. In this section, we list all the possible input/output nascent configurations according to our design.
The lemmas in Supplementary Materials Section S.1.2 sum up the results and yield by a simple induction that the construction is correct. Let us denote α,α′,α′′,α′′′,β,β′,γ,γ′,θ,θ′,λ2,λ2′,λ3,λ3′,λ4,λ4′,λ5,μ,μ′ all the possible sets of output nascent configurations for all bricks as illustrated in Figure 17. Note that output nascent configurations β,β′,γ,γ′,θ,θ′ do not represent a single configuration but a set of configurations as the position of the last bead d is not fixed.
Figure 18 illustrates the results proven in the following lemmas and demonstrates that the induction is correct and proves that the bricks fold one after the other so that the molecule folds indeed into the claimed final configuration presented in Figure 16.
Proof of Correctness. The proof of Theorem 1 proceeds by induction: it is enough to show that each module folds into the expected brick in each region. The folding of each module depends on its environment, i.e., on the bricks already folded nearby and on the current minimum energy configurations output by the previous step. First, Supplementary Materials Section S.1.1 enumerates all the possible environments for each module. Then, Supplementary Materials Section S.1.2 shows that if each module folds as expected, then Theorem 1 is correct. Finally, Supplementary Materials Section S.2 provides all the folding certificates proving that each module folds as expected in every possible environment which concludes the proof of correctness of our 60-beads long periodic oritatami system implementing a binary counter using inertial dynamics.
5. Rule Design Is NP-Hard and FPT
Our second main result concerns the design of a rule for achieving a set of given foldings depending on the environment.
The rule design problem (RDP) consists in the following:
Input:
Two disjoint sets of bead types B and {1,…,n} , a delay δ , k seed configurations σ1,…,σk of sequences s1,…,sk∈B* (with possibly different lengths) and k target configurations c1,…,ck of sequence p=〈1,…,n〉 of length n, and a dynamics D∈{O,I} .
Output:
A rule
⊆(B⊔{1,…,n})2 such that for all i=1..k , the Oritatami system O=(p,
Figure 1.An RNA molecule folding over itself while being transcribed, as the experiments in [29].
Figure 2.The design of a square, co-transcriptionally folded with RNA, and the corresponding path on the triangular lattice.
Figure 3.The rule
Figure 4.A configuration encoded by30↖NW31↖NW32E→33SE↘34↗NE35E→36SE↘37W←38SW↙39E→40E→41.
Figure 5.The seed configuration for the 3-bits counter encoding the three bits 000 as the initial value of the counter.
Figure 6.The folding of the first module, A: starting with a carry 1, encoded by the position of the first bead (on the bottom row), this module “reads” a 0 from the seed by binding to the seed, and folds into A01, encoding a 1 with no carry propagation, as encoded by the position of the last bead (on the top row of the module).
Figure 7.The folding of the central part of the first zig pass in the 3-bits counter.
Figure 8.In our construction, the leftmost three beads of any configuration are different from the other beads the left U-turn module binds to inside the zig or zag pass: when the left U-turn module folds next to these bead types, it “triggers” the production of an actual U-turn.
Figure 9.During the zag pass, all modules start from the bottom row, computing the value of each new bit by rewriting shapes A00 and A11 as C0, C00 and C11 as A0, A10 and A01 as C1, and C10 and C10 as A1.
Figure 10.At the end of the first zag pass, the new value of each bit have been encoded into shapes: A0 or C0 for bits equal to 0, A1 or C1 for bit equal to 1.
Figure 11.Finally, at the end of the first zag pass, the last module D binds to the carriage-return pattern in the seed and fold into the shape DT to accomplish the right U-turn from which the next zig pass can start.
Figure 12.The folding of the 3-bits counter upto value3=011in binary. Observe the carry propagation in the second zig pass.
Figure 13.An illustration of how the module C applies a different function which results in different foldings according to the initial state of the molecule (carry = 0 or 1) at the beginning of the folding of the module, and to the environment (the bit 0 or 1 encoded) read above by the function. This figure is meant to be read from right to left (zig pass ←).
Figure 14.The folding certificate for the brick A01 in the environment:
Figure 15.The folding certificate for the brick A11 in the environment:
Figure 16.Triangular grid partition into regions populated with the proper bricks. Color coding: Seed-row in orange; Zig-rows (←) in yellow; Zag-rows (→) in blue; Left Turns (↪) in pink; and Right Turns (↩) in green.
Figure 17.Notation for the various possible sets of output nascent configurations.
Figure 18.The brick automaton illustrating how Lemmas 1–10 (Supplementary Materials) work together to prove Theorem 1 by induction.
Figure 19.The polynomial-time reduction for 3-SAT to the rule design problem for delayδ=1 . The seed configurations are linked in orange, and the target configurations are linked in turquoise. (a) The seed-target configuration pair for clauseℓi∨ℓj∨ℓk: it is deterministically foldable iff 1 is attracted by at least one ofℓi,ℓj,ℓk ; (b) The seed-target configuration pair for variablexi : it is deterministically foldable iff 1 is attracted by r and 1 is attracted by at most one of z,xiandxi¯.
Figure 20.The reduction for 3-SAT to the rule design problem for delayδ≥2 . (a) The seed-target configuration pair for clauseℓi∨ℓj∨ℓk: it is deterministically foldable iffδis attracted by at least one ofℓi,ℓj,ℓk . (b) The seed-target configuration pair for variablexi: it is deterministically foldable iffδ is attracted by r andδis attracted by at most one ofxiandxi¯.
Supplementary Materials
The following are available online at https://www.mdpi.com/1422-0067/20/9/2259/s1.
Author Contributions
All authors contributed to: conceptualization, methodology, validation, formal analysis, investigation, resources, writing-original draft preparation and review and editing, supervision, project administration. Software and visualization, P.-É.M. and N.S.
Funding
C. Geary is a Carlsberg Postdoctoral Fellow supported by the Carlsberg Foundation. N. Schabanel is supported by Grants ANR-12-BS02-005 RDAM, IXXI MOLECAL, IXXI CALCASMOL, CNRS MoPrExProgMol and CNRS AMARP grants. S. Seki is in part supported by the Academy of Finland, Postdoctoral Researcher Grant 13266670/T30606, JST Program to Disseminate Tenure Tracking System, MEXT, Japan, No. 6F36 and JSPS KAKENHI Grant-in-Aid for Young Scientists (A) No. 16H05854 and for Challenging Research (Exploratory) No. 18K19779.
Acknowledgments
The authors thank Abdulmelik Mohammed, Andrew Winslow, Damien Woods for discussions and encouragements.
Conflicts of Interest
The authors declare no conflict of interest.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
1. Rothemund, P.W.K. Folding DNA to create nanoscale shapes and patterns. Nature 2006, 440, 297-302.
2. Yurke, B.; Turberfield, A.J.; Mills, A.P.; Simmel, F.C.; Neumann, J.L. A DNA-fuelled molecular machine made of DNA. Nature 2000, 406, 605-608.
3. Evans, C.G. Crystals that Count! Physical Principles and Experimental Investigations of DNA Tile Self-Assembly. Ph.D. Thesis, California Institute of Technology, Pasadena, CA, USA, 2014.
4. Seeman, N.C. Nucleic-acid junctions and lattices. J. Theor. Biol. 1982, 99, 237-247.
5. Winfree, E. Algorithmic Self-Assembly of DNA. Ph.D. Thesis, Caltech, Pasadena, CA, USA, 1998.
6. Cannon, S.; Demaine, E.D.; Demaine, M.L.; Eisenstat, S.; Patitz, M.J.; Schweller, R.; Summers, S.M.; Winslow, A. Two Hands Are Better Than One (up to constant factors). In Proceedings of the 30th International Symposium on Theoretical Aspects of Computer Science, Kiel, Germany, 27 February-2 March 2013; pp. 172-184.
7. Chen, H.L.; Doty, D. Parallelism and Time in Hierarchical Self-Assembly. In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms, Kyoto, Japan, 17-19 January 2012; pp. 1163-1182.
8. Fujibayashi, K.; Zhang, D.Y.; Winfree, E.; Murata, S. Error suppression mechanisms for DNA tile self-assembly and their simulation. Nat. Comput. 2009, 8, 589-612.
9. Cook, M.; Fu, Y.; Schweller, R.T. Temperature 1 Self-Assembly: Deterministic Assembly in 3D and Probabilistic Assembly in 2D. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, CA, USA, 23-25 January 2011.
10. Woods, D.; Doty, D.; Myhrvold, C.; Hui, J.; Zhou, F.; Yin, P.; Winfree, E. Diverse and robust molecular algorithms using reprogrammable DNA self-assembly. Nature 2019, 567, 366-372.
11. Afonin, K.A.; Bindewald, E.; Yaghoubian, A.J.; Voss, N.; Jacovetty, E.; Shapiro, B.A.; Jaeger, L. In vitro Assembly of Cubic RNA-Based Scaffolds Designed in silico. Nat. Nanotechnol. 2010, 5, 676-682.
12. Afonin, K.A.; Kireeva, M.; Grabow, W.W.; Kashiev, M.; Jaeger, L.; Shapiro, B.A. Co-transcriptional Assembly of Chemically Modified RNA Nanoparticles Functionalized with siRNAs. Nano Lett. 2012, 12, 5192-5195.
13. Li, M.; Zheng, M.; Wu, S.; Tian, C.; Weizmann, Y.; Jiang, W.; Wang, G.; Mao, C. In vivo production of RNA nanostructures via programmed folding of single-stranded RNAs. Nat. Commun. 2018, 9, 2196.
14. Schwarz-Schilling, M.; Dupin, A.; Chizzolini, F.; Krishnan, S.; Mansy, S.S.; Simmel, F.C. Optimized Assembly of a Multifunctional RNA-Protein Nanostructure in a Cell-Free Gene Expression System. Nano Lett. 2018, 18, 2650-2657.
15. Zuker, M.; Sankoff, D. RNA secondary structures and their prediction. Bull. Math. Biol. 1984, 46, 591-621.
16. Mathews, D.H.; Disney, M.D.; Childs, J.L.; Schroeder, S.J.; Zuker, M.; Turner, D.H. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc. Natl. Acad. Sci. USA 2004, 101, 7287-7292.
17. Mathews, D. Revolutions in RNA secondary structure prediction. J. Mol. Biol. 2006, 359, 526-532.
18. Rivas, E. The four ingredients of single-sequence RNA secondary structure prediction. A unifying perspective. RNA Biol. 2013, 10, 1185-1196.
19. Jabbari, H.; Condon, A. A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures. BMC Bioinform. 2014, 15, 147.
20. Dill, K. Theory for the folding and stability of globular proteins. Biochemistry 1985, 24, 1501-1509.
21. Unger, R.; Moult, J. Finding the lowest free energy conformation of a protein is an NP-hard problem: Proof and implications. Bull. Math. Biol. 1993, 55, 1183-1198.
22. Paterson, M.; Przytycka, T. On the complexity of string folding. In Proceedings of the 23rd International Colloquium on Automata, Languages, and Programming, Paderborn, Germany, 8-12 July 1996; Meyer, F., Monien, B., Eds.; Springer: Berlin/Heidelberg, Germany, 1996; Volume 1099, pp. 658-669.
23. Atkins, J.; Hart, W.E. On the Intractability of Protein Folding with a Finite Alphabet of Amino Acids. Algorithmica 1999, 25, 279-294.
24. Berger, B.; Leighton, T. Protein folding in the hydrophobic-hydrophilic (HP) model is NP-complete. J. Comput. Biol. 1998, 5, 27-40.
25. Crescenzi, P.; Goldman, D.; Papadimitriou, C.; Piccolboni, A.; Yannakakis, M. On the complexity of protein folding. J. Comput. Biol. 1998, 5, 423-465.
26. Aichholzer, O.; Bremner, D.; Demaine, E.D.; Meijer, H.; Sacristán, V.; Soss, M. Long proteins with unique optimal foldings in the H-P model. Comput. Geom. 2003, 25, 139-159.
27. Boyle, J.; Robillardl, G.; Kim, S. Sequential folding of transfer RNA. A nuclear magnetic resonance study of successively longer tRNA fragments with a common 5' end. J. Mol. Biol. 1980, 139, 601-625.
28. Frieda, K.L.; Block, S.M. Direct observation of cotranscriptional folding in an adenine riboswitch. Science 2012, 338, 397-400.
29. Geary, C.; Rothemund, P.W.K.; Andersen, E.S. A Single-Stranded Architecture for Cotranscriptional Folding of RNA Nanostructures. Science 2014, 345, 799-804.
30. Geary, C.; Chworos, A.; Verzemnieks, E.; Voss, N.R.; Jaeger, L. Composing RNA Nanostructures from a Syntax of RNA Structural Modules. Nano Lett. 2017, 17, 7095-7101.
31. Jepsen, M.D.E.; Sparvath, S.M.; Nielsen, T.B.; Langvad, A.H.; Grossi, G.; Gothelf, K.V.; Andersen, E.S. Development of a Genetically Encodable FRET System using Fluorescent RNA Aptamers. Nat. Commun. 2018, 9, 18.
32. Doty, D.; Lutz, J.H.; Patitz, M.J.; Schweller, R.T.; Summers, S.M.; Woods, D. The tile assembly model is intrinsically universal. In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science, New Brunswick, NJ, USA, 20-23 October 2012; pp. 439-446.
33. Meunier, P.E.; Patitz, M.J.; Summers, S.M.; Theyssier, G.; Winslow, A.; Woods, D. Intrinsic universality in tile self-assembly requires cooperation. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, Portland, OR, USA, 5-7 January 2014.
34. Rothemund, P.W.K.; Papadakis, N.; Winfree, E. Algorithmic Self-Assembly of DNA Sierpinski Triangles. PLoS Biol. 2004, 2, e424.
35. Fujibayashi, K.; Hariadi, R.; Park, S.H.; Winfree, E.; Murata, S. Toward Reliable Algorithmic Self-Assembly of DNA Tiles: A Fixed-Width Cellular Automaton Pattern. Nano Lett. 2008, 8, 1791-1797.
36. Wei, B.; Dai, M.; Yin, P. Complex shapes self-assembled from single-stranded DNA tiles. Nature 2012, 485, 623-626.
37. Geary, C.; Meunier, P.E.; Schabanel, N.; Seki, S. Programming Biomolecules That Fold Greedily During Transcription. In Proceedings of the 41st International Symposium on Mathematical Foundations of Computer Science, Krakow, Poland, 22-26 August 2016; Volume 58, p. 43.
38. Masuda, Y.; Seki, S.; Ubukata, Y. Towards the Algorithmic Molecular Self-assembly of Fractals by Cotranscriptional Folding. In Proceedings of the 23rd International Conference on Implementation and Application of Automata, Charlottetown, PE, Canada, 30 July-2 August 2018; Volume 10977, pp. 261-273.
39. Demaine, E.; Hendricks, J.; Olsen, M.; Patitz, M.J.; Rogers, T.A.; Nicolas, S.; Seki, S.; Thomas, H. Know When to Fold 'Em: Self-assembly of Shapes by Folding in Oritatami. In Proceedings of the 24th International Conference on DNA Computing and Molecular Programming, Jinan, China, 8-12 October 2018; Volume 11145, pp. 19-36.
40. Han, Y.S.; Kim, H. Construction of Geometric Structure by Oritatami System. In Proceedings of the International Conference on DNA Computing and Molecular Programming, Jinan, China, 8-12 October 2018; Volume 11145, pp. 173-188.
41. Han, Y.S.; Kim, H.; Ota, M.; Seki, S. Nondeterministic seedless oritatami systems and hardness of testing their equivalence. Nat. Comput. 2018, 17, 67-79.
42. Geary, C.; Meunier, P.E.; Schabanel, N.; Seki, S. Proving the Turing Universality of Oritatami Co-Transcriptional Folding. In Proceedings of the 29th International Symposium on Algorithms and Computation, Taiwan, China, 16-19 December 2018; Volume 123, p. 23.
43. Ota, M.; Seki, S. Rule Set Design Problems for Oritatami Systems. Theor. Comput. Sci. 2017, 671, 26-35.
44. Han, Y.S.; Kim, H.; Seki, S. Transcript Design Problem of Oritatami Systems. In Proceedings of the International Conference on DNA Computing and Molecular Programming, Jinan, China, 8-12 October 2018; Volume 11145, pp. 139-154.
45. Han, Y.S.; Kim, H. Ruleset Optimization on Isomorphic Oritatami Systems. In Proceedings of the 23rd International Conference on DNA Computing and Molecular Programming, Austin, TX, USA, 24-28 September 2017; Volume 10467, pp. 33-45.
46. Rogers, T.A.; Seki, S. Oritatami System: A Survey and the Impossibility of Simple Simulation at Small Delays. Fund. Inform. 2017, 154, 359-372.
47. Woods, D.; Chen, H.L.; Goodfriend, S.; Dabby, N.; Winfree, E.; Yin, P. Active Self-Assembly of Algorithmic Shapes and Patterns in Polylogarithmic Time. In Proceedings of the ITCS 2013: Innovations in Theoretical Computer Science, Berkeley, CA, USA, 10-12 January 2013; pp. 353-354.
1Computer Science Computation and Neural Systems Bioengineering Caltech, MS 136-93, Moore Building, Pasadena, CA 91125, USA
2Computer Science Dept, Hamilton Institute, Maynooth University, Co. Kildare, Ireland
3CNRS, École normale supérieure de Lyon (LIP), CEDEX 07, 69364 Lyon, France
4Computer and Network Engineering Dept, University of Electro-Communications, 1-5-1, Chofugaoka, Chofu, Tokyo 1828585, Japan
*Author to whom correspondence should be addressed.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2019. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
First pioneered by Seeman [4], DNA nanotechnologies only really started to take off once a computer science model was devised by Winfree [5] to program molecular self assembly in a computer science way. In particular, it focuses on the co-transcriptional nature of RNA folding, whereby molecules fold concurrently while being transcribed (see Figure 1): in computer science terms, the folding process is a local energy optimization, or otherwise put, a greedy algorithm. [...]this result helps understanding the computational complexity of sequence programming. The FPT algorithm (Theorem 3) has brought about useful modules not only for the binary counter but also for subsequent oritatami systems including the above-mentioned Heighway dragon fractal assembler, Satisfiability (SAT) tautology checker [41], and the time-efficient universal Turing machine simulation in [42].
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer