CURLCAKE is software for generating Compact Unstructured RNA Libraries Covering All K-mErs.
It is a heuristic algorithm based on finding disjoint paths in a de Bruijn graph.
The RNA sequences corresponding to the paths are unstructured (i.e., likely to have high folding energy).
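
To make the graph idea concrete, here is a minimal Python sketch of a de Bruijn graph over the RNA alphabet: nodes are (k-1)-mers, each edge is a k-mer, and walking a path spells a sequence that contains one k-mer per edge. The function names are illustrative only. This is not CURLCAKE's implementation, and it ignores the folding-energy heuristic entirely.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet (an assumption of this sketch)

def de_bruijn_edges(k):
    """Map each (k-1)-mer node to its four outgoing k-mer edges."""
    edges = {}
    for node in map("".join, product(ALPHABET, repeat=k - 1)):
        edges[node] = [node + base for base in ALPHABET]
    return edges

def spell_path(kmers):
    """Spell the sequence of a path given as consecutive, overlapping k-mer edges."""
    seq = kmers[0]
    for prev, cur in zip(kmers, kmers[1:]):
        # Two edges are consecutive when the suffix of one is the prefix of the next.
        if prev[1:] != cur[:-1]:
            raise ValueError(f"{prev} and {cur} are not adjacent in the graph")
        seq += cur[-1]
    return seq

graph = de_bruijn_edges(3)
print(len(graph), sum(len(out) for out in graph.values()))  # 16 64
print(spell_path(["ACG", "CGU", "GUA"]))  # ACGUA
```

Paths that share no edge cover disjoint sets of k-mers, which is why disjoint paths map naturally onto a compact library.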

Input and Output

CURLCAKE takes eight parameters as input:

1. k - the k-mer length to cover.
2. length - length of the unique part of the probe.
3. multiplicity - how many times each k-mer has to occur.
4. attempts - a limit on the number of random extensions to try.
5. output_incomplete_file - filename to output incomplete probe sequences.
6. output_complete_file - filename to output complete probe sequences.
7. RNAshapes executable (version 2.1.6) (download below)
8. seed (optional) - seed for the random number generator. Default: 0.
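
The parameters k, length and multiplicity together imply a simple lower bound on the library size. A sequence whose unique part has `length` nucleotides contains length - k + 1 k-mers, and there are 4^k k-mers, each required `multiplicity` times. The sketch below computes that bound, assuming every k-mer in a sequence counts toward coverage. The function name is hypothetical, and the bound says nothing about how many sequences CURLCAKE actually produces.

```python
def min_sequences(k, length, multiplicity=1):
    """Lower bound on the number of sequences needed to cover all k-mers."""
    kmers_per_sequence = length - k + 1
    if kmers_per_sequence <= 0:
        raise ValueError("length must be at least k")
    required = multiplicity * 4 ** k
    # Integer ceiling division avoids floating-point rounding.
    return -(-required // kmers_per_sequence)

print(min_sequences(7, 35, 1))  # 565
```

For k = 7 and length = 35, there are 16384 k-mers and 29 k-mers per sequence, so at least 565 sequences are needed.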

It outputs two sequence files in plain-text format.

CURLCAKE was developed by Yaron Orenstein in Bonnie Berger's group at the Massachusetts Institute of Technology (MIT).

Get the software

A binary is available here (requires Java 1.7 or higher):

For RNAshapes, cite

Make sure RNAshapes has executable permission (on Unix, set it with 'chmod 755 RNAshapes').

Java code is available here:

How to use it

java -jar curlcake.jar <k> <len> <multi> <attempts> <output_incomplete_filename> <output_complete_filename> <RNAshapes_executable> <seed - optional>

Example run:

java -jar curlcake.jar 7 35 1 100 incomplete.txt complete.txt ./RNAshapes 0
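
When running CURLCAKE over several parameter settings, it can be convenient to assemble the command line programmatically. The following Python sketch builds the argument list in the order shown above. The helper name `curlcake_command` and its defaults are hypothetical, and the jar and RNAshapes paths are assumed to be in the working directory.

```python
import subprocess

def curlcake_command(k, length, multiplicity, attempts,
                     incomplete_file, complete_file,
                     rnashapes="./RNAshapes", seed=None, jar="curlcake.jar"):
    """Build the argument list for one CURLCAKE run (positional order as in the usage line)."""
    cmd = ["java", "-jar", jar,
           str(k), str(length), str(multiplicity), str(attempts),
           incomplete_file, complete_file, rnashapes]
    if seed is not None:  # seed is optional
        cmd.append(str(seed))
    return cmd

cmd = curlcake_command(7, 35, 1, 100, "incomplete.txt", "complete.txt", seed=0)
print(" ".join(cmd))
# To actually run it (requires Java, curlcake.jar and RNAshapes to be in place):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with filenames that contain spaces.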

Interpreting the output

Each output file is a text file containing the sequences. Each line contains a different sequence.
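
One way to sanity-check an output file is to count k-mers across its sequences and list any that occur fewer than `multiplicity` times. The Python sketch below assumes one sequence per line and an A/C/G/U alphabet (T is normalized to U). Both are assumptions about the file format, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import product

def read_sequences(path):
    """Read non-empty lines of a text file, assuming one sequence per line."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]

def kmer_counts(sequences, k):
    """Count every k-mer occurrence across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.strip().upper().replace("T", "U")  # treat DNA-style T as U
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def missing_kmers(sequences, k, multiplicity=1, alphabet="ACGU"):
    """Return the k-mers that occur fewer than `multiplicity` times."""
    counts = kmer_counts(sequences, k)
    return [kmer for kmer in map("".join, product(alphabet, repeat=k))
            if counts[kmer] < multiplicity]

# A 17-nucleotide sequence that contains all 16 dinucleotides exactly once.
example = ["AACAGAUCCGCUGGUUA"]
print(missing_kmers(example, 2))  # []
```

For a real run, `missing_kmers(read_sequences("complete.txt"), 7)` would apply the same check to the file from the example above, under the same format assumptions.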