Bayou

Bayou is a data-driven program synthesis system for Java that uses learned Bayesian specifications for efficient synthesis.

The approach is described in the arXiv paper on Bayou.

There are three main modules in Bayou:

driver: extracts sketches (in the DSL) and evidences from a Java program to generate the training data
model: implements the BED neural network (see paper), word embeddings, their training and inference procedures
synthesizer: performs combinatorial enumeration and concretizes a sketch sampled from the BED during inference into a Java program

Requirements

JDK 1.8
Python3 (Tested with 3.5.1)
TensorFlow (Tested with 1.2)
scikit-learn (Tested with 0.18.1)

Compiling and Running Bayou from Source on Ubuntu

1.) Download source from GitHub:

git clone https://github.com/capergroup/bayou.git

2.) Install Build Tools

cd bayou/tool_files/build_scripts
sudo ./install_deps.sh

3.) Compile Bayou

./build.sh

4.) Install Bayou Dependencies

cd out
chmod +x install_dependencies_apt.sh
sudo ./install_dependencies_apt.sh

On macOS, use install_dependencies_mac.sh instead of install_dependencies_apt.sh.

5.) Run Bayou

chmod +x start_bayou.sh synthesize.sh
./start_bayou.sh &

Wait until you see:

===================================
            Bayou Ready            
===================================

then execute:

./synthesize.sh

You should see output that ends with text similar to:

/* --- End of application --- */


import edu.rice.bayou.annotations.Evidence;
import java.io.IOException;
import java.io.FileReader;
import java.io.BufferedReader;
import java.io.FileNotFoundException;

public class TestIO1 {

    @Evidence(apicalls = {"readLine", "ready"})
    void __bayou_fill(String file) {
		String s1;
		String s;
		boolean b;
		BufferedReader br;
		FileReader fr;
		try {
			fr = new FileReader(file);
			br = new BufferedReader(fr);
			while (b = br.ready()) {
				s = br.readLine();
				s1 = br.readLine();
			}
			br.close();
		} catch (FileNotFoundException _e) {
		} catch (IOException _e) {
		}
	}

}
/* --- End of application --- */

Setup & Usage

Driver

cd /path/to/bayou/src/pl
ant

If you are working with the Android SDK,

export CLASSPATH=/path/to/android.jar

Bayou has an android.jar from Android 24 under the lib directory if needed.

After setup, run tests to ensure everything is fine:

cd scripts
python3 test_driver.py

Use the following command to run the driver on Program.java with the config file config.json:

java -jar /path/to/bayou/src/pl/out/artifacts/driver/driver.jar -f Program.java -c config.json [-o output.json]

Run driver with -h for details about the config file. The -o option can be used to output the sketch to a JSON file.

To create a single JSON file containing the entire dataset, collect the contents of the JSON files from each program into a list and make that list the value of a top-level JSON entity called "programs". For example, if you have files Program1.json, ..., Program10000.json, then the dataset should have the content:

{
  "programs": [
    <Program1.json>,
    <Program2.json>,
    ...
    <Program10000.json>
  ]
}
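The merge described above can be sketched in a few lines of Python. This is a minimal sketch, not part of Bayou: the function name merge_programs and its arguments are hypothetical, and it assumes each per-program file holds exactly one JSON value as written by the driver's -o option.

```python
import json
from pathlib import Path


def merge_programs(json_dir, output_path):
    """Combine per-program JSON files into one dataset file (hypothetical helper).

    Assumes every *.json file in json_dir holds a single program entry.
    """
    output = Path(output_path).resolve()
    programs = []
    # Sorted only to make the result reproducible; the order is not prescribed.
    for path in sorted(Path(json_dir).glob("*.json")):
        if path.resolve() == output:
            continue  # do not read a previously merged dataset back in
        with open(path) as f:
            programs.append(json.load(f))
    with open(output, "w") as f:
        json.dump({"programs": programs}, f, indent=2)
    return len(programs)
```

For example, merge_programs("json_out", "DATA.json") would write DATA.json with one "programs" list built from the files in json_out, where both names are placeholders.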

Model

First, set the PYTHONPATH environment variable:

cd /path/to/bayou/src/ml
export PYTHONPATH=`pwd`

To extract evidences from a data file DATA.json generated by the driver,

cd /path/to/bayou/src/ml/bayou/core
python3 utils.py DATA.json DATA-with-evidences.json [--max_seqs N] [--max_seq_length M]

This will create DATA-with-evidences.json with the evidences extracted from DATA.json. You can filter the programs from which evidences are extracted using the optional arguments. Run utils.py with -h for more information about these arguments.

To train LDA embeddings on evidences in the data file,

cd /path/to/bayou/src/ml/bayou/lda
python3 train.py --ntopics <N> --evidence <evidence_type> DATA-with-evidences.json --save save

where <evidence_type> is the type of evidence for which the embeddings are to be trained. As before, run train.py with -h for details about these arguments. The trained embeddings will be in the directory specified by --save (default save).

To train the BED neural network on the data file,

cd /path/to/bayou/src/ml/bayou/core
python3 train.py --config config.json DATA-with-evidences.json --save save

Run train.py with -h for details about the config file.

Note: The BED network looks for the pre-trained embeddings for each evidence type in the directory specified by --save (default save). The embeddings for each evidence type must be in a directory named "embed_<evidence_type>" within the save directory. For instance, the embeddings for types should be in save/embed_types, and the embeddings for apicalls should be in save/embed_apicalls. Copy the file(s) from wherever you saved the LDA model for each evidence type into the corresponding directory.
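The copy step in the note can be scripted. The following is a minimal sketch under stated assumptions: stage_embeddings is a hypothetical helper, not a Bayou script, and it assumes each LDA model was saved to its own directory and that copying the plain files from that directory is all that is needed.

```python
import shutil
from pathlib import Path


def stage_embeddings(lda_dirs, save_dir="save"):
    """Copy LDA files into save_dir/embed_<evidence_type> (hypothetical helper).

    lda_dirs maps an evidence type (e.g. "apicalls") to the directory
    where the LDA model for that type was saved.
    """
    staged = {}
    for evidence_type, lda_dir in lda_dirs.items():
        target = Path(save_dir) / ("embed_" + evidence_type)
        target.mkdir(parents=True, exist_ok=True)
        for item in Path(lda_dir).iterdir():
            if item.is_file():  # subdirectories are skipped in this sketch
                shutil.copy2(item, target / item.name)
        staged[evidence_type] = target
    return staged
```

A call such as stage_embeddings({"apicalls": "lda_apicalls", "types": "lda_types"}) would populate save/embed_apicalls and save/embed_types; the source directory names here are placeholders.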

Synthesizer

Suppose that the trained model is in a folder named trained. Run the server to load the trained model into memory. The server will listen on a pipe (here bayoupipe) for inference queries:

mkdir server; cd server
python3 /path/to/bayou/scripts/server.py --save /path/to/trained --pipe bayoupipe

The synthesizer requires as input a Java class with:

a method named __bayou_fill that can be empty
arguments to this method that can be used for synthesis
evidences towards synthesis with the method annotation @Evidence

See examples in test/pl/synthesizer for more information about the input format.
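As an illustration of the three requirements listed above, the Python sketch below renders a skeleton input class. It is an assumption-laden example: render_input is a hypothetical helper, and the import and @Evidence(apicalls = {...}) syntax are copied from the sample output shown earlier in this README, on the assumption that the input takes the same form. The examples in test/pl/synthesizer remain the authoritative reference.

```python
# Skeleton of a synthesizer input; doubled braces are literal braces for str.format.
JAVA_TEMPLATE = """import edu.rice.bayou.annotations.Evidence;

public class {class_name} {{

    @Evidence(apicalls = {{{apicalls}}})
    void __bayou_fill({params}) {{
    }}

}}
"""


def render_input(class_name, params, apicalls):
    """Return Java source with an empty __bayou_fill method (hypothetical helper)."""
    quoted = ", ".join('"%s"' % call for call in apicalls)
    return JAVA_TEMPLATE.format(
        class_name=class_name, params=params, apicalls=quoted
    )


if __name__ == "__main__":
    print(render_input("TestIO1", "String file", ["readLine", "ready"]))
```

The rendered text can be saved as a .java file and passed to synthesize.sh as described next.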

Use the provided scripts/synthesize.sh to run the synthesizer. First, edit this script to set the environment variables BAYOU_HOME and BAYOU_SERVER to the home folder of Bayou and the folder where you started the server, respectively. If you used a pipe name other than bayoupipe, update it in the script as well. Then, to run the synthesizer on a file Program.java:

synthesize.sh Program.java

If all went well, the synthesizer should output a set of Java programs with the body of the method __bayou_fill synthesized according to the arguments and evidences provided.

Roadmap

Model: Encode natural language evidence (Javadoc) better
Synthesizer: Extract evidence from surrounding context instead of __bayou_fill
General: Gather more training data from a larger corpus
