# Investigating Execution-Aware Language Models for Code Optimization

Repository: `SpencerLabAQ/exec-aware-code-opt`

## Install dependencies

```shell
python3 -m venv execaware
source execaware/bin/activate

pip install -r requirements.txt
```

## Configuration

Most scripts read the `config/config.ini` file, which defines all paths required for the experiments. We provide a template to fill in with your specific paths. The following paths must be defined:

  • gem5_path: Path to the gem5 simulator [1].
  • all_input_output_path: Directory containing all test cases provided by CodeNet [2]. The data for these cases is also extracted by the traces collection pipeline discussed in the subsequent section.
  • eval_sandbox_path: Directory designated for storing the evaluation metrics.
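As a minimal sketch of how these settings can be consumed, the config can be read with Python's `configparser`. The section name `[paths]` and the sample values below are assumptions; match whatever the repo's actual template uses:

```python
import configparser

# Hypothetical config.ini contents; the real template ships with the repo.
SAMPLE = """
[paths]
gem5_path = /opt/gem5
all_input_output_path = /data/codenet/all_input_output
eval_sandbox_path = /data/eval_sandbox
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

gem5_path = config["paths"]["gem5_path"]
eval_sandbox = config["paths"]["eval_sandbox_path"]
```

In the actual scripts, `config.read("config/config.ini")` would replace `read_string(SAMPLE)`.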

## Traces collection

We employed the pipeline presented in TRACED [3] to collect execution traces.

## Datasets

After collecting the execution traces, proceed by executing the following commands.

The default tracing paths are `./tracing/pretraining` for the pre-training traces and `./tracing/pie` for the fine-tuning traces. If you change these paths, you must modify and thoroughly verify all relevant dataset scripts to ensure compatibility.

In addition, the original version of the PIE [4] dataset is required to build the datasets. Its default path is assumed to be `./pie/`.

```shell
# Build all the datasets required for the experiments
bash dataset.sh
```
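Before running `dataset.sh`, it can help to confirm that the expected directories are in place. A small sanity-check sketch, covering only the paths named above (anything deeper in the layout is repo-specific):

```python
from pathlib import Path

# Directories named in this README; the scripts may expect more structure inside.
REQUIRED = ["tracing/pretraining", "tracing/pie", "pie"]

def missing_dirs(root: str, required=REQUIRED) -> list[str]:
    """Return the required trace/dataset directories absent under `root`."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).is_dir()]
```

Running `missing_dirs(".")` from the repo root before `bash dataset.sh` reports any directory that still needs to be populated.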

## Variable states quantization

Here we outline the quantization strategy adopted for the variable states, as described in the related paper.

(Figure: variable states quantization strategy)
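The exact scheme is defined in the paper; purely as an illustrative sketch (the bucket boundaries and labels below are assumptions, not the repo's actual strategy), quantizing a concrete numeric variable value into coarse categories could look like:

```python
def quantize_state(value: int) -> str:
    """Map a concrete variable value to a coarse bucket.

    The buckets below are hypothetical; the actual quantization
    strategy is defined in the accompanying paper.
    """
    if value == 0:
        return "zero"
    if value < 0:
        return "negative"
    if value <= 100:
        return "small_positive"
    return "large_positive"
```

Coarse buckets like these keep the vocabulary of variable-state labels small enough for a language model to predict.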

## Run Experiments - Model Training and Inference

```shell
# Execute all the experiments
bash run_experiment_line_executions.sh
bash run_experiment_line_coverage.sh
bash run_experiment_branch_coverage.sh
bash run_experiment_variable_states.sh
```

## Evaluation

Specify the predictions path and the sandbox name when running the evaluation, and run the script once for each prediction file produced by inference.

```shell
# Create the evaluation folder
mkdir -p ./eval

# Perform the evaluation (example: strategy S1 with line executions, LE)
bash eval_finetuning.sh "./s1_ft_LE" "./eval/LE_s1"
```
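Since the script must be run once per prediction file, a small helper can derive the sandbox path from a prediction directory name, mirroring the `s1_ft_LE` → `./eval/LE_s1` pattern in the example above. Note the naming convention is inferred from that single example and may not hold for every run:

```python
def sandbox_path(pred_dir: str, eval_root: str = "./eval") -> str:
    """Derive the eval sandbox path from a prediction directory name.

    Assumes names like "s1_ft_LE" (strategy, "ft", task), mirroring the
    single example in this README; other layouts may differ.
    """
    name = pred_dir.rstrip("/").split("/")[-1]
    strategy, _, task = name.split("_", 2)
    return f"{eval_root}/{task}_{strategy}"
```

Both the prediction path and the derived sandbox path are then passed to `eval_finetuning.sh` as in the example above.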

## References

[1] gem5 Simulator

[2] CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks

[3] TRACED: Execution-aware Pre-training for Source Code

[4] PIE: Learning Performance-Improving Code Edits (pie4perf)
