Benchmarks
We use two benchmark sets for regression testing and to demonstrate the effectiveness of Loopfrog. From the original sources we compiled our own suites to simplify the benchmarking process: we flattened the directory hierarchy, adjusted the include paths accordingly, and added labels to mark specific assertions (a sketch of such a labeled assertion follows the list below). Our test scripts are included.
- Benchmark Set A: This set is described in
  Ku K., Hart T.E., Chechik M., Lie D.: A buffer overflow benchmark for software model checkers. ASE 2007: 389-392.
  Original Source: The Verisec homepage
  Download: Our Suite
- Benchmark Set B: This set is described in
  Zitser M., Lippmann R., Leek T.: Testing Static Analysis Tools Using Exploitable Buffer Overflows From Open Source Code. FSE 2004.
  Original Source: Lincoln Labs Corpora
  Download: Our Suite
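As noted above, we only add labels to mark the assertions of interest. The following is a minimal, hand-written sketch in the style of a vulnerable test case; the buffer size, input string, and label comment are our own illustration and are not taken from either suite.

    #include <assert.h>

    int main(void)
    {
        char buf[8];  /* hypothetical fixed-size buffer */
        const char *input = "an input that is longer than the buffer";
        int i = 0;

        while (input[i] != '\0') {
            /* labeled assertion of interest to the test scripts;
               it fails once i reaches the buffer size */
            assert(i < 8);  /* buffer-bounds assertion */
            buf[i] = input[i];
            i++;
        }
        return 0;
    }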
The results of our evaluation of Loopfrog on these benchmark sets are published in [1].
Evaluation on Large-Scale Software
We also evaluate Loopfrog on a benchmark suite of large-scale open-source software. The goto-models for these benchmarks may be obtained from the goto-cc webpage. We present statistics obtained on an 8-core Intel Xeon 3 GHz machine with 16 GB of RAM; the runtime was limited to four hours and the memory to 4 GB per process.
Note: The tests include assertions over strings and array/buffer bounds. The assertions are generated from the data of the pointer analysis; unnecessary assertions are not added (and are not included in the total count).
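To give a rough idea of what such instrumented checks look like, here is a hand-written sketch; in Loopfrog the assertions are inserted automatically, and the names below (use_buffer, size) are illustrative assumptions rather than actual tool output.

    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hand-written sketch of the kinds of checks described above; 'size'
       stands for the object size that the pointer analysis associates
       with 'buf'. */
    void use_buffer(char *buf, size_t size, size_t i)
    {
        assert(i < size);                 /* array/buffer-bounds assertion */
        buf[i] = 'x';

        assert(memchr(buf, '\0', size));  /* zero-termination assertion
                                             before a string operation */
        size_t len = strlen(buf);
        (void)len;
    }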
We evaluated Loopfrog in different configurations. For the results, please choose one of the following categories; a sketch of the underlying loop summarization follows the list:
- Default domain set:
  - with assume-guarantee reasoning
  - without assume-guarantee reasoning
  - with assume-guarantee reasoning vs. interval domain
- Only two domains:
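For context, the following hand-written C sketch illustrates the loop summarization technique of [1] that these configurations vary: a loop is replaced by a nondeterministic assignment to the variables it may modify, followed by an assumption of the invariants that the abstract domains could verify. The functions nondet_uint and assume are placeholders of our own, not part of Loopfrog's output.

    unsigned int nondet_uint(void);   /* placeholder: nondeterministic value */
    void assume(int cond);            /* placeholder: restricts the analysed
                                         executions to those satisfying cond */

    /* Concrete loop to be summarized. */
    void with_loop(char *buf, unsigned len)
    {
        for (unsigned i = 0; i < len; i++)
            buf[i] = 'x';
    }

    /* Over-approximating summary: the variables the loop may modify are
       assigned nondeterministically ("havocked"), then the invariants the
       abstract domains could prove are assumed. */
    void with_summary(char *buf, unsigned len)
    {
        unsigned i = nondet_uint();   /* havoc the loop counter */
        assume(i <= len);             /* invariant, e.g. from the interval
                                         domain */
        /* the contents of buf would be havocked as well; every execution
           of the original loop is covered by some execution of the summary */
        (void)buf;
    }

If the assumed invariants are too weak, the summary admits spurious assertion failures; richer domain sets and assume-guarantee reasoning between summaries can tighten the over-approximation.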
References
- [1] Kroening D., Sharygina N., Tonetta S., Tsitovich A., Wintersteiger C.M.: Loop Summarization using Abstract Transformers. 6th International Symposium on Automated Technology for Verification and Analysis (ATVA), LNCS Volume 5311, Seoul, South Korea, pp. 111-125, Springer (2008).