Commit dcd51d31 authored by ehilin02's avatar ehilin02
Browse files

section 6

parent e49d88f4
%!TEX root = atva2021.tex
%!TEX root = tab2021.tex
%\newpage
......@@ -25,7 +25,7 @@ supports the new theory \slah.
Internally, \cspenp\ parses the input file
which shall include the definition of $\hls{}$
%ESOP 21 Reviewer 3: so this isn't built in specially? Or is it just that the tool expects the definition to be there because of the format? Does the definition have to be in syntactically exactly the form expected?
and the satisfiability or entailment queries in the SL-COMP format~\cite{SLCOMPsite}.
and the satisfiability or entailment queries in the SL-COMP format~\cite{DBLP:conf/tacas/SighireanuPRGIR19}.
%
\begin{compactitem}
\item For satisfiability queries,
......@@ -71,9 +71,9 @@ and the satisfiability or entailment queries in the SL-COMP format~\cite{SLCOMPs
\medskip
\mypar{Benchmarks.}
We generated 133 benchmarks,
We generated 190 benchmarks,
available at~\cite{CompSpenSite} (directory \texttt{samples}),
%with 66 and 67 instances for the satisfiability and entailment problem respectively. These benchmarks are
%with 88 and 102 instances for the satisfiability and entailment problem respectively. These benchmarks are
classified into four suites, whose sizes are given in Table~\ref{tab-exp}, as follows:
%%MS: not clear what is the number here, leave it to the table.
%MEM-SAT (38), MEM-ENT (38), RANDOM-SAT (28), RANDOM-ENT (29), where SAT and ENT represent satisfiability and entailment respectively, moreover,
......@@ -87,12 +87,12 @@ classified into four suites, whose sizes are given in Table~\ref{tab-exp}, as fo
or search an address inside a heap-list.
%
\item RANDOM-SAT and RANDOM-ENT are satisfiability resp. entailment problems
which are randomly generated.
which are randomly or manually generated.
Starting from the path and verification conditions for programs manipulating heap lists,
we replace some $\hls{}$ atoms with their unfoldings in order to generate
formulas with more spatial atoms.
RANDOM-SAT includes also formulas where the atoms and their start
and end address terms are generated randomly.
and end address terms are generated randomly. In addition, we generate some benchmarks manually.
This suite is motivated by testing the scalability of \cspenp.
%which usually contain more spatial atoms than those in MEM-SAT and MEM-ENT, motivated by testing the scalability of \cspenp.
\end{itemize}
......@@ -118,7 +118,7 @@ classified into four suites, whose sizes are given in Table~\ref{tab-exp}, as fo
\medskip
\mypar{Experiments.}
We run \cspenp over the four benchmark suites, using a Ubuntu-16.04 64-bit lap-top with an Intel Core i5-8250U CPU and 2GB RAM.
The experimental results are summarized in Table~\ref{tab-exp}.
The experimental results are summarized in Table~\ref{tab-exp}. We set the timeout to 30 seconds. The statistics of average time and maximum time do not include the time of timeout instances.
To the best of our knowledge, there have not been solvers that are capable of solving the \slah\ formulas that include, points-to, block, and $\hls{}$ atoms.
The solver SLar~\cite{KimuraT17} was designed to solve entailment problems involving points-to, block, and $\ls{}$ atoms. Nevertheless, we are unable to find a way to access SLar, thus failing to compare with it on ASL formulas.
Moreover, the examples used by SLar, available online, are in a format that seems nontrivial to translate into the SL-COMP format.
......@@ -134,24 +134,24 @@ To the best of our knowledge, there have not been solvers that are capable of so
%\small
\setlength{\tabcolsep}{1eX}
\renewcommand{\arraystretch}{1.2}
\begin{tabular}{| c | c | c | c | c |}
\begin{tabular}{| c | c | c | c | c | c |}
\hline
% & & Benchmark& \multicolumn{3}{c}{Time (in seconds)} \\\
Benchmark suite & $\#$instances & Avg. time (s) & Min. time (s) & Max. time (s) \\
Benchmark suite & $\#$instances & Timeout & Avg. time (s) & Min. time (s) & Max. time (s) \\
\hline
\hline
MEM-SAT & 38 & 0.05 & 0.02 &0.12 \\
MEM-SAT & 38 & 0 & 0.05 & 0.03 &0.12 \\
\hline
RANDOM-SAT & 28 & 0.04 & 0.02 & 0.09 \\
RANDOM-SAT & 50 & 0 & 0.09 & 0.02 & 0.53 \\
\hline
TOTAL & 66 & 0.05 & 0.02 & 0.12 \\
TOTAL & 88 & 0 & 0.07 & 0.02 & 0.53 \\
\hline
\hline
MEM-ENT & 38 & 2.98 & 0.73 & 9.05 \\
MEM-ENT & 43 & 0 & 3.05 & 0.34 & 9.98 \\
\hline
RANDOM-ENT & 29 & 2.04 & 0.04 & 17.99 \\
RANDOM-ENT & 59 & 7 & 6.66 & 0.04 & 29.57 \\
\hline
TOTAL & 67 & 2.57 & 0.04 & 17.99 \\
TOTAL & 102 & 7 & 5.02 & 0.04 & 29.57 \\
\hline
\end{tabular}
\end{center}
......@@ -162,14 +162,14 @@ As expected, solving entailment instances is more expensive
than solving satisfiability instances.
We recall from Section~\ref{sec:ent} that the procedure for entailment queries satisfiability of several formulas.
\cspenp\ efficiently solves the benchmark instances originated from programs,
namely MEM-SAT and MEM-ENT, with the average time in 0.05 and 2.98 seconds respectively.
namely MEM-SAT and MEM-ENT, with the average time in 0.05 and 3.05 seconds respectively.
% ESOP 21 Reviewer 3: In the evaluation, since many examples are automatically generated, it would be interesting to see what happens as the size of the example is increased. Of course, one understands the worst-case complexity from the theoretical sections, but whether performance degrades quickly in practice is less obvious given that this is a real implementation of the work which includes some heuristics for improving performance in practice.
Table~\ref{tab-exp} shows that some entailment instances
are challenging for \cspenp. %\todo{speed if size increase}
For instance, the maximum time in MEM-ENT and RANDOM-ENT suites are
9.05 and 17.99 seconds respectively.
For instance, the maximum time in MEM-ENT suite is
9.98, and there are 7 timeout instances in RANDOM-ENT suite.
%In order to pinpoint the potential reasons behind the long running time,
%we inspected these two entailment instances manually and find that
By inspected these challenging instances, we found that
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment