Sort | SMT-LIB | C++ API | |
---|---|---|---|
String Primitives | |||
String literal | String |
Any sequence of printable ASCII characters enclosed in double quotes (“” ),
such as “example123” . A double-quote can be escaped as \” .
See the SMT-LIB 2.5 specification for details. |
Z3_mk_str |
String variable | String |
(declare-const X String) , or equivalently,
(declare-fun X () String) |
|
String concatenation | String x String -> String |
(str.++ "exa" "mple") |
Z3_mk_str_concat |
String length | String -> Int |
(str.len "abc123") |
Z3_mk_str_length |
String equality | String x String -> Bool |
(= X "abc") |
|
Higher-Level Functions | |||
Character at index | String x Int -> String |
(str.at "abc" 2) |
Z3_mk_str_at |
Substring | String x Int x Int -> String |
(str.substr "abcdef" start count) |
Z3_mk_str_substr |
Prefix predicate | String x String -> Bool |
(str.prefixof sPrefix sFull) |
Z3_mk_str_prefixof |
Suffix predicate | String x String -> Bool |
(str.suffixof sSuffix sFull) |
Z3_mk_str_suffixof |
Contains predicate | String x String -> Bool |
(str.contains sHaystack sNeedle) |
Z3_mk_str_contains |
Index of occurrence | String x String x Int -> Int |
(str.indexof sHaystack sNeedle iStart) |
Z3_mk_str_indexof |
Single replacement | String x String x String -> String |
(str.replace sBase sTarget sReplacement) |
Z3_mk_str_replace |
String to integer conversion | String -> Int |
(str.to-int "123") |
|
Integer to string conversion | Int -> String |
(str.from-int 123) |
|
Regular Expressions | |||
String to regex (match literal) | String -> Regex |
(str.to.re "abc123") |
Z3_mk_str_to_regex |
Regular language membership | String x Regex -> Bool |
(str.in.re S R) |
Z3_mk_str_in_regex |
Regex concatenation | Regex x Regex -> Regex |
(re.++ R1 R2) |
Z3_mk_regex_concat |
Regex union | Regex x Regex -> Regex |
(re.union R1 R2) |
Z3_mk_regex_union |
Regex Kleene star | Regex -> Regex |
(re.* R) |
Z3_mk_regex_star |
Regex plus | Regex -> Regex |
(re.+ "abc123") |
Z3_mk_regex_plus |
Regex character range | String x String -> Regex |
(re.range "a" "z") |
Z3_mk_regex_range |
Bit-Vector Integration (Chapter TBD) | |||
Library-Aware String Methods (Chapter TBD) |
String literals are expressed as a sequence of ASCII printable characters enclosed in double quotes.
In SMT-LIB 2.5 syntax, a double quote is escaped inside a string as ""
. All other escape sequences are interpreted in the usual way,
e.g. the sequence \t
is expanded to a tab character, the sequence \\
is expanded to a single backslash, etc.
In addition, the character sequence \xNN
is a hexadecimal escape, where NN
is a two-digit hexadecimal constant
(using any of 0-9a-fA-F as hex digits) and the sequence is converted to the single ASCII character having the given hexadecimal value.
Examples:
The following are all valid SMT-LIB 2.5 string literals:
"abc"
"123;456"
"say ""hello"" everyone"
The following is also a valid SMT-LIB 2.5 string literal:
"field1\tfield2"
It is expanded to the following sequence of characters (spaces added for clarity):
f i e l d 1 TAB f i e l d 2
The string literals “x41” and “A” represent identical string constants, even though they are written two different ways.
Variables may be declared with standard SMT-LIB 2.5 syntax, i.e.
(declare-const X String)
for a variable X of sort String
,
or equivalently, (declare-fun X () String)
.
Two strings may be concatenated with the str.++
operator.
The interpretation of this operator is equal to a string formed by appending
the characters in the second argument to the characters in the first argument,
keeping them in the same relative order.
Examples:
(str.++ "abc" "def") ::= "abcdef"
(str.++ "123" "") ::= "123"
The length of a string may be expressed using the str.len
operator.
The length of a string is equal to the number of characters in the string.
This length is of sort Int
and as a result
is an arbitrary-precision natural number.
Examples:
(str.len "") ::= 0
(str.len "abcdef") ::= 6
For information on alternative length semantics that model fixed-precision overflow such as in C/C++ string operations, please refer to Chapter TBD.
Equality between any two terms of String
sort may be asserted with the =
operator.
Two strings are considered equal iff they contain the same characters appearing in the same order.
Examples:
(= (str.++ "abc" "") (str.++ "" "abc")) ::= True
(= (str.++ "abc" "def) ("abcdefGHI)) ::= False
The term (str.at Str Idx)
selects the single character at integer index Idx
in the string Str
. The first character in a (non-empty) string occurs at index 0.
If Idx
is less than 0 or would index a character past the end of the string,
the value of this term is the empty string ""
. Otherwise,
the value of this term is a string of length 1 corresponding to the index character in Str
.
Examples:
(str.at "abcde" 2) ::= "c"
(str.at "abcde" 5) ::= ""
(str.at "abcde" -1) ::= ""
The term (str.substr Str Start Len)
denotes the substring of Str
of length Len
,
taken beginning at index Start
.
If Start
is less than 0 or would denote a position past the end of the string,
or if the resulting substring would extend past the end of Str
,
the value of this term is undefined.
Examples:
(str.substr "abcde" 1 3) ::= "bcd"
The term (str.prefixof Pre Full)
is a predicate that is true iff
the string Pre
is a prefix of the string Full
.
More precisely, (str.prefixof Pre Full)
is true iff there exists a string S
such that Full == (str.++ Pre S)
.
Examples:
(str.prefixof "abc" "abcde") ::= True
(str.prefixof "cde" "abcde") ::= False
The term (str.suffixof Post Full)
is a predicate that is true iff
the string Post
is a suffix of the string Full
.
More precisely, (str.suffixof Post Full)
is true iff there exists a string S
such that Full == (str.++ S Post)
.
Examples:
(str.suffixof "cde" "abcde") ::= True
(str.suffixof "abc" "abcde") ::= True
The term (str.contains Haystack Needle)
is a predicate that is true
iff the string Haystack
contains the string Needle
.
More precisely, (str.contains Haystack Needle)
is true iff
there exist strings S
and T
such that
Haystack == (str.++ S (str.++ Needle T))
.
Examples:
(str.contains "abcde" "bc") ::= True
(str.contains "abcde" "ae") ::= False
The term (str.indexof Haystack Needle Start)
is the integer position of the first occurrence of the string Needle
in the string Haystack
at or after the integer index Start
,
if any. If Needle
is not a substring of Haystack
,
or Start
is either negative or denotes a position beyond the end of Haystack
,
or Needle
is the empty string, then the value of this term is -1.
Examples:
(str.indexof "AbcAbcAbc" "Abc" 2) ::= 3
(str.indexof "AbcAbcAbc" "xyz" 0) ::= -1
(str.indexof "AbcAbcAbc" "Abc" 10) ::= -1
The term (str.replace Base Target Replacement)
is equal to the string obtained by replacing the first occurrence of
Target
in the string Base
with the string Replacement
.
If the string Target
does not occur in Base
,
the term is equal to the unchanged string Base
.
Examples:
(str.replace "abcXYZdeXYZf" "XYZ" "999") ::= "abc999deXYZf"
(str.replace "ababc" "ab" "") ::= "abc"
(str.replace "abcdef" "xyz" "999") ::= "abcdef"
The term (str.to-int Str)
is equal to the
integer formed by the digits of the string Str
.
Conversion to an integer is performed as follows:
Str
is the empty string, the value of this term is -1.Str
contains any non-digit characters, the value of this term is -1.Str
.Note that leading zeroes are permitted but are effectively ignored.
Examples:
(str.to-int "123") ::= 123
(str.to-int "00123") ::= 123
(str.to-int "nine thousand") ::= -1
(str.to-int "-50") ::= -1
The term (str.from-int N)
is equal to the
minimal base-10 string representation of the integer N
.
Conversion to a string is performed as follows:
N
is less than 0, the value of this term is the empty string ""
.N
is equal to 0, the value of this term is the string "0"
.S
containing only the digits 0 through 9,
does not contain any leading zeroes,
and is a base-10 string representation of N
, i.e. it satisfies (str.to-int S) == N
.Examples:
(str.from-int 123) ::= "123"
(str.from-int 0) ::= "0"
(str.from-int -50) ::= ""
Note that in all regular expression terms, subexpressions of sort Regex
must be constant expressions
(i.e. they may not contain free variables, conditional expressions, etc.).
The term (str.to.re S)
defines the regular language whose only member is the string S
.
The subexpression S
must be a string constant or constant expression; otherwise, an error is raised.
The term (str.in.re S R)
denotes the predicate that the string S
is a member of the regular language R
.
Examples:
(str.in.re "abc" (str.to.re "abc")) ::= True
(str.in.re "A" (re.++ (str.to.re "A") (str.to.re "B"))) ::= False
(str.in.re "A" (re.union (str.to.re "A") (str.to.re "B"))) ::= False
The term (re.++ R1 R2)
denotes the regular language formed by concatenating any string in R1
with any string in R2
.
Examples:
(re.++ (str.to.re "A") (str.to.re "B")) ::= (str.to.re "AB")
The term (re.union R1 R2)
denotes the regular language that is the union of all strings in R1
and all strings in R2
.
Examples:
(re.union (str.to.re "A") (str.to.re "B"))
is the regular language {"A", "B"}
.
The term (re.* R)
denotes the Kleene closure of the regular language R
, i.e., the language of all finite-length strings that can be formed
by concatenating zero or more arbitrary members of R
.
Examples:
(re.* (str.to.re "A"))
is the regular language {"", "A", "AA", "AAA", ...}
.
The term (re.+ R)
denotes the language of all finite-length strings that can be formed by concatenating at least one or more
arbitrary members of R
. Unlike the Kleene star, the empty string is not a member of this language.
Examples:
(re.+ (str.to.re "A"))
is the regular language {"A", "AA", "AAA", ...}
.
The term (re.range C1 C2)
denotes a regular language containing each element in a range of ASCII characters.
The strings C1, C2
must be string literals of length 1. The character range described is all ASCII characters
from C1
up to and including C2
. It is not valid if the ASCII value of C2
is smaller than that of C1
.
Examples:
(re.range "a" "z")
is the regular language containing the characters {"a", "b", "c", ..., "x", "y", "z"}
.