Effective reduction of cryptographic protocols specification for model-checking with Spin

This article presents a practical application of the Spin model checker to verifying cryptographic protocols. An efficient framework for specifying a minimized protocol model while retaining its functionality is described. Requirements for such a model are discussed, such as a powerful adversary, multiple protocol runs and a way of specifying the validated properties as formulas in temporal logic.


Introduction
A flaw in a cryptographic protocol may become a real security threat [1,2]. Even a seemingly small protocol may produce a great number of possible behaviours. One method of formally examining protocol correctness is model checking: the protocol is represented as a Büchi automaton M, every checked property is specified as an LTL temporal formula α, and the satisfaction of the formula in the model, M |= α, is checked [3,4,5,6].
The automaton of the protocol is typically generated from a higher-level description. This article focuses on representing protocol models in the Promela language of the Spin model checker. The requirements for such a model include the following.
Additional data - the information required for the logical assertions that check protocol properties must be stored. Additional information about the protocol state is also printed out and used later when producing a counterexample.
Model configuration - the description of a particular model configuration should specify any constraints on the way messages are sent between users and on the roles the users can take.
These specifications are responsible for the model's proper behaviour. While the above constraints hold, one important parameter must be minimized.
Model size - affects the amount of memory and time needed for verification. Considering the exponential complexity of the model checking problem ($O(\#(M)) = O(2^{\#(P)})$, where #(P) is the number of atomic propositions describing the states of model M [19]), this is a critical issue in practical applications.
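To give a feel for this growth, the short Python sketch below computes the upper bound 2^#(P) for a few values of #(P). It is only an illustration of the arithmetic; the function name is hypothetical and the code is not part of the original model.

```python
def max_states(num_propositions: int) -> int:
    """Upper bound on distinct states: each atomic proposition is either true or false."""
    return 2 ** num_propositions


# Each proposition removed from the model halves the bound.
for p in (10, 20, 30):
    print(p, max_states(p))
```

Under this bound, every redundant proposition that can be dropped from the model halves the worst-case number of states, which is why the reductions described later pay off.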

Representing Protocol as an Automaton
To illustrate the idea of modelling protocols as Büchi automata, this section gives an example that follows the path from a protocol description to the automaton. A clear, simple protocol is used (Fig. 1), and no reduction techniques are applied yet. This keeps the model comprehensible so that the reader can follow the general methodology. The verification process consists of the following steps.
(1) Modelling protocol - the verifier describes in the Promela language the behaviours of the protocol users and all the possible actions the adversary can take. Sample code representing the responder in the example protocol is shown in Fig. 1.
(2) Protocol as automaton - the Promela code describes an automaton. A guard and an action are associated with every transition from state to state. In the automaton in Table 1 by copying the state labels onto the outgoing arcs [6], which can be seen in Fig. 4.
(5) The verified property - all the desirable properties of the protocol are written down as LTL formulas. The formulas contain references to variables from the protocol model. Each formula is negated to denote the unsafe states and automatically transformed into a special never process in the Promela code with Spin or another tool [20], as shown in Fig. 5. This code can also be transformed into a Büchi automaton. Locations drawn as double-framed circles are accepting locations. The automaton accepts an infinite input if the input makes the automaton visit accepting states infinitely often [5,6].
(6) Verification algorithm - finally, an asynchronous product of all automata representing protocol users is constructed. This automaton is used to construct a synchronous product with the formula automaton [6]. The algorithm searches the resulting automaton for a path that passes infinitely often through the accepting locations of the formula automaton [4,6].
(7) Counterexample - such a path indicates an error in the protocol and shows how an unsafe state can be reached. On the whole, the protocol is flawed if its model can produce a path that is accepted by the automaton representing an undesirable situation.
The human verifier takes part only in the stages that involve modelling the protocol in the Promela language and writing the LTL formulas. The other activities are done automatically by the model checker. In practice, efficient implementations merge the described stages to reduce computational cost.
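The search for an accepted path can be sketched in a few lines of Python. This is a deliberately naive illustration under several assumptions: the toy models, the two-location formula automaton (locations "wait" and "hit") and all function names are hypothetical, and the plain repeated reachability search stands in for the nested depth-first search that Spin actually uses.

```python
def reachable(start, successors):
    """Return the set of states reachable from `start`, including `start` itself."""
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def has_accepting_cycle(initial, successors, accepting):
    """True if some reachable accepting state can be reached again from its successors."""
    for state in reachable(initial, successors):
        if accepting(state) and any(
            state in reachable(nxt, successors) for nxt in successors(state)
        ):
            return True
    return False


def make_product(model, bad):
    """Synchronous product of a toy model with an automaton for 'a bad state is reached'."""
    def claim_moves(location, model_state):
        if location == "hit":
            return ["hit"]                      # stay in the accepting location forever
        return ["wait", "hit"] if model_state in bad else ["wait"]

    def successors(state):
        model_state, location = state
        return [(m, q) for m in model[model_state]
                for q in claim_moves(location, model_state)]
    return successors


def is_accepting(state):
    return state[1] == "hit"


flawed_model = {0: [1], 1: [2], 2: [2]}         # state 2 plays the role of the unsafe state
safe_model = {0: [1], 1: [0]}                   # the unsafe state is never reached

flawed_result = has_accepting_cycle((0, "wait"), make_product(flawed_model, {2}), is_accepting)
safe_result = has_accepting_cycle((0, "wait"), make_product(safe_model, {2}), is_accepting)
print(flawed_result, safe_result)
```

An accepting cycle in the product corresponds to the counterexample of step (7); when no such cycle exists, the negated property is never satisfied and the protocol model is considered safe for that property.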

Our approach
The most intuitive way to model a protocol is to represent users as independent processes sending messages through channels controlled by the intruder. Unfortunately such a model, though it properly describes the protocol, might be too large to analyze. Due to the exponential complexity of the problem [19], every redundancy in the model is expensive in terms of memory and computation time.

Kripke structure
The ability to model a protocol is therefore not sufficient for practical verification. The constructions below were used in the presented model to reduce its complexity while giving the intruder strong abilities.

Remembering Simple Message Elements
Simple elements known by the intruder are remembered as bytes in the EveDB array. Every element has a unique value and can be accessed with a combination of defined indices. The values for the JFKi protocol are shown in the left column of Fig. 6, and an example of access to the elements can be found in the right column. For instance, the index of the responder's nonce nonr is the sum of the indices indicating the user identifier, the nonce type and the current protocol run (otherUser + NONCE + comm). Exponentials, on the other hand, are reused between protocol runs, so the comm variable indicating the run is not used to access them. If an EveDB array cell is not empty, the value is known by the attacker.
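The indexing scheme can be illustrated with a small Python sketch. The concrete constants below are assumptions chosen only so that the sums are unique; the actual values from Fig. 6 are not reproduced here, and the helper names are hypothetical.

```python
RUNS = 2                               # number of parallel protocol runs (assumed)
EXP, NONCE = 0, 1                      # element-type offsets inside a user's block (assumed)
BLOCK = 1 + RUNS                       # one shared exponential plus one nonce per run
IDA, IDB, IDE = 0, BLOCK, 2 * BLOCK    # user identifiers double as block bases (assumed)

eve_db = [0] * (3 * BLOCK)             # 0 means "empty": the value is unknown to the intruder


def nonce_index(user, comm):
    """Index of a user's nonce in run `comm`, i.e. user + NONCE + comm."""
    return user + NONCE + comm


def exp_index(user):
    """Exponentials are shared between runs, so `comm` is not part of the index."""
    return user + EXP


def knows(index):
    return eve_db[index] != 0


# The intruder learns the responder's nonce from run 1 (the value 17 is arbitrary).
eve_db[nonce_index(IDB, 1)] = 17
print(knows(nonce_index(IDB, 1)), knows(exp_index(IDB)))
```

With such a layout a single byte array holds the whole knowledge about simple elements, and checking whether the intruder knows a value is a single array lookup.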

Remembering Complex Message Elements
Complex elements such as signatures and encryptions are stored by the intruder in additional channels, which work like FIFO queues. While a faked message is being generated, the needed elements are chosen randomly from the channels. An example of usage is shown in Fig. 6 (right). Channel EveSig2 holds signatures from the second protocol message that were intercepted earlier.

Eavesdrop On Send, Corrupt on Receive Tactic
In a simple model the message is produced by the legal user, the intruder learns it and then the message is sent. Yet before the receiver gets it, the data is intercepted and generated once more by the attacker on the basis of his knowledge. It therefore makes no sense to transport via the channels data that is already stored in the intruder database: the original message is not used after the intruder learns it. In our approach the channels therefore transport only the information that a message was sent, as shown in Fig. 7. In consequence, the memory used by all channels is constant and small, which makes the tactic crucial for minimizing the model size.
It is also important that in our model the intruder can produce a faked message after an arbitrary time, possibly after receiving other messages from parallel protocol runs and learning new data. This models the ability of the intruder to delay messages. In Fig. 7, a circle is a point where a message is consumed by the attacker, while a square marks the creation of a message by him. As can be seen, message M1' is produced after learning message M2 from the second protocol run. The sending of a message is also the point where the intruder decides where the message will be sent. The effect is achieved by combining the attacker's activities with the users' steps, rather than putting them into a separate process. As the method name suggests, the intruder takes his first action (eavesdropping) just after the legal user sends a message; these instructions are put into the sender process. The attacker's second action (corrupting the message) is put into the receiver's process, just before the legal user gets the message.
The tactic also removes the need for an additional mechanism to prevent the intruder from intercepting his own faked messages. Such a mechanism, found in the literature [10], could be an additional field in a message indicating whether the message was sent by the attacker. With the tactic this is not needed, as the data is generated only once, just before the legal user receives it.
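The tactic described in this section can be sketched in Python as follows. This is a simplified illustration, not the authors' Promela code: the element names, the function names and the exhaustive enumeration (which stands in for Promela's nondeterministic choice) are all assumptions.

```python
import itertools

intruder_knowledge = set()   # data elements the intruder has learned so far
channel = []                 # carries only "a message of this type was sent" tokens


def send(message_type, payload):
    """Sender step with the intruder's first action (eavesdropping) folded in."""
    intruder_knowledge.update(payload)   # eavesdrop on send
    channel.append(message_type)         # no message data is transported


def receive(fields):
    """Receiver step: consume a token, then list every message the intruder can build."""
    message_type = channel.pop(0)
    candidates = sorted(intruder_knowledge)   # corrupt on receive, from current knowledge
    return [(message_type, forged)
            for forged in itertools.product(candidates, repeat=fields)]


send("MSG1", ["Na"])         # run 0: the first message is sent
send("MSG2", ["Nb"])         # run 1: another message is sent before MSG1 is delivered
forged = receive(1)          # the delayed MSG1 may now contain data learned from MSG2
print(forged)
```

Because the message is rebuilt from the intruder's knowledge at the moment of reception, the delayed first message can carry an element learned from the parallel run, and the channel itself never stores more than a short token per message.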

Only One Channel For a Message
Using two channels for every message (the first for transporting data from the legal user to the intruder, the second for transporting data to the legal receiver) is simple and intuitive but expensive in memory. Thus only one channel is used in our model. This is possible because, as mentioned, no message data is actually transported: only the information that a message was sent is placed in the channels.

All Users in One Process
The eavesdrop on send, corrupt on receive technique also makes it possible to place the code of all users in one process. As mentioned, the intruder's actions are combined with those of the legal users. In a simple model senders and receivers could be put in independent processes. Every step of a user consists of receiving and/or sending a message. Such a step would be put into an atomic clause to minimize the interleavings we are not interested in. This would result in a sequence of users' atomic steps from different protocol runs, which is the model's proper behaviour.
Downloaded from the journal Annales AI-Informatica, http://ai.annales.umcs.pl (UMCS).
Yet the presence of many processes would cause the model checker to create an asynchronous product of the automata. This would introduce redundant interleavings and make the verified model grow too much [3]. That is why only one process is used, with a do loop in which one of the executable steps is chosen nondeterministically. Every step is represented by a function, which keeps both the advantage of one process and well-structured Promela code, as shown in Fig. 7. Rather than being stored in the state of the user's process, the user identity becomes the function parameter. For example, the receiver of the message is indicated by the self parameter of the function recvMSG1sendMSG2().
In such a model a sequence of nondeterministically chosen steps is produced just as in the multi-process case but without the undesirable overhead. This approach therefore changes not the model's functionality but only its efficiency.
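The single-process loop can be pictured with the following Python sketch, which explores every sequence of steps that such a loop could choose. It is an illustration under assumptions: the two-step toy protocol, the state names and the step names are hypothetical, and breadth-first enumeration replaces the nondeterministic choice of the Promela do loop.

```python
from collections import deque

USERS = ("A", "B")


def enabled_steps(state):
    """Yield (step name, next state) for every step whose guard holds in `state`."""
    for user in USERS:                      # the user identity is a parameter of the step
        progress = dict(state)
        if progress[user] == "idle":
            progress[user] = "sent_msg1"
            yield f"sendMSG1({user})", tuple(sorted(progress.items()))
        elif progress[user] == "sent_msg1":
            progress[user] = "done"
            yield f"recvMSG2({user})", tuple(sorted(progress.items()))


def explore(initial):
    """Visit all states reachable by repeatedly choosing any enabled step."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for _name, nxt in enabled_steps(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


initial_state = (("A", "idle"), ("B", "idle"))
states = explore(initial_state)
print(len(states))
```

All interleavings of the users' atomic steps remain reachable, yet the whole model has a single control location, so no per-process program counters have to be stored in the state vector.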

Consistent Message Generation by the Intruder
Consistent generation of messages means that, once chosen, an element (e.g. a nonce or an exponential) is used by the intruder in the whole message. This helps avoid messages that are known to be rejected by legal users. An example is shown in Fig. 6 (right): the same value is used as the responder's exponential expr in the plain text and in the faked signature.
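The saving can be illustrated with a short Python sketch that counts the candidate messages with and without consistent generation. The element values and function names are hypothetical, and the two-field message is a simplification of the real protocol messages.

```python
from itertools import product

known_exponentials = ["g^x", "g^e"]   # hypothetical values known to the intruder


def inconsistent_messages(exps):
    """Plain-text and signed exponentials are chosen independently of each other."""
    return [(plain, f"sig({signed})") for plain, signed in product(exps, repeat=2)]


def consistent_messages(exps):
    """One exponential is chosen once and reused in the whole message."""
    return [(e, f"sig({e})") for e in exps]


all_candidates = inconsistent_messages(known_exponentials)
kept_candidates = consistent_messages(known_exponentials)
print(len(all_candidates), len(kept_candidates))
```

The candidates that are dropped are exactly those in which the plain-text value and the signed value differ, which a legal user would reject anyway when verifying the signature.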

Protocol Properties Verification
The last step is specifying protocol properties as LTL (linear temporal logic) formulas. The notation M |= α means that α is satisfied in the model, that is, it is true for every execution path of the automaton. It can be used to specify that unsafe states should never be reachable. For the Needham-Schroeder protocol, an example safety formula α detects identity misbinding. The wrong state is when one of the legitimate users accepts a session with another legal user (IDA or IDB), while that user did not, because he was engaged in a run with the intruder (IDE). So it should hold that M |= ¬α. Fig. 8 presents a readable counterexample for the attack. It was produced automatically by a simple driver written by the authors, which runs the model checker, parses Spin's output and interprets it. Only the emphasis was added by hand for readability. The required information in the raw output of the model checker originates from the printing commands shown in Figs 1 and 6. From the listing it can be seen that Bob accepted a session with Alice, who never took part in a run with Bob. Another issue in writing formulas is the label mechanism. It should be used where possible because it avoids additional global variables to mark a state. Labels in Promela are inserted into the code just as in the C language. An expression of the form (ProcessName@LabelName) used in a formula identifies a point where the process is in the labelled state.
For the JFKi protocol, the following two example formulas are presented.

γ = (JFKiProtocol@INTRUDER_DECRYPTED_MSG3_LABEL
The first formula is used to detect a privacy violation attack, in which Eve decrypts a third message that was not intended for her (the global variable cert stores the certificate of the peer chosen by the initiator, and it is not IDE).
The second formula is true if the responder accepts a wrong security association of the initiator, that is, when the association was inserted by the intruder (SaiE) although it should originate from a legal initiator (whose identity is kept in cert). A situation in which these formulas are true should never occur, so the Büchi automata are built for the formulas ¬γ and ¬ψ. With analogous formulas, the ability of the adversary to change the exponentials, nonces and Diffie-Hellman group information can be checked. During the verification of the JFKi protocol with Spin, none of these attacks was detected.

Application of the Method and Computational Results
The following model instance configuration was used for the verification results below: two parallel runs, two legal protocol users, and an intruder knowledge database containing two elements (Needham-Schroeder) or one element (JFKi) of every type of complex element. In the second case, to give the adversary more abilities, any received complex element is stored in the databases nondeterministically, so the first element received does not necessarily fill the queue. This configuration makes it feasible to verify a protocol on an average computer (AMD Athlon 2.01GHz, 2GB RAM), and the standard attacks can still be expected to be detected.
The costs of verifying the example protocols are shown in Table 2. Our models are indicated in bold. The sources of the model from [10] are available, so, scaled down to two parallel runs, they were included for comparison. The published fragments of the model from [15] also give a hint of its size. At first sight it is visible that the unminimized models have much bigger state vectors. In the case of JFKi, the benefit of the model reductions for verification is very distinct: the original automaton was much too complex and was only partially analyzed, whereas the minimized model could be verified in less than a quarter of an hour. The protocol security properties did hold in the JFKi model.

Conclusions and Future Plans
The proposed modelling framework has proved to be computationally efficient, enabling verification of more complex protocols. Although the approach requires more work than using tools specialized only in the verification of cryptographic protocols, such as Casper [21,2], it gives more control over the model configuration. In addition, the discussed method is far more readable than, for example, CSP [2] or clauses for the ProVerif program [22]. Hence it is more accessible for inspection and less error-prone. The automatic generation of counterexamples is a further asset of the method.
The presented framework is proposed as a method complementary to the existing ones, aimed at the difficult problem of assuring the correctness of security protocols.
As for future improvements, designing a methodology to divide the model would make it possible to verify more parallel runs of a protocol. For example, in each part the initiator would choose a different responder. The analysis of each such model would require less memory and could possibly be done concurrently on separate computers, saving time.
A more complex task would be to create a parser that transforms a protocol specification, written in a protocol description language such as CAPSL [23], into a model. The Promela code could still be edited by the human verifier if needed, but the main work would be done automatically. This offers another opportunity: from the same input many outputs can be generated, including several verification models or a protocol implementation [23,17].