Abstract
Formal proof of security measure effectiveness and computation security is vitally important for trust in critical information systems. It should be realized that formal security verification must be carried out at each infrastructural level (from the hardware level to the application level) in the process of system design. Currently, computation security analysis on the application level remains the major challenge as it requires complex labeling of computing environment elements. Traditionally, to solve this problem, information flow control (IFC) methods are employed. Unlike access control mechanisms widely used in modern operating systems (OSs) and database management systems (DBMSs), IFC has limited application in software design and mostly comes down to trivial taint tracking. This paper describes an approach to full-fledged implementation of IFC in PL/SQL program units with the use of the PLIF platform. In addition, a general scheme of computation security analysis for enterprise applications that work with relational DBMSs is considered. The key advantage of our approach is the explicit separation of functions between software developers and security analysts.
INTRODUCTION
Requirements for computer systems of particular security classes are determined by the corresponding evaluation standards. The first of these standards was the Department of Defense Trusted Computer System Evaluation Criteria [1]. Nowadays, security requirements are formulated in terms of the Common Criteria for Information Technology Security Evaluation [2] and are divided into functional requirements and trust requirements. The analysis shows that there are three important categories of conditions that need to be taken into account when determining the security class of a computer system. In the general case, they can be formulated as follows.
1. Formal proof of the effectiveness of security mechanisms (modules that implement security functions) or computation security (information processing).
2. Covert channel control.
3. Control of legal information propagation environments and paths (access control in the case of operating systems).
In the following discussion, for clarity, we conventionally divide an abstract computer system into three levels: hardware, system, and application. We assume that the hardware level includes workstations (personal computers and servers) and network equipment.
In practice, the complexity of satisfying the conditions mentioned above increases from the hardware level to the application level.
At the hardware level, legal information flows involve the users of a computer system who have access to dedicated devices and interact with computer equipment via standard input-output interfaces. In practice, this interaction is successfully controlled by implementing certain organizational and technical measures: limiting the number of individuals who are allowed access to the rooms where dedicated computing equipment is installed, installing access control systems (ACSs), video surveillance and alarm systems, implementing mechanisms that block entrances and windows, etc. Control of covert acoustic, electromagnetic, and optical channels at the physical level (also called technical channels) is implemented by specialized information protection hardware.
Data transmission in computer networks is carried out in accordance with network protocol specifications. Legal information propagation environments and paths at this level are restricted by filtering network traffic based on IP addresses, port numbers, and (more rarely) the payload of network packets, as well as by network segmentation; at the data link layer, this is done by strictly binding network devices to a local network or to switch ports (MAC filtering and port security). Possible covert storage channels (transfer of data in packet headers and tunneling) are controlled by intrusion detection systems, data leak prevention (DLP) systems, etc. To deal with more complex covert channels associated with manipulation of signal parameters, special hardware and software information protection tools are used.
Filesystem and database objects are standard containers for storing information at the system level. Access control is the main mechanism that ensures safe operation with these objects. Modern operating systems (OSs) include access control mechanisms based on verified models, e.g., the mandatory entity-role model of access and information flow security control in operating systems of the Linux family (MROSL DP-model) [3]. Covert channel control at the system level is basically carried out in the same way. The implementation of an OS security monitor in this case involves the classification of processes into trusted and untrusted ones. Trusted processes implement security functions and do not cooperate with untrusted processes in illegal information flows. The possibility of violating the integrity of programs that initiate trusted processes is excluded. In our opinion, this approach provides only a partial solution to the problem of covert channel control in system software. In fact, covert channels occur when there are any effects in a data transmission (processing) environment that can be exploited by a potential intruder. Not all effects that occur in system software are associated with filesystem objects, databases (DBs), or network sockets. Examples of these effects are the terminal state of a program, time delays in the process of computation, and the intensive use of system resources, system cache, and DB cache (structures that can participate in interprocess communication). The second problem, which was noted in [4], is the trust in non-system services that perform declassification (controlled disclosure of information).
The major challenge is the verification of computation security at the application level. In practice, this problem is not completely solved even in the case of legal information propagation environments and paths. In software, these flows can be extracted from the control flow graph, provided that its integrity is guaranteed. Traditionally, to ensure the confidentiality and integrity of data during their processing, software developers use combinations of formal, semi-formal, and informal methods at different stages of system development. These methods include dynamic and static analysis, symbolic (concolic) execution, fuzz testing, and formal verification. Moreover, additional security measures (dynamic memory offset randomization, stack execution protection, control flow integrity checks, etc.) are implemented at the platform and compiler levels. The checks implemented using the methods mentioned above generally do not take into account the specifics of the data to be processed and the business logic of applications; however, they play a significant part in ensuring the integrity of algorithms. As mentioned above, methods based on information flow control (IFC) do take the specifics of data into account. Full-fledged implementation of IFC in language platforms has not yet been achieved for a number of reasons (see [4]). However, when developing trusted systems, a similar mechanism, taint tracking, is increasingly used (it is sometimes regarded as the simplest form of IFC). A significant limitation of taint tracking is that it can be applied only to explicit information flows.
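For instance, in the PL/SQL setting considered later in this paper, an explicit flow is a direct assignment or query that copies a confidential value into a public container; a minimal hypothetical sketch (the table and column names are illustrative only) that static taint tracking readily detects is as follows:
declare
  v_val number;
begin
  -- tainted source: a confidential column
  select secret_col into v_val from t_confidential where id = 1;
  -- explicit flow: the tainted value is written to a public table
  update t_public set pub_col = v_val where id = 1;
end;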
In turn, implicit information flows [5], which are also legal, can easily be used to bypass this type of analysis:
void copy (tainted char *src,
           untainted char *dst, int len) {
  untainted int i, j;
  for (i = 0; i < len; i++) {
    for (j = 0; j < sizeof(char)*256; j++) {
      if (src[i] == (char)j)
        dst[i] = (char)j; // illegal implicit flow
    }
  }
}
Thus, taking into account the above discussion, Table 1 summarizes the current possibilities of satisfying the conditions critical for designing trusted computer systems. Other requirements, e.g., for data backup [6], are regarded as auxiliary ones in terms of the complexity of implementation in a particular system.
Table 1. Practical implementation of the trust requirements
| | Hardware level | System level | Software level |
|---|---|---|---|
| Control of legal flows and information environments | + | + | |
| Covert channel control | + | – | |
| Formal proof of the effectiveness of information security mechanisms | + | – | |
This work is a continuation and expansion of the investigation presented in [4]: it details the procedure for detecting illegal information flows in PL/SQL program units with the use of the PLIF platform [7] developed with the participation of the author of this paper. Its main idea consists in the formal verification (using the TLC model checker) of TLA+ specifications generated based on program units that implement critical computations. Taking into account the well-known limitation of the basic method (the significant drop in performance with the expansion of the state space [8]), it is assumed that, to adequately abstract the modules to be checked, critical computations will be shifted by the developers to a separate level, in our case, to the level of PL/SQL stored procedures (Fig. 1). The choice of PL/SQL is due to several reasons: closeness to data, relatively small amount of code (usually not more than 60 lines for one procedure), absence of auxiliary computations, built-in mechanism for identifying direct and indirect dependences, etc. Information flows from validated stored procedures and functions to corresponding GUI elements or output streams in the external software layer can be strictly controlled by static taint tracking. However, this requires further investigation. In addition, this paper also describes a general scheme of computation security analysis for enterprise applications that work with relational database management systems (DBMSs). The general information about the PLIF platform is presented in the next section, followed by an example of its use.
Fig. 1. Choosing a suitable level of abstraction. [Image not available.]
PLIF: GENERAL INFORMATION
The implementation of the IFC mechanism, in our opinion, must include the following steps:
– the choice of a language to describe security policies (access levels, access labels, roles, privileges, etc.);
– the definition of security semantics;
– the development of a mechanism to check security conditions, including the proof of its correctness (enforcement);
– implementation;
– the investigation of its effect on code writing and the integration of the analysis into the software development process.
Active research in these directions has been conducted for several decades. The corresponding review was presented in [4]. As for PLIF, these aspects can be described as follows.
Language for security policy description. For this purpose, we chose a subset of the Paralocks language [9]. Its main advantages are the possibility to embed data declassification conditions into policy statements, flexibility, and compatibility with various access control methods (role-based, mandatory, etc.).
In the general case, as shown in [9], a security policy is defined by a formula of predicate logic, which can be represented as a conjunction of Horn clauses c1 ∧ c2 ∧ … ∧ cn, where ci is a policy statement of the form Flow(u) or σ1 ∧ … ∧ σk ⇒ Flow(u) (bound variables occurring in a clause are universally quantified). Here, Flow(u) is a predicate that defines a data flow to u (u is a bound variable or a constant that represents a user), and σ1, …, σk are conditions (locks) that must be satisfied for Flow(u) to take the TRUE value. Predicates (locks) can be parametric or nonparametric.
For further discussion, it is convenient to use the clausal form of policy statements: ¬σ1 ∨ … ∨ ¬σk ∨ Flow(u).
The description of policies in the model (TLA+ specifications) requires the introduction of the following constants: UU is a set of admissible names for bound variables, U is a set of actors (system users), E0 is a set of names for nonparametric locks, and E1 is a set of names for parametric (single-parameter) locks.
For example, let us consider the following policy: a flow to an arbitrary user x is possible if (a) lock guest(x) is open (role guest is set for x) and lock t_expire is open (the specified period of time has expired) or (b) lock manager(x) is open (role manager is set for x). Logically, it can be represented as (guest(x) ∧ t_expire ⇒ Flow(x)) ∧ (manager(x) ⇒ Flow(x)). In the specification obtained using PLIF [7], this expression has the following form:
{⟨x , ⟨[t_expire ↦ {}],
[guest ↦ {x}, reviewer ↦ {NONE},
manager ↦ {NONE}, organizer ↦ {NONE}]⟩⟩,
⟨x , ⟨[t_expire ↦ {NONE}],
[guest ↦ {NONE}, reviewer ↦ {NONE},
manager ↦ {x}, organizer ↦ {NONE}]⟩⟩}
Here, guest, reviewer, manager, and organizer are parametric locks, t_expire is a nonparametric lock, and NONE is a special value indicating that the lock does not need to be opened. A policy is defined as a set of two-element tuples. Each tuple is an individual policy clause; its first element is a bound variable or constant that denotes the user to whom the information flow is allowed, while its second element is a two-element tuple that describes the set of locks to be opened. The elements of this tuple are records; their fields correspond to the names of the locks defined by the model constants, while the field values define sets of actual lock parameters. In TLA+, records, tuples, and arrays can be interpreted as functions. Since TLA+ does not support partially defined functions, special values have to be used for the locks not included in policy clauses. To facilitate the analysis of traces that lead to illegal information flows, PLIF includes a utility with a graphical user interface, plifparser, in which policies are represented in a simplified notation.
The partial order relation on the set of policies is defined as follows: p1 ⊑ p2 if ∀ c2 ∈ p2 ∃ c1 ∈ p1 : c1 ⊑ c2 (1). Logically, p1 ⊑ p2 can be interpreted as p1 ⊨ p2 (2). It was shown in [9] that condition (1) is necessary and sufficient for expression (2) to be true. The comparison of policy clauses in [9] was described algorithmically, in terms of a set of rules; however, the definition of the partial order on this set can be given in a closed formal form.
Suppose that c1 = (Σ1 ⇒ Flow(u1)) and c2 = (Σ2 ⇒ Flow(u2)); then c1 ⊑ c2 if ρ(θ(Σ1)) ⊆ ρ(Σ2), where ρ is the operation of renaming bound variables in the terms (locks) of the corresponding policy clause, while θ is the most general unifier (MGU) of u1 and u2, provided that u1 and u2 are bound variables or u2 is a constant. The MGU determination operation is standard; its detailed description can be found in the literature on the resolution method.
The greatest lower bound of two policies is defined as the union of their clauses: p1 ⊓ p2 = p1 ∪ p2.
The least upper bound is generally defined as follows: p1 ⊔ p2 = {θ(Σ1) ∪ θ(Σ2) ⇒ Flow(θ(u1)) | (Σ1 ⇒ Flow(u1)) ∈ p1, (Σ2 ⇒ Flow(u2)) ∈ p2, θ is the MGU of u1 and u2}. In this case, when searching for the MGU, no constraints are imposed on the parameter of Flow.
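As a simple illustration under the definitions given above (using, for brevity, two of the role locks that appear later in this paper), let p1 = {guest(x) ⇒ Flow(x)} and p2 = {manager(x) ⇒ Flow(x)}. Then p1 ⊓ p2 = {guest(x) ⇒ Flow(x), manager(x) ⇒ Flow(x)}, which allows a flow to x when either lock is open, whereas p1 ⊔ p2 = {guest(x) ∧ manager(x) ⇒ Flow(x)}, which allows a flow to x only when both locks are open.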
The variants, described in [9–11], of using Paralocks to encode policies of computing environment elements built on top of existing access control mechanisms (e.g., the role-based one) allow us to conclude that it is possible to make assumptions that simplify the description of the corresponding entities in TLA+ specifications. Suppose that the set of names of bound variables contains a single element. Suppose also that, if a bound variable occurs in one of the locks of some policy clause, then this variable is an argument of the Flow literal of the same clause; in addition, if a bound variable is an argument of Flow in some policy clause, then all parametric locks of this clause also take this variable as an argument.
Taking into account these assumptions, as well as the fact that each parametric lock takes a single parameter, which can be either a bound variable or a constant, we obtain a simplified definition of the clause comparison operation (the compareClause operator in the Paralocks.tla specification).
Using the simplified notation, Table 2 shows some examples of using the comparison operator on the set of policies, while Table 3 illustrates the calculation of the least upper bound.
Table 2. Examples of the use of the ⊑ operator
FALSE | ||
FALSE | ||
TRUE | ||
TRUE |
Table 3. Example of the use of the LUB operator
Taking into account the above discussion, the set of possible policy clauses (ClausesSet) and the set of policies (PoliciesSet) are defined in the Paralocks.tla specification [7] as follows:
ClausesSet ≜
{c ∈ {⟨u, ⟨e0, e1⟩⟩
: u ∈ (U ∪ UU),
e0 ∈ [E0 → SUBSET {NONE}],
e1 ∈ [E1 → ((SUBSET (U ∪ UU))
\ {{}}) ∪ {{NONE}}]} :
∨ c[1] ∈ UU
∨ ∧ c[1] ∉ UU
∧ (UNION Range(c[2][2]))
∩ UU = {}}
PoliciesSet ≜
{p ∈ SUBSET ClausesSet :
∀ c1 ∈ p :
∃ c2 ∈ p :
∧ c1 ≠ c2
∧ ∨ (compareClause(c1, c2)
∨ compareClause(c2, c1))}
∪ {{}}
The definitions of the compareClause, comparePol, and LUB operators are not considered in this paper. The thus-defined finite set of policies forms a complete lattice. The proof of the corresponding properties was carried out using the TLAPS proof system in /plif_specs/plif_lattice/Paralocks.tla (see [7]).
THEOREM Reflexivity
∀ p ∈ PoliciesSet : comparePol(p, p)
THEOREM Transitivity
∀ p1, p2, p3 ∈ PoliciesSet :
∧ comparePol(p1, p2)
∧ comparePol(p2, p3) ⇒ comparePol(p1, p3)
THEOREM Antisymmetry
∀ p1, p2 ∈ PoliciesSet :
∧ comparePol(p1, p2)
∧ comparePol(p2, p1) ⇒ p1 = p2
THEOREM ParalocksLattice
∀ p1, p2 ∈ PoliciesSet :
∧ comparePol(p1,LUB(p1, p2))
∧ comparePol(p2,LUB(p1, p2))
∧ ∀ y ∈ PoliciesSet :
∧ comparePol(p1, y)
∧ comparePol(p2, y)
⇒ comparePol(LUB(p1, p2), y)
Security semantics. As the basic semantic notion, we use progress-insensitive noninterference (PINI) [12]. This scheme has several specific features as compared to the strict scheme of noninterference (see [4] for more details). Nevertheless, for completeness, we provide its formal definition.
Definition 2.1. A program satisfies the property of progress-insensitive noninterference for given initial and final mappings of the set of variables to the set of security labels if, for any two states of the computing environment that are in low equivalence with respect to some confidentiality level l under the initial mapping, (a) each computation step is accompanied by the generation of the same observed values with respect to l or leads to divergence for both states, and (b) the corresponding final states are in low equivalence with respect to l under the final mapping (Fig. 2).
Fig. 2. PINI scheme. [Image not available.]
Here, the equivalence of the observed series of values (behaviors) is understood with respect to level l (taking declassification into account). The concept of data declassification is considered in more detail below, as well as in [4]. It is assumed that two states of the computing environment are in low equivalence with respect to a specified confidentiality level l if all pairs of same-name elements with labels not exceeding l have the same values. Condition (b) is easy to check for individual PL/SQL expressions and instructions by using the abstract semantics rules specified for them. This condition is important for the formal proof of computation security in the general case under conditions of flow sensitivity and absence of constraints on the number of executable units in one session [4]. An important role in the definition is also played by the initial and final mappings of the set of elements to the set of security labels (policies); hereinafter, we refer to them as abstract states. For instance, in the simplest case where there is only one session in the system and the set of program units is bounded by only one procedure
PROCEDURE proc_1 (hi number) IS
  var_1 number;
  f1 utl_file.file_type;
BEGIN
  select col_1 into var_1 from Tab1;
  f1 :=
    utl_file.fopen('DIR', 'file_name', 'w');
  utl_file.put(f1, var_1);
  update Tab1
     set col_1 = hi
   where id = n;
END;
in which the external object file_name has an immutable policy p_f, the initial policy of global variable col_1 is p_c,1 and the policy of formal parameter hi is p_h, the absence of illegal information flows in a single run of the procedure allows one to make a conclusion about computation security only for the initial abstract state in which global variable col_1 has policy p_c rather than p_h. That is why the secure composition of a program unit with itself is not possible. The proposed mechanism for checking the computation security conditions takes this fact into account (see Section 3).
Security condition check. As described in [4], the condition check mechanism is based on the formal verification of program units with the use of the model checking method. As the language platform for specifications, we use TLA+. Its important advantages are the full-fledged support of temporal logic, which enables more accurate description of system behavior, and the possibility to check properties of transitions, which is important for the proposed model checking algorithm (see Section 3). Transition property CompInv used in this algorithm can be expressed as follows:
VPolUnchanged ≜
LET CompInv_OP1(x, y) ≜
∧ x
∧ comparePol(VPol[y].policy, VPol′[y].policy)
∧ comparePol(VPol′[y].policy, VPol[y].policy)
IN FoldSet(CompInv_OP1, TRUE, DOMAIN VPol)
CompInv ≜ □[VPolUnchanged]_vars
State variable VPol contains the policies of global variables (columns). The TLC tool is used to check the TLA+ specifications by the model checking method.
The abstract semantics of information flows in program units executed in parallel user sessions was described in [4]. Its rules are supplemented with inv expressions, which represent necessary checks at certain stages of computation. For instance, the rule (C-IF) implies that, when executing a conditional statement, the policy of the pc_label element (program counter) is supplemented with the policy p of conditional expression e; then, an equiprobable transition is made either to the instruction of the THEN branch or to the instruction of the ELSE branch, and the abstract state of the computing environment does not change. For convenient manipulation with the policy of pc_label with regard to possible nesting, it is represented as a tuple of individual values. Obviously, the actual value of the policy for each session is calculated as the least upper bound of the expressions included in the corresponding tuple. An additional constraint (which is checked manually) is necessary to exclude illegal probabilistic information flows. In turn, the rule (C-END-IF) indicates only the weakening of the pc_label policy upon exiting the conditional statement.
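As an intuition for these rules, consider a minimal hypothetical PL/SQL sketch (the names below are illustrative and do not belong to the example analyzed in Section 4): an assignment performed under a condition that depends on a confidential value creates an implicit flow through the program counter, which is exactly what the pc_label policy tracks.
declare
  v_secret number;       -- assume this variable acquires a restrictive policy
  v_public number := 0;  -- assume this variable has the minimum policy
begin
  select secret_col into v_secret from t_confidential where id = 1;
  if v_secret > 0 then   -- (C-IF): pc_label is raised by the policy of the condition
    v_public := 1;       -- assignment under a raised pc_label: implicit flow to v_public
  end if;                -- (C-END-IF): the pc_label policy is weakened again
end;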
A general description of the transition system that models parallel computing in DB sessions can also be found in [4]. The implemented analysis, as noted above, is thread-sensitive.
An important step of the check is the automatic translation of PL/SQL code into specifications. This problem is solved using the developed utility plsql2tla, which is based on the approach proposed in [13].
When a security invariant or transition property is violated, the suspicious trace and the information flow graph can be investigated using the plifparser tool, which was developed based on [14]. Section 4 shows an example of its use.
GENERAL PROCEDURE OF INVESTIGATION
In the general case, the procedure of information flow analysis in an industrial information system built on the basis of a DBMS is shown in Fig. 3.
Fig. 3. General procedure of investigation. [Image not available.]
It is recommended to design a DB in such a way that confidential data are compacted in a limited number of tables. This recommendation does not contradict the generally accepted principles of secure development; at the same time, it allows one to more effectively isolate critical computations from public PL/SQL code.
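As a hypothetical illustration of this recommendation (the table and column names below are illustrative and are not those of the example in Section 4), confidential attributes can be kept in a dedicated table referencing the public one, so that only a small set of stored procedures ever accesses it:
create table papers_public (
  paper_id number primary key,
  title    varchar2(200)
);
create table papers_restricted (
  paper_id number primary key references papers_public(paper_id),
  authors  varchar2(200),  -- confidential attribute
  status   number          -- confidential attribute
);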
The next stage involves the analysis of dependences and the selection of a relevant set of program units. At this stage, the PL/SQL units related to the manipulation of confidential data are determined. An important feature of modern industrial DBMSs is the built-in mechanism for identifying direct and indirect dependences.
The key and most complex stages of the procedure are the generation and labeling of specifications, model checking, and handling security invariant violations. These stages are considered below using a particular example, with the focus being placed on the model checking algorithm (Fig. 4). The preliminary step of the algorithm (initialization) involves the initialization of the constants that determine the number of sessions, the set of particular users, the set of names for bound variables, the sets of parametric and nonparametric locks, etc. In the process of model checking, two properties are checked: the main security invariant ParalocksInv (which guarantees the fulfillment of condition (a) from Definition 2.1 for individual instructions and expressions) and the transition property CompInv (which guarantees the equivalence between the initial and final mappings of the set of global variables to the set of policies). When ParalocksInv is violated, the data are declassified; in some cases, this requires correction of the code or a change in the privileges that control access to DB objects (which is discussed below). When the CompInv property is violated, the policy of the corresponding global variable (column) is raised for the initial state of the model.2 Reducing policies of global variables to stationary values allows us to check the fulfillment of condition (a) from Definition 2.1 for individual instructions and PL/SQL expressions for all possible initial and final abstract states, which, in turn, is required to prove the correctness of the model checking algorithm.
Fig. 4. Model checking algorithm. [Image not available.]
It was proved in [4] that the successful termination of the algorithm guarantees computation security in terms of PINI for an unlimited number of program units executed in each user session.
The final stage of the procedure is the analysis of the propagation of output values of critical procedures and functions in application software.
PLIF: AN EXAMPLE
To test PLIF, we use an example from [4]. Suppose that we have a conference management system. Users of the system submit their papers, which are reviewed and distributed over sections. Oracle DBMS implements a role-based access control mechanism. The hierarchy of roles is shown in Fig. 5.
Fig. 5. Role lattice. [Image not available.]
The main business functions in the system are implemented using PL/SQL program units. The general security requirements are formulated, e.g., as follows: any registered user can submit a paper to a conference. Abstracts are analyzed by reviewers, who, however, cannot relate papers to their authors. The program of a conference is developed by the manager, who can read the status (approved or not) of any paper, whereas the other users can access the statuses of papers only after a specified period of time. The conference program can be viewed by all users. The authors of the papers must remain hidden from all users until a certain period of time has expired.
Suppose that direct access to tables in the system is prohibited, and the interaction is possible only through stored procedures and functions. Suppose also that, to support the described security requirements at the DB level, the following object privileges are set:
grant execute on p_add_paper to public;
grant execute on p_submit_paper to public;
grant execute on p_change_status to reviewer;
grant execute on p_allocate to manager;
grant execute on f_get_section_program to public;
grant execute on f_get_paper to public;
grant execute on f_is_accepted to public.
Upon completion of the DB design stage, in accordance with the procedure described above (see Fig. 3), it is required to determine the tables used to store confidential data, as well as to identify the DB objects that depend on them. For this purpose, Oracle provides a special mechanism. For instance, to determine direct and indirect dependences associated with the Submissions table (which contains the statuses of the submitted papers), we can use the following commands:
execute deptree_fill('TABLE', 'CONFERENCE',
'SUBMISSIONS');
select * from deptree;
Suppose that, at the second stage, a relevant set of DB objects is determined. It includes the following relations: Papers, Submissions (accepted papers), Conferences, Sections, Allocations (distribution of papers over sections), and Logs (error log) (see Fig. 6), as well as the following program units: p_add_paper (adds a new paper to the database), p_submit_paper (approves a paper for participation in a conference, i.e., adds a row to the Submissions table), p_change_status (changes the value of the status field for a particular row in the Submissions table upon reviewing the corresponding paper), p_allocate (adds the information about a paper to the Allocations table after a series of checks), f_get_section_program (returns the program of a selected section; to generate the report, it accesses tables Papers and Allocations), f_get_paper (returns the information about a selected paper), and f_is_accepted (checks the status of a paper and returns TRUE if the paper is accepted). Specifications are generated using the plsql2tla utility; this stage is not detailed in this paper.
Fig. 6. DB structure. [Image not available.]
Next, it is required to perform specification labeling based on the rules of the global security policy. Table 4 shows a possible result of this stage. Let us consider it in detail.
Table 4. Specification labeling
Program unit | Input values | Formal parameters | Local variables | Output values |
|---|---|---|---|---|
create or replace procedure p_submit_paper (s_id number, p_id number, c_id number, sub_date date, stat number) is begin … end p_submit_paper; | i1:= | s_id := | ||
i2:= | p_id := | |||
i3:= | c_id := | |||
i4:= | sub_date := | |||
i5:= | stat := | |||
create or replace procedure p_add_paper (p_id number, tit varchar2, abstr varchar2, t clob, auth varchar2) is begin … end p_add_paper; | i1:= | p_id:= | ||
i2:= | tit := | |||
i3:= | abstr := | |||
i4:= | t := | |||
i5:= | auth := | |||
create or replace procedure p_change_status (s_id number, stat number) is begin …; end p_change_status; | i1:= | s_id := | ||
i2:= | stat := | |||
create or replace procedure p_allocate (id number, s_id number, sec_id number, alloc_date number) is p_not_accepted exception; v_p_id number; v_is_acc boolean; begin … end p_allocate; | i1:= | id := | p_not_accepted:= |
i2:= | s_id := | v_p_id := | ||
i3:= | sec_id := | v_is_acc := | ||
i4:= | alloc_date:= | |||
create type paper_type …; create or replace function f_get_paper (p_id number) return paper_type as v_paper paper_type begin … return v_paper; end f_get_paper; | i1:= | p_id:= | v_paper_c1:= | r_rec_c1:= |
v_paper_c2:= | r_rec_c2:= | |||
v_paper_c3:= | r_rec_c3:= | |||
v_paper_c4:= | r_rec_c4:= | |||
v_paper_c5:= | r_rec_c5:= | |||
create or replace function f_is_accepted (s_id number) return boolean is v_status number; begin select status into v_status from submissions where submission_id = s_id; if v_status = 1 then return true; else return false; end if; end f_is_accepted; | i1:= | s_id := | v_status:= | r:= |
create type paper_type as object(…); create type program_arr_type as varray(1000) of paper_type; create or replace function f_get_section_program (s_id number) return program_arr_type as (v_program) program_arr_type; begin … end; | i1:= | s_id:= | v_prog_ar_e1_c1:= | r_ar_e1_c1:= |
v_prog_ar_e1_c2:= | r_ar_e1_c2:= | |||
v_prog_ar_e1_c3:= | r_ar_e1_c3:= | |||
v_prog_ar_e1_c4:= | r_ar_e1_c4:= | |||
v_prog_ar_e1_c5:= | r_ar_e1_c5:= | |||
v_prog_ar_e2_c1:= | r_ar_e2_c1:= | |||
v_prog_ar_e2_c2:= | r_ar_e2_c2:= | |||
v_prog_ar_e2_c3:= | r_ar_e2_c3:= | |||
v_prog_ar_e2_c4:= | r_ar_e2_c4:= | |||
v_prog_ar_e2_c5:= | r_ar_e2_c5:= |
First, based on the requirements of the global policy and the hierarchy of DB roles (Fig. 5), the sets of parametric and nonparametric locks are defined. Suppose that E0 = {“t_expire”} and E1 = {“guest”, “reviewer”, “manager”, “organizer”}. The DB administrator makes the p_submit_paper procedure public, which is why the any_caller policy is valid for all its formal parameters:
any_caller(a) ≜
{⟨a, ⟨[e1 ∈ E0 ↦ {NONE}],
[e2 ∈ E1 ↦ {NONE}]⟩⟩}
Here, a is an element of set U (a particular user). The description of the general security policy also does not contain any constraints on the data entered by the user when calling the procedure, which is why all input values have the minimum policy min:
min ≜ {⟨CHOOSE x ∈ UU :
TRUE, ⟨[e1 ∈ E0 ↦ {NONE}],
[e2 ∈ E1 ↦ {NONE}]⟩⟩}
The labeling of the p_add_paper program unit is quite similar. An exception is the input value of the auth parameter, which is assigned a policy that requires the t_expire lock to be opened, because the general requirements state that the information about the authors of papers must remain hidden until the specified period of time has expired. Based on the privileges set for calling the p_change_status procedure, the policy of its formal parameters allows flows only to users with the reviewer role. The actual value of the paper status, which is passed when calling this procedure, must be readable to the manager and to all other users only after a certain period of time; thus, its policy coincides with the example policy considered in Section 2 (a flow is allowed to the manager or, after the t_expire lock is opened, to any user). In accordance with the requirements described above, the input values of the p_allocate procedure are not confidential, which is why they have the minimum policy. The policy of its formal parameters is dictated by the object privileges set at the system level. It should be noted that this procedure also includes local elements. Local variables are always assigned the minimum initial policy; in this case, p_not_accepted also has the minimum policy. Public functions f_get_paper and f_get_section_program are of interest for describing composite and reference local variables and return values. For instance, variable v_paper and the return value of function f_get_paper have the object type consisting of five fields. Each field of an object type or record in PLIF is modeled by an individual element. Variable v_program and the return value of f_get_section_program are collections of objects. In the current version of PLIF, any collection associated with a set of DB tuples is modeled by an array of two elements. The loss of accuracy of the automated analysis, as expected, can be compensated for by the convenience of the graphical interpreter of computational traces that lead to violation of a security invariant (see below).
The key stage of the procedure is, of course, model checking and handling of security invariant violations. In this example, the algorithm requires eight iterations to terminate. The information flow graphs shown below are generated using the plifparser utility. Model checking is carried out in TLC with the following values of the main initialization parameters (constants):
U ≜ {allen, bob, alex, john}
UU ≜ {x}
E0 ≜ {“t_expire”}
E1 ≜ {“guest”, “reviewer”, “manager”, “organizer”}
GPol ≜ (“organizer” :>
{“manager”, “reviewer”, “guest”})@@
(“manager” :> {“guest”})@@
(“reviewer” :> {“guest”})@@
(“guest” :> {“guest”})
Session_number ≜ 1
Here, the GPol function defines the mapping of the set of roles to the set of subsets of roles; it corresponds to the hierarchy in Fig. 5 and allows one to automatically open dependent locks when checking the model. Session_number is the number of simulated parallel sessions.
STEP 1. ParalocksInv violation.
At the first attempt to check the model, a security invariant violation takes place when executing the p_change_status_load operator (which loads the parameters and local variables of the p_change_status procedure3 to the session) (see Fig. 7). Computations at this step are carried out in accordance with the rule (C-EXT-PRC) of the abstract semantics described in [4].
Fig. 7. ParalocksInv is violated (p_change_status_load). [Image not available.]
The violation is caused by the disagreement between the policy of input value i2 (the paper status), which allows a flow only to the manager or, after the time period expires, to any user, and the policy of the formal stat parameter, which is determined by the execute privilege granted to the reviewer role. To eliminate this problem, declassification is used. The literature describes several principles and methods of data declassification in software [15]. In our case, it is assumed that the papers are reviewed anonymously and the reviewer has access (at the application level) only to the text of a paper (without knowing the author's name and submission number); here, the determining factor is the location of the expression that causes the problem. Hence, the detected violation can be handled by applying declassification with a place constraint (WHERE). To correct the model, it is sufficient to set the Ignore flag in the corresponding operator; there is no need to correct the code of the procedure itself.
p_change_status_load(id) ≜ … ∧ Ignore′ = 1
ParalocksInv ≜ IF Ignore ≠ 1
THEN … ELSE TRUE
Other declassification methods supported in PLIF are used to correct the model on subsequent iterations.
STEP 2. ParalocksInv violation.
Upon rerunning the model checking utility, the security invariant is violated in the transition step described by operator f_get_paper_8 (Fig. 8). The analysis of the information flow graph makes it easy to identify the flow that causes the problem: 1 – the policy of input value i5 (authors) is passed to the auth parameter of the p_add_paper procedure (executed when loading the procedure in the bob session); 2 – the updated policy of the auth parameter is passed to global variable (column) col_authors; 3 – the f_get_paper function is loaded in the alex session; 4 – the policy of the col_authors column is passed to internal variable v_paper_c5; and 5 – the policy of v_paper_c5 is passed to return value r_rec_c5, which has a stationary policy. Thus, the condition specified by the rule (C-EXT-RET) of the abstract semantics is violated.
Fig. 8. ParalocksInv is violated (f_get_paper_8). [Image not available.]
For an information flow to become legal at a given point, the t_expire lock can be opened. In fact, this means correcting the code of the procedure by adding a condition check:
create or replace function f_get_paper
…
is
begin
  if is_time_expire()
  then … else return null;
  end if;
end f_get_paper;
Formally, in this case, the WHEN declassification is used. The correction of the model requires the use of the openLock operator, which allows one to add a new element to the set of open locks for a session:
f_get_paper_load(id) ≜
…
∧ openLock(id, {[t_expire ↦ ALL]})
…
STEP 3. ParalocksInv violation.
The third check of the model also results in a security invariant violation; this time, the error occurs in the f_is_accepted_9 operator. Since this function is public, its return value has the any_caller policy. A variant of the trace leading to the error is shown in Fig. 9.
Fig. 9. ParalocksInv is violated (f_is_accepted_9). [Image not available.]
1, 2 – the policy of the input value (paper status) is passed to global variable col_status in the bob session; 3, 4 – the policy of global variable col_status is passed to the internal variable v_status of the f_is_accepted function in the alex session; 5 – as a result of checking the if condition, the policy of the v_status variable is passed to the pc_label policy of the alex session; 6 – the pc_label policy is passed to the return value of the f_is_accepted function, which has a stationary policy. Thus, at the final step of the trace, the condition specified by the rule (C-EXT-RET) of the abstract semantics is violated.
In this case, most likely, the DB administrator made a mistake when setting the access control rules. Function f_is_accepted must not be explicitly called by system users, or access to it must be available only to the manager. The object privileges can be corrected as follows:
revoke execute on f_is_accepted from public;
grant execute on f_is_accepted to manager;
To correct the specification, it is required to modify the formula of the initial state (Init) in such a way that, when function f_is_accepted is executed in user session s, locks manager(s) and guest(s) are added to the set of open locks:
Init ≜ … ∧ SLocks =
[s ∈ S ↦
[e1 ∈ E0 ↦ {}]
@@ [e2 ∈ E1 ↦
CASE …
\* if f_is_accepted is called,
\* manager(s) and guest(s) are added
\* to the initial set of open locks
□ ∧ Sessions[s][“StateRegs”][1][“pc”][1]
= “f_is_accepted”
∧ ∨ e2 = “manager”
∨ e2 = “guest” → {s}
□ …
In this case, the policy for the return value of the function in ParametersFS.tla must also be modified:
f_ia_r(x) ≜
[loc ↦ “mem”,
offs ↦ 2,
policy ↦ \* any_caller(x) is replaced with
{⟨x, ⟨[t_expire ↦ {NONE}],
[guest ↦ {NONE},
reviewer ↦ {NONE},
manager ↦ {x},
organizer ↦ {NONE}]⟩⟩},
name ↦ “f_ia_r”]
STEP 4. ParalocksInv violation.
The fourth model check also leads to the violation of the ParalocksInv invariant. When executing operator f_get_section_program_9 (Fig. 10), the condition specified by the rule (C-EXT-RET) is violated.
Fig. 10. ParalocksInv is violated (f_get_section_program_9). [Image not available.]
Let us briefly describe the confidential data flow that causes the problem. 1, 2 – the policy of the input value (paper status) is passed to global variable col_status in the allen session. 3, 4, 5 – in the bob session, the p_allocate procedure is loaded; in it, function f_is_accepted is called, and the policy of global variable col_status is passed to local variable v_status. 6 – the policy of v_status is added to the pc_label policy of the bob session. 7, 8, 9 – the policy from element pc_label is first passed to the output value of function f_is_accepted and then to the local variable v_is_acc of procedure p_allocate; when exiting the if statement, this policy is removed from the general policy of pc_label. 10 – the policy from variable v_is_acc is again passed to pc_label. 11 – the policy of the pc_label element from the bob session is passed to global variable col_submission_id. 13, 14 – in the alex session, function f_get_section_program is called, and the policy from global variable col_submission_id is passed to its local variable v_program (which is an array of objects that consist of five fields). 15 – the policy of one of the elements of the v_program array is passed to one of the elements of the output array of function f_get_section_program, which has a stationary policy in accordance with the current global access control policy. In this case, the analyst can easily notice that there are titles of papers among the values returned by the function. In other words, regardless of the fulfillment of the time expiration condition specified in the security requirements, any user can receive the list of papers with the “accepted” status. Obviously, the detected illegal information flow is not a false positive. To eliminate the problem, it is reasonable to apply the same strategy (WHEN declassification) as at STEP 2, i.e., to supplement the code of the function with the check
create or replace function f_get_section_program
…
is
begin
  if is_time_expire() or is_manager()
  then … else return null;
  end if;
end f_get_section_program;
and correct the model by adding the necessary elements to the set of open locks for the session in which f_get_section_program is called:
f_get_section_program_load(id) ≜
…
∧ ∨ openLock(id, {[t_expire ↦ ALL]})
∨ openLock(id, {[manager ↦ id]})
…
The violation of the invariant does not always indicate the presence of a dangerous information flow. The proposed analysis, as noted above, is not absolutely accurate. However, it is assumed that any suspicious flow can easily be investigated using the plifparser graphical utility. For instance, if, in the case shown in Fig. 10, the SELECT query returned (instead of the complete information about a paper) only paper_id, which, from the analyst’s perspective, cannot be associated with any particular paper, or it returned the total number of papers in a given section, then the returned value could be declassified without additional checks (WHAT declassification):
select(id, …,
\* Replace the value calculated using LUB
LUB4Seq
(⟨VPol[“col_papers_paper_id”].policy,
VPol[“col_allocations_submission_id”].policy,
VPol[“col_allocations_section_id”].policy,
VPol[“col_submissions_paper_id”].policy,
VPol[“col_submissions_submission_id”].policy,
load(id, f_gsp_p_s_id(id))⟩)
\* with MIN policy (WHAT declassification)
min, ...)
STEP 5. CompInv violation.
Upon the correction of the model at STEP 4, there are no more violations of the main security invariant. However, enabling the CompInv property check causes an error when executing operator f_is_accepted_9 (see Fig. 11). The error correction strategy is simple (see Section 3): update the initial policy of the corresponding global variable:
Fig. 11. Action property CompInv is violated. [Image not available.]
Fig. 12. Action property CompInv is violated. [Image not available.]
Init ≜ …
∧ VPol =
[… col_papers_authors ↦
[ext ↦ 0,
policy ↦ \* min is replaced with
{⟨u1, ⟨[t_expire ↦ {}],
[guest ↦ {NONE},
reviewer ↦ {NONE},
manager ↦ {NONE},
organizer ↦ {NONE}]⟩⟩},
name ↦ “col_papers_authors”], …]
There is no need to correct the PL/SQL code or the global access control policy [4].
STEPS 6–8. CompInv violation.
The subsequent three model checks lead to the violation of the CompInv property in operators p_change_status_4, p_allocate_13, and p_allocate_14 (Fig. 12). The correction of the model is carried out as at STEP 5.
Once these errors are fixed, the model check is successfully completed at the eighth iteration.
Control of output value propagation for critical procedures and functions in application software is carried out using standard taint tracking, e.g., in the CodeQL environment [16]. In this paper, we do not consider this stage.
CONCLUSIONS
In this paper, we have detailed the procedure for detecting illegal information flows in PL/SQL program units. The illustrative material prepared using the plifparser utility, though possibly redundant, is, in our opinion, essential for demonstrating the implementability of the approach proposed in [4]. Despite a certain loss of analysis accuracy due to the abstraction inevitable in formal verification, the possibility of graphically interpreting suspicious computational traces significantly facilitates the detection of false positives.
Along with other studies devoted to the logical semantics of Paralocks policies [17], this paper has revealed some specific aspects of language implementation in TLA+ specifications. In this connection, the presence of formal proofs for the properties of an algebraic lattice built on a particular set of security policies is essential.
In the future, we intend to conduct a deeper investigation on the last stage of the proposed information flow control procedure (the analysis of output value propagation for critical procedures and functions in application software).
In addition, the works devoted to IFC widely cover the problem of data declassification; however, they pay practically no attention to the inverse procedure, data classification. The confidentiality level of the data used by a program can not only decrease but also increase in the course of their processing.
CONFLICT OF INTEREST
The author declares that he has no conflicts of interest.
1By the global variables of the computing environment, we mean attributes of relations.
2The fulfillment of the CompInv property can be achieved in a finite number of iterations, because the policy alphabet is a finite lattice and the transition functions for calculating policies of global variables are monotonically increasing.
3Specifications generated by PLIF describe the state of each user session, including the domain of local variables and parameters (stack), pointers to the current stack frame and to the instruction currently executed in the session, and the domain of return values.
Translated by Yu. Kornienko
REFERENCES
1. Latham, D.C., Department of Defense Trusted Computer System Evaluation Criteria, 1986.
2. Common Criteria for Information Technology Security Evaluation, 2002.
3. Devyanin, P.N. and Leonova, M.A., Use of subtypes and total functions of the formal Event-B method for description and verification of the MROSL DP-model, Program. Inzheneriya, 2020, vol. 11, pp. 230–241. https://doi.org/10.17587/prin.11.230-241
4. Timakov, A.A., Information flow control in software DB units based on formal verification, Program. Comput. Software, 2022, vol. 48, pp. 265–285. https://doi.org/10.1134/S0361768822040053
5. Denning, D.E., A lattice model of secure information flow, Commun. ACM, 1976, vol. 19, no. 5, pp. 236–243.
6. Shaitura, S.V. and Pitkevich, P.N., Data backup methods for critical enterprise information systems, Ross. Tekhnol. Zh., 2022, vol. 10, pp. 28–34.
7. Timakov, A., PLIF, 2021. https://github.com/timimin/plif
8. Konnov, I., Kukovec, J., and Tran, T.-H., TLA+ model checking made symbolic, Proc. ACM Program. Lang., 2019, vol. 3, pp. 1–30.
9. Broberg, N. and Sands, D., Paralocks: Role-based information flow control and beyond, Conf. Rec. Annu. ACM Symp. Principles of Programming Languages, 2010, pp. 431–444.
10. Broberg, N. and Sands, D., Flow locks: Towards a core calculus for dynamic flow policies, Lect. Notes Comput. Sci., 2006, pp. 180–196.
11. Broberg, N., Practical, flexible programming with information flow control, Thesis for the Degree of Doctor of Engineering, 2011.
12. Hedin, D. and Sabelfeld, A., A perspective on information-flow control, 2012.
13. Methni, A., Lemerre, M., Hedia, B.B., Barkaoui, K., and Haddad, S., An approach for verifying concurrent C programs, Proc. 8th Junior Researcher Workshop on Real-Time Computing, 2014, pp. 33–36.
14. Fernandes, A., tlaplus-graph-explorer, 2021. https://github.com/afonsonf/tlaplus-graph-explorer
15. Kozyri, E., Expressing information flow properties, Found. Trends Privacy Secur., 2022, vol. 3, pp. 1–102. https://doi.org/10.1561/3300000008
16. Kristensen, E., CodeQL, 2022. https://github.com/github/codeql
17. van Delft, B., Broberg, N., and Sands, D., A Datalog semantics for Paralocks, Lect. Notes Comput. Sci., 2013, pp. 305–320.
18. Harrison, M.A., Ruzzo, W.L., and Ullman, J.D., Protection in operating systems, Commun. ACM, 1976, vol. 19, pp. 461–471. https://doi.org/10.1145/360303.360333
© Pleiades Publishing, Ltd. 2023. ISSN 0361-7688, Programming and Computer Software, 2023, Vol. 49, No. 4, pp. 215–231. © Pleiades Publishing, Ltd., 2023. Russian Text © The Author(s), 2023, published in Programmirovanie, 2023, Vol. 49, No. 4.