Full Text

Turn on search term navigation

(ProQuest: ... denotes non-US-ASCII text omitted.)

Julie Yu-Chih Liu 1,2

Academic Editor:Hui-Shen Shen

1, Department of Information Management, Yuan Ze University, Chung Li, Taoyuan 320, Taiwan
2, Innovation Center for Big Data & Digital Convergence, Yuan Ze University, Chung Li, Taoyuan 320, Taiwan

Received 16 January 2014; Revised 5 August 2014; Accepted 7 August 2014; 21 August 2014

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Database normalization plays a crucial role in the design theory of relational database to avoid insertion and deletion and update anomalies in a database. The database normalization involves decomposition of a relation schema (table) into several smaller ones. The essential requirement of the decomposition is lossless join property, which ensures that the original relation can be obtained from its decomposed results via combination operations [1]. Several methods have been proposed to design normalized relation schemes based on the keys and functional dependencies of a relation to achieve lossless join decomposition [2, 3]. The design theory has been applied to fuzzy databases, in which uncertain and imprecise information can be represented and manipulated. The fuzzy databases are extended from the classical databases based on fuzzy sets and possibility theory [4], and they can be resemblance-based fuzzy model [5, 6] and possibility-based fuzzy model [7, 8]. In the context of fuzzy databases, fuzzy functional dependency (FFD) has emerged to extend the classical functional dependency to represent functional relationships between classes/attributes of objects for fuzzy database models. Various FFD definitions have been proposed in some fuzzy data models for database normalization [9, 10].

However, very few research methods discuss lossless join property for the normalization in possibility-based fuzzy databases. To achieve lossless join decomposition by using FFDs for the possibility-based fuzzy databases is more difficult than for the resemblance-based fuzzy databases, especially for extended possibility-based fuzzy database. The extended possibility-based fuzzy database [7] is an extension of possibility-based fuzzy database [8] by including a resemblance-based fuzzy model [6]. In the fuzzy database, attribute values could be the possibility distributions of the attribute on its domain. Additionally, the elements in a domain have some degree of resemblance. Previous work has applied FFDs on the decomposition for the fuzzy database [10-12]. Informally, these FFDs are based on a certain degree of similarity between two attribute values. Namely, two tuples that are similar but not identical might be regarded as redundant. Applying the similarity-based FFDs on relation decomposition prompts the difficulty for lossless join decomposition on two facets: (i) redundancy removal: how to eliminate redundant tuples that are not identical from the decomposed results so that the results can be, later on, used to produce the original relation without losing information and (ii) tuple merging: how to combine two relations via merging their tuples of which attribute values are similar but not identical.

Complicating this problem further, most similarity measures [7, 13, 14] of values in the form of possibility distribution are not transitive . When tuple redundancy is determined by the similarity measures of nontransitivity, the result of eliminating redundant tuples from a decomposed relation might be not unique (or order sensitive ¹ ). An inconsistent data redundancy removal not only leads to unstable results of data integration, as described in [15], but also causes decomposition results not lossless. When the decomposition result of a relation is not unique, the combination of the result will have many different outcomes, at least one of which is different from the original relation. Accordingly, the decomposition inevitably violates the lossless join property. Moreover, the nonunique results occur for relation combination when the attribute values to be joined/merged have similarity relation of nontransitivity.

To avoid nontransitivity, Chen et al. provided FFD with embedded classical FD [11, 16], where redundancy removal is restricted to duplicate tuples. But, this restriction draws the normalization process back to the traditional operations of crisp data. To obtain transitive relationship among tuples, some research applied the max-min transitive closure on the relationship matrix of similarity degree between tuples [17]. The max-min transitive closure of a relationship matrix must be a matrix with max-min transitivity [18]. By referring to the transitive closure of the relationship matrix, the tuples which have similarity higher than a given threshold can be grouped into disjointed sets. The tuples in the same set were regarded as redundant. However, this approach cannot determine the similarity of two tuples by merely examining these two, and the similarity is changed by inserting or deleting other tuples. The nondeterministic and dynamic characteristic is not applicable to the practice of databases.

To our knowledge, very few studies provide a complete guideline to perform normalization that ensures lossless join decomposition in the fuzzy databases. Therefore, the purpose of this study is to fill up this gap. This study first proposes a notion of approximate equality which represents the transitive equivalent relation among tuples. Then, it provides new definition of FFD and lossless join decomposition based on approximate equality for the fuzzy databases. Both functional dependencies and lossless join decomposition in a traditional database are special cases in this proposal. Examples show that the notion is more applicable than other similarity concepts to the research related to the fuzzy databases. This work also provides the method of achieving the lossless join decomposition for the fuzzy databases.

The remainder of this paper is organized as follows. Section 2 gives a brief introduction to database normalization and fuzzy database and the survey of the similarity measures related to the fuzzy database. Section 3 demonstrates the problem of using nontransitive similarity measures for determining tuple redundancy and provides a notion of approximate equality for it. The FFD is then defined based on the approximate equality in Section 4. Besides, the lossless join decomposition is proposed for the fuzzy databases, and its property is proven as well. Section 5 draws the conclusion of this paper.

2. Preliminaries

This section first briefly reviews the essential operations for lossless join decomposition in traditional databases. Then, it introduces the fuzzy databases considered in this work and the similarity measures of values in form of possibility distribution.

2.1. Essential Operations for Lossless Join Decomposition

In traditional relational database, a row is called a tuple; a column header is called an attribute; and the table is called a relation. Given an m -ary relation schema R ( _{A 1} , _{A 2} , ... , _{A m} ) , an instance of R denoted by r ( R ) is the set of all tuples in R . Let Α denote a set of attributes _{A 1} , _{A 2} , ... , _{A m} . A functional dependency FD _{A i} [arrow right] _{A j} existing in R ( A ) represents the tuples having the same values on attribute _{A i} that must be identical on _{A j} , where _{A i} , _{A j} ∈ A . Two operations are related to the lossless join decomposition: projection and natural join . The operation projection generates a result by selecting certain attributes from given relation and removing redundant tuples. Let Θ denote a set of attributes in R ( A ) ; that is, Θ ⊂ A . The result of projection R over attributes Θ ⊆ A is _{Π Θ} ( R ) = { t [ Θ ] |" t ∈ r ( R ) } , where t [ Θ ] represents the composite of values on Θ in tuple t . The natural join (denoted by * ) of ^{R [variant prime]} ( X Y ) and ^{R [variant prime][variant prime]} ( Y Z ) is obtained by removing duplicate attribute from the results of equal join on joined attribute Y and is denoted as shown below: [figure omitted; refer to PDF] The natural join and projection operations are, respectively, used to combine and decompose relations.

Formally, decomposition { _{R 1} , _{R 2} , ... , _{R k} } of R is lossless join if equation r ( R ) = _Π_{R 1} ( R ) * _Π_{R 2} ( R ) * ... * _Π_{R k} ( R ) holds. In other words, the lossless join decomposition ensures that the combination of the decomposed results of a relation has no spurious tuple or missing tuple to the relation via natural join operation [1].

For example, given a relation R , the results of ^{R [variant prime]} = _Π_{A 1}_,_{A 2} ( R ) and ^{R [variant prime] [variant prime]} = _Π_{A 2}_,_{A 3} ( R ) are shown in Table 1.

Table 1

	R	^{R [variant prime]}	^{R [variant prime] [variant prime]}
	_{A _ 1 _}	_{A 2}	_{A 3}	_{A _ 1 _}	_{A 2}	_{A 2}	_{A 3}
_{t 1} :	x	p	m	x	p	p	m
_{t 2} :	y	p	m	y	p	q	n
_{t 3} :	z	q	n	z	q

In this case, the decomposition { ^{R [variant prime]} , ^{R [variant prime] [variant prime]} } of R has lossless join property because the natural join result of ^{R [variant prime]} and ^{R [variant prime] [variant prime]} is exactly the same as R (as shown in Table 2).

Table 2

	equal join of ^{R [variant prime]} and ^{R [variant prime] [variant prime]} on _{A 2}	^{R [variant prime]} * ^{R [variant prime] [variant prime]}
	_{A _ 1 _}	_{A 2}	_{A 2}	_{A 3}	_{A _ 1 _}	_{A 2}	_{A 3}
_{t 1} :	x	p	p	m	x	p	m
_{t 2} :	y	p	p	m	y	p	m
_{t 3} :	z	q	q	n	z	q	n

2.2. The Fuzzy Databases

In last two decades, fuzzy concepts have been incorporated in traditional databases [5, 8, 19] and applied to measure the relation between data [20-22]. The fuzzy databases enable dealing with imprecision and uncertainty in the real world based on the theory of fuzzy sets and possibility distribution theory. The possibility-based fuzzy theory has been widely applied in environmental management, such as flood-diversion planning [23], water resources management [24], and air quality management [25]. This work considers the extended possibility-based databases proposed by Chen et al. [7] because it can capture both the possibility-based fuzzy model and the resemblance-based fuzzy concept. The fuzzy database has drawn much attention of research on semantic measures, information processing, update operation, and UML class diagram therewith [20, 26, 27]. The data model of the fuzzy databases is a hybrid of a possibility-based data model in [8] and a resemblance-based data model in [6]. The possibility-based model derives from Zadeh's fuzzy theory. In the theory [4], a fuzzy set F on a universe of discourse U is described by { _{μ F} ( u ) / u |" u ∈ U } , where _{μ F} : U [arrow right] [ 0,1 ] is a membership function for the fuzzy set F and _{μ F} ( u ) denotes the degree of membership of u in F . In a possibility-based database [8], the value of an attribute A on a domain D is a possibility distribution _{π A} = { _{π A} ( u ) / u |" u ∈ D } , where _{π A} ( u ) denotes the possibility that u is the actual value of A . For example, D = { _{u 1} , _{u 2} , _{u 3} } and _{π A} = { 0.8 / _{u 1} , 0.5 / _{u 2} , 0.6 / _{u 3} } . An example of applying the possibility-based fuzzy theory ² in real world is shown below. Consider a domain of attribute "eye color" is {black, brown, blue, green} and a possibility distribution is given below: [figure omitted; refer to PDF] Suppose that John's eye color is an "Asia color." Then, according to the interpretation for possibility-based fuzzy theory, one concludes that the possibility of John's eye being brown blue color is 0.3.

In the extended possibility-based database, attribute values are represented by possibility distributions of an attribute on its domain, and a domain is associated with a similarity relation of domain elements. Formally, an m -ary relation instance r on a schema R ( _{A 1} , _{A 2} , ... , _{A m} ) in the fuzzy database is a subset of Cartesian product of Φ ( _{A 1} ) × Φ ( _{A 2} ) × ... × Φ ( _{A m} ) , where Φ ( _{A i} ) represents a set of all possibility distributions of attribute _{A i} on its domain. For a domain _{D i} , a proximity relation is given to describe the resemblance between domain elements in _{D i} . A proximity is a mapping _{s i} : _{D i} × _{D i} [arrow right] [ 0 , 1 ] with reflexivity and symmetry; that is, _{s i} ( u , u ) = 1 and _{s i} ( u , v ) = _{s i} ( v , u ) . The elements in a domain cannot directly be partitioned into disjoint equivalent classes by a threshold cutting on the proximity relation for the domain elements.

To acquire equivalent classes of a proximity relation on a domain, Shenoi et al. [18] proposed α -proximate relation. Two elements u , v ∈ D are α -proximate (denoted by u ^{S α +} v ) if s ( u , v ) > α or there exists a sequence _{w 1} , _{w 2} , ... , _{w r} ∈ D , Such that min ... { s ( u , _{w 1} ) , s ( _{w 1} , _{w 2} ) , ... , s ( _{w r - 1} , _{w r} ) , s ( _{w r} , v ) } > α . Given a proximity relation s and a threshold α for domain D , the domain can be partitioned into disjoint subsets (called α -proximate equivalent classes) such that the elements in a partition are α -proximate. The equivalent classes are regarded as basic concepts for the methods being reviewed or proposed hereinafter.

By extending traditional functional dependency, research has proposed variety of fuzzy functional dependencies (FFDs) for fuzzy databases [14, 21, 28]. The FFDs are determined by the degree of similarity of attribute values rather than by the identity. Several similarity measures of attribute values are proposed for the extended possibility-based frameworks [7, 16, 20, 29]. Most of them provide the estimates within the interval [ 0,1 ] . The similarity measures are briefly restated hereinafter, in which s and α ∈ [ 0,1 ] , respectively, denote the proximity relation and a threshold defined on a given domain D = { _{u 1} , _{u 2} , ... , _{u n} } ; _{π A} and _{π B} represent two possibility distributions on D . The degree of closeness between _{π A} and _{π B} , denoted by _{ξ 1} ( _{π A} , _{π B} ) , is defined as follows [29]: [figure omitted; refer to PDF] where D ^ = { _{u i} ∈ D : _{π A} ( _{u i} ) > 0 } ∪ { _{u i} ∈ D : _{π B} ( _{u i} ) > 0 } .

The measure of _{ξ 1} may give low degree of similarity for two values that are very similar to each other, for example, _{ξ 1} ( { 0.9 / excellent , 1 / good } , { 1 / good } ) = 0.1 . To prevent some counter-intuitive estimates of _{ξ 1} , Chen et al. defined the possibility that _{π A} = _{π B} is true as shown below [7] (here ⋀ denotes minimum): [figure omitted; refer to PDF] This assessment is widely adopted in the extended possibility-based databases and is adoptable for the application with subnormal distribution (i.e., _{ξ 2} ( _{π A} , _{π A} ) < 1 , or see [4] for details). For normal distribution, Chen et al. [16] included identity relation (denoted by _{= id} ) into (4) as follows: [figure omitted; refer to PDF]

Ma et al. defined the similarity measure from the perspective of the semantic closeness between two attribute values [20] as shown below: [figure omitted; refer to PDF] where δ denotes a semantic inclusion degree. Consider the following: [figure omitted; refer to PDF] The notion of _{ξ 4} may violate the convention that the similarity degree of two values lies within [ 0,1 ] ; _{ξ 4} ( _{π A} , _{π B} ) = 1.52 for the case that _{π A} = { 0.9 / excellence , 0.8 / good } and _{π B} = { 0.7 / excellence , 0.6 / good } when the similarity of "excellence" and "good" is larger than the given threshold; that is, s ( excellence , good ) ...5; α . It is difficult to set up a proper threshold for estimates that range out of [ 0,1 ] , having an unpredictable upper bound.

Liu et al. [13] extended the semantic equivalence to ensure that the result of similarity measure lies within [ 0,1 ] . The measurement adjusts the possibility distributions of values based on α -proximate equivalent classes of the domain before measuring their similarity. Let C = { _{C 1} , _{C 2} , ... , _{C r} } be the α -proximate equivalent classes of domain D . The adjusted value of possibility distribution _{π *} is defined as follows: [figure omitted; refer to PDF] where _{u - * k} = ^{∑ j = 1 ,}^{_{u j}}^∈^{_{C k}}^|^{_{C k}}^| _{π *} ( _{u j} ) / | _{C ^ k} | and _{C ^ k} = { _{u j} ∈ _{C k} : _{π *} ( _{u j} ) > 0 } , k = 1 , ... , r . Then, [figure omitted; refer to PDF]

Although the methods mentioned above differ from each other on measuring similarity of attribute values, most of the methods of measuring the similarity of tuples are the same. The methods adopt the minimum of the similarity of each pair of attribute values. Given tuples t = ( _π_{A 1} , _π_{A 2} , ... , _π_{A m} ) and ^{t [variant prime]} = ( ^π^{_{A 1}}^{[variant prime]} , ^π^{_{A 2}}^{[variant prime]} , ... , ^π^{_{A m}}^{[variant prime]} ) , the resemblance of tuples t and ^{t [variant prime]} , denoted by η ( t , ^{t [variant prime]} ) , is given by [figure omitted; refer to PDF] where _{ξ *} could be either _{ξ 1} , _{ξ 2} , _{ξ 3} , _{ξ 4} , or _{ξ 5} in (3)-(9). Tuples t and ^{t [variant prime]} are redundant to each other if η ( t , ^{t [variant prime]} ) ...5; α , where α is a given threshold. The similarity measure of tuples has been applied to extract representative tuples for reducing information redundancy [17].

Fuzzy functional dependency (FFD) is a concept derived from traditional FD. Both FFD and FD have several applications on databases, for example, redundancy elimination [30], missing data prediction, fuzzy data compression [17, 31], and lossless join decomposition [10, 28, 32]. In literature, various FFDs are defined for different fuzzy data model. For some fuzzy data representation, FFDs are defined based on the equivalence classes of tuples, such as the similarity-based fuzzy data model [33]. In the extended possibility-based databases, the definition of FFD is also of variety, such as literature [10, 14, 34, 35]. One example among the FFD definitions in the literature is listed below.

Definition 1 (see [10], fuzzy functional dependency).

Let X ~ > Y denote that attribute X is fuzzy functional which depends on attribute Y in a relation R . The FFD: X ~ > Y holds in the instance r ( R ) if and only if η ( t [ X ] , ^{t [variant prime]} [ X ] ) ...6; η ( t [ Y ] , ^{t [variant prime]} [ Y ] ) for every t , ^{t [variant prime]} ∈ r ( R ) .

The example helps in understanding the problem of applying the FFDs on relation decomposition in the fuzzy databases illustrated in Section 3.

3. Redundancy Removal and Tuple Merging

Several factors determine whether the relation decomposition possesses the lossless join property. They are the ways to decompose a relation, to remove redundant tuples, and to combine the decomposed results. Redundancy removal is to eliminate redundant tuples. If the similarity measures used to measure tuple redundancy are not transitive , the result of redundancy removal could be nonunique. An example of nontransitivity is that tuples ^{t [variant prime]} and ^{t [variant prime] [variant prime]} are redundant to each other, and ^{t [variant prime]} and ^{t [variant prime] [variant prime]} are redundant as well, but t and ^{t [variant prime] [variant prime]} are not redundant. In this case, the result of redundancy removal will be { t , ^{t [variant prime] [variant prime]} } if ^{t [variant prime]} is deleted first, which differs from the one-tuple result (either { t } or { ^{t [variant prime] [variant prime]} } ) when first deleting the tuples other than ^{t [variant prime]} . The nontransitivity makes the result of redundancy removal order sensitive and hinders the lossless join decomposition.

Nevertheless, most well-defined similarity measures [7, 10, 20, 29] for the values of possibility distribution are reflexive and symmetric but not transitive. For example, consider adopting (4) to measure the similarity of tuples. Given three values _{π A} = { 0.6 / _{u 1} , 1.0 / _{u 2} , 0.6 / _{u 3} } , ^{π A [variant prime]} = { 1.0 / _{u 1} , 1.0 / _{u 2} , 0.6 / _{u 3} } , and ^{π A [variant prime][variant prime]} = { 1.0 / _{u 1} , 0.6 / _{u 2} , 0.6 / _{u 3} } , then we have _{ξ 2} ( _{π A} , ^{π A [variant prime]} ) = 1 , _{ξ 2} ( ^{π A [variant prime]} , ^{π A [variant prime][variant prime]} ) = 1 , and _{ξ 2} ( _{π A} , ^{π A [variant prime][variant prime]} ) = 0.6 . Considering tuples t = ( _{π A} ) , ^{t [variant prime]} = ( ^{π A [variant prime]} ) , and ^{t [variant prime][variant prime]} = ( ^{π A [variant prime][variant prime]} ) , we have η ( t , ^{t [variant prime]} ) ...5; α and η ( ^{t [variant prime]} , ^{t [variant prime][variant prime]} ) ...5; α but η ( t , ^{t [variant prime][variant prime]} ) < α for any α > 0.6 according to (10). Thus, the similarity measure of tuples is not transitive.

In generalizing projection and equal join operations of traditional database to fuzzy databases, when the redundancy removal is order sensitive, it is hard to obtain lossless join decomposition. Consider the case that Y ~ > Z holds in the instance r ( R ) of relation R ( X , Y , Z ) based on Definition 1; namely, η ( t [ Y ] , ^{t [variant prime]} [ Y ] ) ...6; η ( t [ Z ] , ^{t [variant prime]} [ Z ] ) for every t , ^{t [variant prime]} ∈ r ( R ) . Assuming that r ( R ) consists of three tuples Y9; _{x 1} , y , z YA; , Y9; _{x 2} , ^{y [variant prime]} , ^{z [variant prime]} YA; , Y9; _{x 3} , ^{y [variant prime] [variant prime]} , ^{z [variant prime] [variant prime]} YA; and X is a key attribute, it is possible that the two values in each of pairs ( y , ^{y [variant prime]} ) , ( y , ^{y [variant prime] [variant prime]} ) , ( z , ^{z [variant prime]} ) , and ( z , ^{z [variant prime] [variant prime]} ) are redundant to each other, but ( ^{y [variant prime]} , ^{y [variant prime] [variant prime]} ) and ( ^{z [variant prime]} , ^{z [variant prime] [variant prime]} ) are not. Since Y ~ > Z , R should be decomposed to avoid redundancy. After decomposing R ( X , Y , Z ) to ^{R [variant prime]} ( X , Y ) and ^{R [variant prime] [variant prime]} ( Y , Z ) , if tuple Y9; y , z YA; is first removed because it is redundant to tuple Y9; ^{y [variant prime]} , ^{z [variant prime]} YA; , the result of ^{R [variant prime] [variant prime]} ( Y , Z ) contains two tuples Y9; ^{y [variant prime]} , ^{z [variant prime]} YA; and Y9; ^{y [variant prime] [variant prime]} , ^{z [variant prime] [variant prime]} YA; . The natural join of ^{R [variant prime]} ( X , Y ) and ^{R [variant prime] [variant prime]} ( Y , Z ) generates a four-tuple result Y9; _{x 1} , y , ^{z [variant prime]} YA; , Y9; _{x 1} , y , ^{z [variant prime] [variant prime]} YA; , Y9; _{x 2} , ^{y [variant prime]} , ^{z [variant prime]} YA; , Y9; _{x 3} , ^{y [variant prime] [variant prime]} , ^{z [variant prime] [variant prime]} YA; , which contains spurious tuple.

To resolve this problem, this study proposes the operations of projection and equal join for the fuzzy databases, which involves evaluation of redundancy and tuple merging. Since the decomposition of relations is based on FFD, it depends on the similarity of tuples. For the data in the fuzzy model, (3)-(9) can be used to measure the similarity of tuples and define FFDs in the fuzzy databases. However, (5) restricts redundant tuples to those duplicate. Equations (3), (4), and (6) lack transitivity. Therefore, this work adopts (9) and (10) to define approximate equality for the tuples that might not be identical but have high similarity degree. The approximate equality enables obtaining a unique result of redundancy removal.

Definition 2 (approximately equal tuples).

Two tuples t = ( _{π 1} , _{π 2} , ... , _{π m} ) and ^{t [variant prime]} = ( ^{π 1 [variant prime]} , ^{π 2 [variant prime]} , ... , ^{π m [variant prime]} ) are approximately equal, denoted by t [congruent with] ^{t [variant prime]} , if it is satisfied that _{min ... j = 1 , ... , m} { _{ξ 5} ( _{π j} , ^{π j [variant prime]} ) } = 1 .

In other words, tuples t and ^{t [variant prime]} are approximately equal if their similarity η ( t , ^{t [variant prime]} ) = 1 .

Lemma 3.

The approximate equality of tuples (or attribute values) is transitive.

Proof.

Based on (9), it is obvious that if _{ξ 5} ( _{π A} , _{π B} ) = 1 and _{ξ 5} ( _{π B} , _{π G} ) = 1 , then _{ξ 5} ( _{π A} , _{π G} ) = 1 . Thus, if t [congruent with] ^{t [variant prime]} and ^{t [variant prime]} [congruent with] ^{t [variant prime][variant prime]} , then t [congruent with] ^{t [variant prime][variant prime]} based on (10).

The tuples of approximate equality are considered to be redundant to each other. The notion of approximate equality can be applied to query processing with the predicate containing fuzzy concept [36] for fuzzy databases in different models. For simplicity, we let _{π A} [congruent with] _{π B} denote _{ξ 5} ( _{π A} , _{π B} ) = 1 hereafter.

Example 4.

Given values _{π A} = { 0.75 / pretty , 0.65 / cuteness } and _{π B} = { 0.6 / pretty , 0.7 / charm , 0.8 / cuteness } on domain D and equivalent classes _{C 1} = { pretty } and _{C 2} = { charm , cuteness } for D , the average possibilities of _{π B} are _{u - B}₁ = 0.6 / 1 = 0.6 and _{u - B}₂ = ( 0.7 + 0.9 ) / 2 = 0.8 , yielding _{π ~ B} = { 0.6 / pretty , 0.8 / charm , 0.8 / cuteness } . Likewise, _{u - A}₁ = 0.8 , _{u - A}₂ = 0.65 , and _{π ~ A} = { 0.75 / pretty , 0.65 / charm , 0.65 / cuteness } . We have δ ( _{π A} , _{π B} ) = ( 0.6 + 0.65 + 0.65 ) / ( 0.6 + 0.8 + 0.8 ) = 0.86 and δ ( _{π B} , _{π A} ) = 1.9 / 2.05 = 0.92 . Thus, _{ξ 5} ( _{π A} , _{π B} ) = min ... { 0.86 , 0.92 } = 0.86 . Given _{π G} = { 0.6 / pretty , 0.65 / charm , 0.85 / cuteness } on D , we have _{π G} [congruent with] _{π B} even though _{π G} is not identical to _{π B} .

Proposition 5.

The approximate equality can be used to classify values of the fuzzy database into disjoint sets (equivalence classes).

Proof.

Based on the definition of (9), it is obvious that _{ξ 5} is reflective and symmetric ; that is, _{ξ 5} ( _{π A} , _{π A} ) = 1 and _{ξ 5} ( _{π A} , _{π B} ) = _{ξ 5} ( _{π B} , _{π A} ) for values _{π A} and _{π B} . Besides, approximate equality is transitive according to Lemma 3. Therefore, two different sets of approximately equal values are either disjoint sets or same class sets, where any two of the values are approximately equal to each other.

The transitivity of similarity measure is important to any operation involving redundancy removal or tuple merging. Besides, the measure of transitivity can be applied to clustering methods or data groupings, such as the ones in [36, 37].

Proposition 6.

Given _{π A} and its adjusted value _{π ~ A} following (8), _{π A} [congruent with] _{π ~ A} .

Proof.

It is obvious by the definition of (9).

Buckles and Petry first proposed the way of tuple merging and applied it to remove redundant tuples in a fuzzy database [5]. Tuple merging can also be used at join operation. This study extends the tuple merging of Chen et al. [16] to be (11) for relation combination as well as redundancy removal. Given tuples t = ( _π_{A 1} , _π_{A 2} , ... , _π_{A m} ) and ^{t [variant prime]} = ( ^π^{_{A 1}}^{[variant prime]} , ^π^{_{A 2}}^{[variant prime]} , ... , ^π^{_{A m}}^{[variant prime]} ) , tuple merging of t and ^{t [variant prime]} , denoted by t [composite function] ^{t [variant prime]} , is given by [figure omitted; refer to PDF] where each _{π ~ *} (or ^{π ~ * [variant prime]} ) is the adjusted value of _{π *} (or ^{π * [variant prime]} ) according to (8) and _{∪ F} denotes fuzzy union. For single-value tuples t = ( _π_{A 1} ) and ^{t [variant prime]} = ( ^π^{_{A 1}}^{[variant prime]} ) , tuple merging is alternatively denoted by _π_{A 1} [composite function] ^π^{_{A 1}}^{[variant prime]} .

Lemma 7.

Let _π_A and ^{π A [variant prime]} be two possibility distributions on the same domain. If _{π A} [congruent with] ^{π A [variant prime]} , then _{π A} [congruent with] _{π A} [composite function] ^{π A [variant prime]} [congruent with] ^{π A [variant prime]} .

Proof.

Based on (9) and (11), it is obvious that if _{ξ 5} ( _{π A} , ^{π A [variant prime]} ) = 1 , then _{ξ 5} ( _{π A} , _{π A} [composite function] ^{π A [variant prime]} ) = 1 and _{ξ 5} ( ^{π A [variant prime]} , _{π A} [composite function] ^{π A [variant prime]} ) = 1 .

Based on the literature review and Lemma 7, we summarize the property of different similarity measures with threshold α = 1 in Table 3 to show the merit of (9) adopted in this work.

Table 3: Property of different similarity measures with threshold α = 1 .

Properties\measures	Equation (5) [16]	Equation (6) [20]	Equation (9) [13]
Transitivity	Yes	No	Yes
Result after merging	Lossless	Non-lossless	Approximate lossless
Predicates for lossless join	Identical values	Different values	Different values

4. Approximate Lossless Join Decomposition

This section first offers the operations for relation decomposition and combination. Then, it proposes a notion of approximate lossless join decomposition (ALJD), which incorporates fuzzy concepts into lossless join decomposition. It also provides the method to achieve the ALJD.

Similar to the works in [37], this study generalizes the projection and natural join operations in traditional database to the fuzzy databases, as below. Here, given a relation R , Θ denotes a set of attributes in R (i.e., Θ ⊂ R ), and t [ Θ ] denotes the composite of values in tuple t over attribute Θ . For example, given t ∈ r ( R ) , t = ( _π_{A 1} , _π_{A 2} , ... , _π_{A m} ) , and Θ = ( _{A 2} , _{A 3} , _{A 4} ) , then t [ Θ ] = ( _π_{A 2} , _π_{A 3} , _π_{A 4} ) .

Projection . Projecting the instance of relation R on attributes Θ ⊂ R , denoted by _{Π Θ} ( R ) , is given by [figure omitted; refer to PDF]

Natural Join . Natural join instances of R ( X , Y ) and ^{R [variant prime]} ( Y , Z ) , denoted by R [ecedil]7; ^{R [variant prime]} , are defined as follows: [figure omitted; refer to PDF] In (12) and (13), tuple redundancy is determined by approximate equality (e.g., t [ Y ] [congruent with] ^{t [variant prime]} [ Y ] ; see Definition 2), and both redundancy removal and tuple combination use tuple merging in (11).

Proposition 8.

The projection result of a relation based on (12) must be unique.

Proof.

It can be directly derived from Proposition 5.

Based on the operations (12) and (13), the ALJD is formally defined following the extension of approximate equality from tuple level to relation level in Definition 9.

Definition 9 (approximately equal relation instances).

Two relation instances r ( R ) and ^{r [variant prime]} ( R ) in the fuzzy database are approximately equal, denoted by r ( R ) [approximate] ^{r [variant prime]} ( R ) , if for every tuple t ∈ r ( R ) , there must exist a tuple ^{t [variant prime]} ∈ ^{r [variant prime]} ( R ) such that t [congruent with] ^{t [variant prime]} and vice versa.

Definition 10 (approximate lossless join).

A composition { _{R 1} , _{R 2} , ... , _{R k} } of a relation R in the fuzzy database is approximate lossless join if r ( R ) [approximate] ( _Π_{R 1} ( R ) [ecedil]7; _Π_{R 2} ( R ) [ecedil]7; ... [ecedil]7; _{Π R k} ( R ) ) .

The approximate lossless join decomposition means the natural join of all decomposed results of a relation instance is approximately equal to the original relation instance. More specifically, every tuple in the original relation is approximately equal to one of tuples in the combination result.

Proposition 11.

Consider the following: [figure omitted; refer to PDF]

Proof.

It can be derived from (11) and (12).

Corollary 12.

Consider the following: [figure omitted; refer to PDF]

Proof.

It can be derived directly from Proposition 11.

The projection of a relation over the same schema, as shown in Corollary 12, represents no operations other than removing redundant tuples from the instance of the relation via tuple merging. Corollary 12 shows that the result of redundancy removal of a relation instance is approximately equal to the original instance. This property is essential for obtaining the combination result that is approximately equal to the original instance after relation decomposition.

This study proposes FFD for the decomposition in the fuzzy database as shown below.

Definition 13.

The FFD X ~ > Y holds in the relation instance r ( R ) if r ( R ) satisfies that, for every t , ^{t [variant prime]} ∈ r ( R ) ; if t [ X ] [congruent with] ^{t [variant prime]} [ X ] , then t [ Y ] [congruent with] ^{t [variant prime]} [ Y ] .

Remark 14.

An FD in a traditional database is a special case of the FFD. If a FD X [arrow right] Y holds in r ( R ) , then X ~ > Y holds as well. It is because t [ Θ ] [congruent with] ^{t [variant prime]} [ Θ ] must be true for any Θ ⊂ R if t [ Θ ] = ^{t [variant prime]} [ Θ ] .

Lemma 15.

Given relations R and ^{R [variant prime]} and r ( ^{R [variant prime]} ) [approximate] r ( R ) , if r ( R ) satisfies a set F of FFDs, then r ( ^{R [variant prime]} ) satisfies F .

Proof.

Proof by contradiction: we assumed that r ( ^{R [variant prime]} ) [approximate] r ( R ) and there exists an FFD X ~ > Y such that R satisfies X ~ > Y and ^{R [variant prime]} does not. Because X ~ > Y exists in r ( R ) , [figure omitted; refer to PDF] Since X ~ > Y does not exist in r ( ^{R [variant prime]} ) , there exists ^{t 1 [variant prime]} , ^{t 2 [variant prime]} ∈ r ( ^{R [variant prime]} ) , such that ^{t 1 [variant prime]} [ X ] [congruent with] ^{t 2 [variant prime]} [ X ] and η ( ^{t 1 [variant prime]} [ Y ] , ^{t 2 [variant prime]} [ Y ] ) ...0; 1 . Since r ( ^{R [variant prime]} ) [approximate] r ( R ) , there exists ^{t [variant prime][variant prime]} ∈ r ( R ) such that t [congruent with] ^{t 1 [variant prime]} and ^{t [variant prime][variant prime]} [congruent with] ^{t 2 [variant prime]} . Then, we have t , ^{t [variant prime][variant prime]} ∈ r ( R ) and t [ X ] [congruent with] ^{t [variant prime][variant prime]} [ X ] but η ( t [ Y ] , ^{t [variant prime][variant prime]} [ Y ] ) ...0; 1 , which contradicts (16).

It is noted that the FFD in Definition 13 satisfies Armstrong's axioms (inference rules), including reflexive rule, augmentation rule, and transitive rule ³ . This property enables the result of lossless join decomposition that has dependency preservation property ⁴ [1].

Lemma 16.

Let R ( A , X , Y ) be a relation and X ~ > Y be an FFD in r ( R ) . If _{R 1} = _{Π ( A , X )} ( R ) and _{R 2} = _{Π ( X , Y )} ( R ) , then the decomposition { _{R 1} , _{R 2} } of R has approximate lossless join property.

Proof.

We proved that r ( R ) [approximate] _Π_{R 1} ( R ) [ecedil]7; _Π_{R 2} ( R ) based on Definition 10. Let ^{R [variant prime]} be a relation such that r ( ^{R [variant prime]} ) = _Π_{R 1} ( R ) [ecedil]7; _Π_{R 2} ( R ) . We first prove that, for all t ∈ r ( R ) , there exist ^{t [variant prime]} ∈ r ( ^{R [variant prime]} ) and ^{t [variant prime]} [congruent with] t and then prove that, for all t ∈ r ( ^{R [variant prime]} ) , there must exist ^{t [variant prime]} ∈ r ( R ) and ^{t [variant prime]} [congruent with] t . Proof by contradiction: let us assume that t ∈ r ( R ) is the tuple such that no ^{t [variant prime]} ∈ r ( ^{R [variant prime]} ) satisfies ^{t [variant prime]} [congruent with] t . Let _{t 1} and _{t 2} be tuples such that _{t 1} = t [ A , X ] and _{t 2} = t [ X , Y ] . Based on Proposition 11, there must be _{t ^ 1} ∈ r ( _{R 1} ) and _{t ^ 2} ∈ r ( _{R 2} ) such that _{t ^ 1} [congruent with] _{t 1} and _{t ^ 2} [congruent with] _{t 2} because _{R 1} = _{Π ( A , X )} ( R ) and _{R 2} = _{Π ( X , Y )} ( R ) . Since r ( ^{R [variant prime]} ) [approximate] _Π_{R 1} ( R ) [ecedil]7; _Π_{R 2} ( R ) and _{t 1} [ X ] [congruent with] t [ X ] [congruent with] _{t 2} [ X ] , there must exist ^{t [variant prime]} ∈ r ( ^{R [variant prime]} ) such that ^{t [variant prime]} [congruent with] ( _{t 1} [ A ] , _{t 1} [ X ] [composite function] _{t 2} [ X ] , _{t 2} [ Y ] ) according to (13). Also, since _{t 1} = t [ A , X ] and _{t 2} = t [ X , Y ] , we have ( _{t 1} [ A ] , _{t 1} [ X ] [composite function] _{t 2} [ X ] , _{t 2} [ Y ] ) [congruent with] ( t [ A ] , t [ X ] , t [ Y ] ) by Lemma 7. Thus, ^{t [variant prime]} [congruent with] t , which contradicts the assumption.

Proof by contradiction for second part with renewed symbols: assume that ^{t [variant prime]} ∈ r ( ^{R [variant prime]} ) is the tuple such that no t ∈ r ( R ) satisfies ^{t [variant prime]} [congruent with] t . Since r ( ^{R [variant prime]} ) _Π_{R 1} ( R ) [ecedil]7; _Π_{R 2} ( R ) , there exist _{t 1} ∈ r ( _{R 1} ) and _{t 2} ∈ r ( _{R 2} ) such that _{t 1} [congruent with] ^{t [variant prime]} [ A , X ] , _{t 2} [congruent with] ^{t [variant prime]} [ X , Y ] , and _{t 1} [ X ] [congruent with] _{t 2} [ X ] based on (13). Also, we let _{t ^ 1} and _{t ^ 2} be the tuples such that [figure omitted; refer to PDF] Since _{R 1} = _{Π ( A , X )} ( R ) , based on Proposition 11, there must exist t ∈ r ( R ) such that [figure omitted; refer to PDF] Based on (17) and (18), we have t [ X ] [congruent with] ^{t [variant prime][variant prime]} [ X ] . Because X ~ > Y , t [ Y ] [congruent with] ^{t [variant prime] [variant prime]} [ Y ] holds based on Definition 13. Thus, t [ Y ] [congruent with] _{t ^ 2} [ Y ] [congruent with] ^{t [variant prime]} [ Y ] based on (19), and ^{t [variant prime]} [congruent with] t , which contradicts the assumption.

In Lemma 16, each one of A , X , and Y could be a single attribute or a set of attributes.

Definition 17.

Let F be the set of FFDs. An FFD f : X ~ > Y in F is trivial if there exists an FFD f ' : X ~ > ^{Y [variant prime]} in F such that Y ⊂ ^{Y [variant prime]} .

Based on Armstrong's inference rules, IR1 and IR3 (see endnote 2), if a set F of FFD contains X ~ > Y Z , the closure of F will also contain X ~ > Y and X ~ > Z , which is trivial. When a relation is decomposed into more relations, it takes more join operations to obtain the original data for query process. Considering the cost of join operations, it is not efficient to decompose a relation that has already been in the third normal form. A relation is in the third normal form if there is no functional dependency between nonkey attributes in the relation [1]. Accordingly, the relation decomposition has two prerequisites as follows.

(i) It needs to avoid decomposing a relation based on trivial FFDs.

(ii) It needs to make sure that the decomposed result preserves the closure of FFDs in the original relation.

For example, if X is not a key in r ( R ) , then R will be decomposed based on X ~ > Y Z rather than on trivial FFD X ~ > Y or X ~ > Z . Based on Lemma 16 and Definition 17, we propose an algorithm for ALJD (see Algorithm 1).

Algorithm 1: ALJD Algorithm.

Inputs R and F , where R is a relation and F is the set of FFDs exists in r ( R ) .

Step 1 . Let R = { R } and ^{F [variant prime]} be the set of all f ∈ F that are not trivial.

Step 2 . For a ^{f [variant prime]} : X ~> Y in ^{F [variant prime]} , Let ^{R [variant prime]} be the relation chosen from R such that both X and Y are in ^{R [variant prime]} .

Step 3 . If X is not a key attribute in ^{R [variant prime]} , do followings:

(1) decompose ^{R [variant prime]} into _{R 1} and _{R 2} , such that _{R 1} = _{Π ( X , Y )} ( ^{R [variant prime]} ) and _{R 2} = _Π^{_{R [variant prime]}}_{( X - )} ( ^{R [variant prime]} )

(2) let ^{F [variant prime]} = ^{F [variant prime]} - { ^{f [variant prime]} } (remove ^{f [variant prime]} from ^{F [variant prime]} )

(3) let R = R ∪ { _{R 1} , _{R 2} } - { ^{R [variant prime]} } .

Step 4 . Go to Step 2 if ^{F [variant prime]} is not empty.

Output is R , the set of relations decomposed from R .

Note: ^{R [variant prime]} ( X - ) represents the list of all attributes in ^{R [variant prime]} other than X .

In the ALJD algorithm, an FFD containing key attributes is excluded from the decomposition process at Step 3. This follows the concept of the normalization of traditional databases, where only the FD of nonkey attributes is considered. To have a consistent presentation of data, this work generalizes the definition of key attributes for the fuzzy databases; namely, an attribute A is a key attribute in R if there does not exist two tuples t and ^{t [variant prime]} in r ( R ) such that t [ A ] [congruent with] ^{t [variant prime]} [ A ] . The exclusion of processing FFDs containing key attributes can prevent unnecessary decomposing on the relations which have no update anomaly problem. Although the decomposition without the key exclusion is still an ALJD, it increases the cost of the join operations of query process.

Proposition 18.

Let R ( A , X , Y ) be a relation and let X ~ > Y be an FFD in r ( R ) . If _{R 1} = _{Π ( A , X )} ( R ) and _{R 2} = _{Π ( X , Y )} ( R ) , then (i) each FFD existing in r ( _{R 1} ) or r ( _{R 2} ) must exist in r ( R ) and (ii) every FFD existing in r ( R ) must either exist in r ( _{R 1} ) or r ( _{R 2} ) or be derived via FFDs in r ( _{R 1} ) and r ( _{R 2} ) .

Proof.

Statement (i) can be derived by Proposition 11 and Lemma 15. Statement (ii) can be derived by Lemma 16 and the property of FFD (namely, Armstrong's axiom IRs 1, 2, and 3, described at endnote 2).

The above statements show that the ALJD also preserves the closure of FFDs in the original relation, which is important to the issues related to the application of FFDs.

5. Conclusion

The contribution of this work is threefold. First, it highlights the problem of relation decomposition when tuple elimination is order sensitive. To overcome the problem, it proposes the notion of approximate equality for the tuples or relations in the fuzzy databases and provides the measure of the approximate equality. The measurement is reflexive, symmetric, and transitive. It enables classifying tuples into disjoint sets and ensures that a decomposed relation has unique result after redundancy removal or tuple merging. Therefore, the notion of approximate equality is important for data operations in the fuzzy databases. Second, it proposes approximate lossless join decomposition for the fuzzy databases and defines two operations projection and equal join for the decomposition, all of which are based on the approximate equality. The data operations and ALJD can be applied to the issue on data compression in the fuzzy databases. Third, this work defines FFDs and proposes an algorithm to decompose relations in the fuzzy databases based on the FFDs. The decomposition by the algorithm ensures the approximate lossless join property. The FFD and ALJD proposed for the fuzzy databases are, respectively, the general cases of the traditional FD and lossless join decomposition. The general property is important for dealing with the databases containing crisp data and fuzzy data. Forth, similar to the existing approaches of database normalization on resemblance-based fuzzy databases, this study provides several propositions to prove that the proposed approach of decomposition satisfies a degree of lossless join property. Compared to the normalization approaches for resemblance-based fuzzy databases, achieving lossless join decomposition for the extended possibility-based fuzzy databases is more difficult because of having more complex data.

There are some directions of future work. Future study can adopt the notion of approximate equality to define data operations for the query processing in the fuzzy databases. Research can apply the notion on the research related to data compression, fuzzy association rules, missing value prediction, relation compactness, and the integrity constraint in the fuzzy databases. Study aims to incorporate the fuzzy concept into clustering methods or data groupings for decision-making in marketing, healthcare applications, or business operations that can adopt the approximate equality for the similarity measures. Since the fuzzy concept has been incorporated into object-oriented databases in literature, future work can provide the approximate equality specifically for the data in the fuzzy object-oriented data models.

Acknowledgments

The author would like to acknowledge the support of National Science Council NSC102-2410-H-155-036-MY2 and Innovation Center for Big Data & Digital Convergence.

Endnotes

1.: Different orders on removing redundant tuples could lead to different results.

2.: For further details on possibility distribution and on the difference between possibility and probability measures, the reader is referred to [38].

3.: IR1 (reflexive rule): X [arrow right] Y if Y ⊆ X ; IR2 (augmentation rule): if X [arrow right] Y , then X Z [arrow right] Y Z ; IR3 (transitive rule): if X [arrow right] Y and Y [arrow right] Z , the X [arrow right] Z (see [1]).

4.: Each FFD in r ( R ) either directly exists in some individual relations that decomposed from R or can be represented via Armstrong's inference rules of the FFDs in these relations.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

References

[1] S. B. Navathe, R. Elmasri Fundamentals of Database Systems , Addison-Wesley, New York, NY, USA, 2010.

[2] P. A. Bernstein, J. R. Swenson, D. C. Tsichritzis, "A unified approach to functional dependencies and relations," in Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 237-245, 1975.

[3] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM , vol. 13, no. 6, pp. 377-387, 1970.

[4] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems , vol. 1, no. 1, pp. 3-28, 1978.

[5] B. P. Buckles, F. E. Petry, "A fuzzy representation of data for relational databases," Fuzzy Sets and Systems , vol. 7, no. 3, pp. 213-226, 1982.

[6] S. Shenoi, A. Melton, "Proximity relations in the fuzzy relational database model," Fuzzy Sets and Systems , vol. 31, no. 3, pp. 285-296, 1989.

[7] G. Chen, J. Vandenbulcke, E. E. Kerre, "A general treatment of data redundancy in a fuzzy relational data model," Journal of the American Society for Information Science , vol. 43, no. 4, pp. 304-311, 1992.

[8] H. Prade, C. Testemale, "Generalizing database relational algebra for the treatment of incomplete or uncertain information and vague queries," Information Sciences , vol. 34, no. 2, pp. 115-143, 1984.

[9] J. C. Cubero, M. A. Vila, "New definition of fuzzy functional dependency in fuzzy relational databases," International Journal of Intelligent Systems , vol. 9, no. 5, pp. 441-448, 1994.

[10] K. Raju, A. K. Majumdar, "Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems," ACM Transactions on Database Systems , vol. 13, no. 2, pp. 129-166, 1988.

[11] G. Chen, E. E. Kerre, J. Vandenbulcke, "Normalization based on fuzzy functional dependency in a fuzzy relational data model," Information Systems , vol. 21, no. 3, pp. 299-310, 1996.

[12] S. Jyothi, M. S. Babu, "Multivalued dependencies in fuzzy relational databases and lossless join decomposition," Fuzzy Sets and Systems , vol. 88, no. 3, pp. 315-332, 1997.

[13] J. Y. Liu, P. Chang, C. P. C. Yeh, "Consistent data operations for multi-databases in extended possibility-based data models," Expert Systems with Applications , vol. 36, no. 3, pp. 6174-6180, 2009.

[14] Z. M. Ma, W. J. Zhang, W. Y. Ma, F. Mili, "Data dependencies in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems , vol. 17, no. 3, pp. 321-332, 2002.

[15] J. Y. Liu, "Data integration constraints for consistent data redundancy in fuzzy databases," International Journal of Intelligent Systems , vol. 23, no. 6, pp. 635-653, 2008.

[16] G. Chen, J. Vandenbulcke, E. E. Kerre, "On the lossless-join decomposition of relation scheme(s) in a fuzzy relational data model," in Proceedings of the 2nd International Symposium on Uncertainty Modeling and Analysis, pp. 440-446, IEEE, College Park, Md, USA, April 1993.

[17] J. Zhang, G. Chen, X. Tang, "Extracting representative information to enhance flexible data queries," IEEE Transactions on Neural Networks and Learning Systems , vol. 23, no. 6, pp. 928-941, 2012.

[18] S. Shenoi, A. Melton, L. T. Fan, "An equivalence classes model of fuzzy relational databases," Fuzzy Sets and Systems , vol. 38, no. 2, pp. 153-170, 1990.

[19] M. Koyuncu, A. Yazici, "IFOOD: An intelligent fuzzy object-oriented database architecture," IEEE Transactions on Knowledge and Data Engineering , vol. 15, no. 5, pp. 1137-1154, 2003.

[20] Z. Ma, W. Zhang, W. Ma, "Semantic measure of fuzzy data in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems , vol. 15, no. 8, pp. 705-716, 2000.

[21] V. V. Cross, "Fuzzy extensions for relationships in a generalized object model," International Journal of Intelligent Systems , vol. 16, no. 7, pp. 843-861, 2001.

[22] V. V. Cross, "Defining fuzzy relationships in object models: abstraction and interpretation," Fuzzy Sets and Systems , vol. 140, no. 1, pp. 5-27, 2003.

[23] S. Wang, G. H. Huang, "A two-stage mixed-integer fuzzy programming with interval-valued membership functions approach for flood-diversion planning," Journal of Environmental Management , vol. 117, pp. 208-218, 2013.

[24] S. Wang, G. H. Huang, "An interval-parameter two-stage stochastic fuzzy program with type-2 membership functions: An application to water resources management," Stochastic Environmental Research and Risk Assessment , vol. 27, no. 6, pp. 1493-1506, 2013.

[25] S. Wang, G. H. Huang, "Interactive fuzzy boundary interval programming for air quality management under uncertainty," Water, Air, and Soil Pollution , vol. 224, article 1574, 2013.

[26] Z. M. Ma, F. Mili, "Handling fuzzy information in extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems , vol. 17, no. 10, pp. 925-942, 2002.

[27] Z. M. Ma, L. Yan, "Updating extended possibility-based fuzzy relational databases," International Journal of Intelligent Systems , vol. 22, no. 3, pp. 237-258, 2007.

[28] P. Bosc, D. Dubois, H. Prade, "Fuzzy functional dependenciesan overview and a critical discussion," in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 325-330, 1994.

[29] P. Bosc, O. Pivert, "On the impact of regular functional dependencies when moving to a possibilistic database framework," Fuzzy Sets and Systems , vol. 140, no. 1, pp. 207-227, 2003.

[30] E. A. Rundensteiner, L. W. Hawkes, W. Bandler, "On nearness measures in fuzzy relational data models," International Journal of Approximate Reasoning , vol. 3, no. 3, pp. 267-298, 1989.

[31] P. Bosc, D. Dubois, H. Prade, "Fuzzy functional dependencies and redundancy elimination," Journal of the American Society for Information Science , vol. 49, no. 3, pp. 217-235, 1998.

[32] Z. M. Ma, W. J. Zhang, F. Mili, "Fuzzy data compression based on data dependencies," International Journal of Intelligent Systems , vol. 17, no. 4, pp. 409-426, 2002.

[33] B. Bhuniya, P. Niyogi, "Lossless join property in fuzzy relational databases," Data and Knowledge Engineering , vol. 11, no. 2, pp. 109-124, 1993.

[34] Ö. Bahar, A. Yazici, "Normalization and lossless join decomposition of similarity-based fuzzy relational databases," International Journal of Intelligent Systems , vol. 19, no. 10, pp. 885-917, 2004.

[35] T. K. Bhattacharjee, A. K. Mazumdar, "Axiomatisation of fuzzy multivalued dependencies in a fuzzy relational data model," Fuzzy Sets and Systems , vol. 96, no. 3, pp. 343-352, 1998.

[36] Z. M. Ma, L. Yan, "Generalization of strategies for fuzzy query translation in classical relational databases," Information and Software Technology , vol. 49, no. 2, pp. 172-180, 2007.

[37] J. Zhang, Q. Wei, G. Chen, "An efficient incremental method for generating equivalence groups of search results in information retrieval and queries," Knowledge-Based Systems , vol. 32, pp. 91-100, 2012.

[38] D. Dubois, H. Prade Fuzzy Sets and Systems: Theory and Applications , vol. 144, of Mathematics in Science and Engineering, Academic Press, New York, NY, USA, 1980.

Word count: 9785

Show less

Copyright © 2014 Julie Yu-Chih Liu. Julie Yu-Chih Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Translate

Functional dependency is the basis of database normalization. Various types of fuzzy functional dependencies have been proposed for fuzzy relational database and applied to the process of database normalization. However, the problem of achieving lossless join decomposition occurs when employing the fuzzy functional dependencies to database normalization in an extended possibility-based fuzzy data models. To resolve the problem, this study defined fuzzy functional dependency based on a notion of approximate equality for extended possibility-based fuzzy relational databases. Examples show that the notion is more applicable than other similarity concept to the research related to the extended possibility-based data model. We provide a decomposition method of using the proposed fuzzy functional dependency for database normalization and prove the lossless join property of the decomposition method.

Details

Title

Lossless Join Decomposition for Extended Possibility-Based Fuzzy Relational Databases

Author

Julie Yu-Chih Liu

Publication year

2014

Publication date

2014

Publisher

John Wiley & Sons, Inc.

ISSN

1110757X

e-ISSN

16870042

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2014/842680

ProQuest document ID

1558489104

Lossless Join Decomposition for Extended Possibility-Based Fuzzy Relational Databases

Jump to:

Full Text

Abstract

Details

Suggested sources