GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
NGUYEN THU ANH
Study of real-world semantics-based
interpretability of fuzzy system
Major: MATHEMATICAL BASIS FOR INFORMATICS
Code: 62.46.01.10
SUMMARY OF MATHEMATICS DOCTORAL THESIS
SCIENTIFIC INSTRUCTOR:
Ph.D. Tran Thai Son
Hanoi 2019
INTRODUCTION
In some areas, we expect machines to be able to simulate human behavior and
reasoning and to give people reliable suggestions in the decision-making
process. A prominent feature of humans is the ability to reason on the basis of
knowledge formed from life experience and expressed in natural language.
Because natural language is inherently fuzzy, the first problem to be solved is
how to formalize linguistic semantics mathematically and how to handle the
semantics of the language that people use in daily life.
In response to these requirements, Lotfi A. Zadeh laid the foundation of fuzzy
set theory in 1965. Based on fuzzy set theory, Fuzzy Rule Based Systems (FRBSs)
have been developed and have become one of the tools that most closely simulate
human reasoning and decision making. FRBSs have been successfully applied to
practical problems such as control, classification, regression and language
extraction.
When building FRBSs, we need to achieve two goals: accuracy and
interpretability. This thesis focuses on the study of interpretability.
In [1], Gacto et al. find that there are currently two main approaches to
interpretability: the first is based on complexity and the second on semantics.
Another approach, proposed by Mencar et al. in [2], uses a similarity measure to
assess the interpretability of semantics-based fuzzy rules: the interpretability
of fuzzy rules is measured by the similarity between the knowledge represented
by fuzzy set expressions and the corresponding linguistic expressions in natural
language.
In 2017, a new approach to the interpretability of fuzzy systems, the
real-world-semantics-based approach (RWS-approach), was first proposed and
initially surveyed in [3]. This approach is based on the real-world semantics of
words and on the relations between the semantics of fuzzy system components and
the corresponding component structures in the real world.
Starting from the recognition that fuzzy set expressions, especially the fuzzy
rules of fuzzy systems, have no methodologically grounded relationship with
real-world semantics and that, therefore, there is no formal basis for studying
the nature of interpretability, this thesis chooses the
real-world-semantics-based approach proposed in [3] to study the
interpretability of fuzzy systems.
[1] M.J. Gacto, R. Alcalá, F. Herrera (2011), "Interpretability of linguistic fuzzy rule-based
systems: an overview of interpretability measures", Inform. Sci., vol. 181, no. 20, pp. 4340–4360.
[2] C. Mencar, C. Castiello, R. Cannone, A.M. Fanelli (2011), "Interpretability assessment of fuzzy
knowledge bases: a cointension based approach", Int. J. Approx. Reason., vol. 52, pp. 501–518.
[3] Cat Ho Nguyen, Jose M. Alonso (2017), "Looking for a real-world-semantics-based approach to
the interpretability of fuzzy systems", FUZZ-IEEE 2017 Technical Program Committee and
Technical Chairs, Italy, July 9-12.
At the same time, current methods of building FRBSs from data in the
fuzzy-set-theory-based approach lack a fully formal link between the fuzzy set
that represents the assumed semantics of a word and the word's inherent
semantics. The words used in an FRBS are treated only as labels or symbols
assigned to the corresponding fuzzy sets, so they can hardly convey the
underlying semantics that words carry in natural language. Therefore, this
thesis further studies the interpretability of linguistic fuzzy systems in the
semantic approach based on hedge algebras (HAs), proposed by Nguyen and Wechler
[4], [5]. In this approach, the computational semantics of words is defined from
the inherent order-based semantics of the words, and the word domains of the
variables form an order-based structure that is rich enough to solve practical
problems.
This thesis has achieved the following results:
- It studies and analyses interpretability as the relationship between the RWS
of linguistic expressions and the computational semantics of the computational
expressions assigned to them, and it proposes a schema that solves the problem
of the interpretability of the computational representation of a linguistic
frame of cognition (LFoC).
- It proposes constraints on interpretation mappings so that they convey and
preserve the desired semantic aspects of the LFoC for fuzzy systems.
- It applies the HA approach to solve the problem of the interpretability of the
computational representation of an LFoC by establishing a granular polymorphism
structure of triangular or trapezoidal fuzzy sets.
- It further clarifies the RWS interpretation of human natural languages and of
the word domains of variables, together with its basic role in checking the RWS
interpretability of the components of a fuzzy system, and it proves that the
standard fuzzy set algebras are not RWS-interpretable.
- It proposes a formalization method to solve the RWS interpretation of fuzzy
systems in the second case and with n input variables.
CHAPTER 1. BASIC KNOWLEDGE
1.1 Fuzzy set
Definition 1.1. [6] Let U be the universe of objects. A fuzzy set A on U is
the set of ordered pairs (x, A(x)), where A is a function from U to [0, 1] that
assigns to each element x of U a value A(x) reflecting the degree to which x
belongs to the fuzzy set A.
[4] C.H. Nguyen and W. Wechler (1990), "Hedge algebras: an algebraic approach to structures of sets
of linguistic domains of linguistic truth variables", Fuzzy Sets and Systems, vol. 35, no. 3, pp. 281-293.
[5] C.H. Nguyen and W. Wechler (1992), "Extended hedge algebras and their application to fuzzy
logic", Fuzzy Sets and Systems, vol. 52, pp. 259-281.
[6] L.A. Zadeh (1965), "Fuzzy sets", Information and Control, vol. 8, pp. 338-353.
If A(x) = 0, we say that x does not belong to A; if A(x) = 1, we say that x
fully belongs to A. The function A in Definition 1.1 is also called the
membership function.
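As an illustration of Definition 1.1, the following minimal sketch builds a
membership function in Python. The triangular shape, the helper name
`triangular` and the temperature values are assumptions chosen for the example;
they are not taken from the thesis.

```python
def triangular(a, b, c):
    """Return the membership function A: U -> [0, 1] of a triangular fuzzy set.

    The set has support (a, c) and reaches membership 1 at the peak b.
    Assumes a < b < c.
    """
    def membership(x):
        if x <= a or x >= c:
            return 0.0                  # x does not belong to A
        if x <= b:
            return (x - a) / (b - a)    # rising edge
        return (c - x) / (c - b)        # falling edge
    return membership


# Hypothetical fuzzy set "warm" on a temperature universe (degrees Celsius).
warm = triangular(15.0, 22.0, 30.0)
degrees = {t: warm(t) for t in (15.0, 18.5, 22.0, 26.0, 30.0)}
```

Here `warm(22.0)` is the only value equal to 1, the two end points have
membership 0, and every other temperature belongs to the set only to some
degree between 0 and 1.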
1.2 Linguistic variable
As Zadeh put it, a linguistic variable is a variable whose "values are words
or sentences in natural language or artificial language".
1.3 Fuzzy rule based system
1.3.1. The components of the fuzzy system
A fuzzy rule based system consists of the following main components: the
database, the fuzzy rule base (FRB) and the inference system.
- The database consists of the sets 𝔏j of linguistic labels Tj, each
corresponding to a fuzzy set; these fuzzy sets form a fuzzy partition of the
reference domain Uj ⊆ R (the set of real numbers) of the variable 𝔛j
(j = 1, ..., n+1) in a problem with n inputs and 1 output.
- The fuzzy rule base is a set of fuzzy if-then rules.
- The inference system performs approximate reasoning based on the rules and
the input values to produce the predicted output value. Some approaches to
approximate reasoning are:
+ approximate reasoning based on fuzzy relations;
+ approximate reasoning by linear interpolation on fuzzy sets;
+ reasoning based on rule firing.
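The three components can be sketched in a few lines of Python. This is only an
illustrative assumption of how a database, a rule base and a rule-firing
inference step fit together: the labels, the rule outputs, the use of min as
the conjunction and the weighted-average output are choices made for the
example and are not the method of the thesis.

```python
def tri(a, b, c):
    """Triangular membership function with support (a, c) and peak b (a < b < c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


# Hypothetical database: the same two labels on [0, 1] for both input variables.
LABELS = {"low": tri(-1.0, 0.0, 1.0), "high": tri(0.0, 1.0, 2.0)}

# Hypothetical rule base: IF x1 is <label> AND x2 is <label> THEN y = <value>.
RULES = {
    ("low", "low"): 0.0,
    ("low", "high"): 0.5,
    ("high", "low"): 0.5,
    ("high", "high"): 1.0,
}


def infer(x1, x2):
    """Fire every rule with strength min(mu1, mu2) and average the outputs."""
    fired = [
        (min(LABELS[l1](x1), LABELS[l2](x2)), y)
        for (l1, l2), y in RULES.items()
    ]
    total = sum(strength for strength, _ in fired)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(strength * y for strength, y in fired) / total


prediction = infer(0.25, 0.5)
```

For the input (0.25, 0.5) all four rules fire with different strengths, and the
prediction is the strength-weighted average of their outputs.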
1.3.2. Objectives when building an FRBS
Evaluation of the effectiveness (accuracy) of an FRBS
For the effectiveness objective, there are mathematical formulas that measure
how accurate an FRBS is.
The problem of interpretability of an FRBS
Interpretability is a complex and abstract problem that involves many factors.
In [1], Gacto et al. find that there are currently two main approaches to
interpretability:
- Interpretability based on complexity:
Rule base level: the fewer the rules in the rule system and the shorter each
rule, the more interpretable the system.
Fuzzy partition level: using fewer attributes (variables) increases the
interpretability of the rule system, and the number of membership functions
used in a fuzzy partition should not exceed 7±2 [6].
- Interpretability based on semantics:
Semantics at the rule base level: the rule base must be consistent, i.e. it
must not contain contradictory rules (rules with the same premise must have
the same conclusion), and the number of rules fired by an input should be as
small as possible.
Semantics at the fuzzy partition level (word level): the domain of each
variable must be completely covered by the membership functions of the fuzzy
sets.
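Two of the semantic criteria above, rule base consistency and coverage of the
variable domain, can be checked mechanically. The sketch below is a hedged
illustration: the function names, the rule encoding as (premise, conclusion)
pairs and the grid-based coverage test are assumptions made for the example,
and the grid test only approximates coverage of a continuous domain.

```python
def is_consistent(rules):
    """True if no two rules share a premise but differ in their conclusion."""
    seen = {}
    for premise, conclusion in rules:
        if seen.setdefault(premise, conclusion) != conclusion:
            return False
    return True


def covers_domain(memberships, lo, hi, steps=1000):
    """True if every grid point of [lo, hi] has positive membership in some set."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return all(any(mu(x) > 0.0 for mu in memberships) for x in grid)


# Hypothetical rule bases: the second one repeats a premise with another conclusion.
rules_ok = [(("low", "low"), "small"), (("low", "high"), "large")]
rules_bad = rules_ok + [(("low", "low"), "large")]

# Hypothetical partitions of [0, 1]: the second one leaves a gap in the middle.
full_partition = [lambda x: max(0.0, 1.0 - x), lambda x: max(0.0, x)]
gap_partition = [lambda x: max(0.0, 1.0 - 4.0 * x), lambda x: max(0.0, 4.0 * x - 3.0)]

consistency = (is_consistent(rules_ok), is_consistent(rules_bad))
coverage = (covers_domain(full_partition, 0.0, 1.0),
            covers_domain(gap_partition, 0.0, 1.0))
```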
1.4 Hedge algebra
1.4.1. The concept of hedge algebra
Definition 1.2 [7]: A hedge algebra is a structure of four components,
denoted by $AX = (X, G, H, \le)$, where G is a set of generators, H is a set of
hedges, and "$\le$" is a partial ordering relation on X. G is assumed to contain
the constants 0, 1 and W, which denote the smallest element, the largest element
and the neutral element of X, respectively. Each linguistic value $x \in X$ is
called a term of the HA.
If X and H are linearly ordered sets, then $AX = (X, G, H, \le)$ is said to be
a linear hedge algebra. If two critical hedges $\Sigma$ and $\Phi$ are added,
whose semantics are the supremum and the infimum of the set H(x) when they act
on x, we obtain a complete linear HA, denoted by
$AX^* = (X, G, H, \Sigma, \Phi, \le)$. Note that $h_n \ldots h_1 u$ is called a
canonical representation of a term x w.r.t. u if $x = h_n \ldots h_1 u$ and
$h_i \ldots h_1 u \ne h_{i-1} \ldots h_1 u$ for every integer $i \le n$. The
length of a term x, denoted by l(x), is the number of hedges in its canonical
representation w.r.t. a generator plus 1.
1.4.2. Some properties of linear hedge algebra
Theorem 1.1: [7] Let the sets $H^-$ and $H^+$ of a hedge algebra
$AX = (X, G, H, \le)$ be linearly ordered. Then the following statements hold:
i) For every $u \in X$, H(u) is a linearly ordered set.
ii) If X is a primarily generated hedge algebra and the set G of the primary
generators of X is linearly ordered, then so is the set H(G). Furthermore, if
$u < v$ and u, v are independent, i.e. $u \notin H(v)$ and $v \notin H(u)$, then
$H(u) \le H(v)$.
The theorem below gives a criterion for comparing two terms in the linguistic
domain of a variable X.
Theorem 1.2: [7] Let $x = h_n \ldots h_1 u$ and $y = k_m \ldots k_1 u$ be two
arbitrary canonical representations of x and y w.r.t. u. Then there exists an
index $j \le \min\{n, m\} + 1$ such that $h_{j'} = k_{j'}$ for all $j' < j$
(here, if $j = \min\{n, m\} + 1$, then either $h_j$ is the unit operator I,
$h_j = I$, for $j = n + 1 \le m$, or $k_j = I$ for $j = m + 1 \le n$) and
i) $x < y$ iff $h_j x_j < k_j x_j$, where $x_j = h_{j-1} \ldots h_1 u$;
ii) $x = y$ iff $m = n$ and $h_j x_j = k_j x_j$;
iii) x and y are not comparable iff $h_j x_j$ and $k_j x_j$ are not comparable.
[7] C.H. Nguyen and V.L. Nguyen (2007), "Fuzziness measure on complete hedge algebras and
quantifying semantics of terms in linear hedge algebras", Fuzzy Sets and Systems, vol. 158, pp. 452-471.
1.4.3. Fuzziness measure of linguistic values
Definition 1.3: [7] Let $AX^* = (X, G, H, \Sigma, \Phi, \le)$ be a linear
ComHA. A function $fm: X \to [0, 1]$ is said to be a fuzziness measure of terms
in X provided that:
(i) fm is complete, i.e. $fm(c^-) + fm(c^+) = 1$ and
$\sum_{h \in H} fm(hu) = fm(u)$ for all $u \in X$;
(ii) $fm(x) = 0$ for all x such that $H(x) = \{x\}$, and
$fm(0) = fm(W) = fm(1) = 0$;
(iii) for all $x, y \in X$ and $h \in H$,
$\frac{fm(hx)}{fm(x)} = \frac{fm(hy)}{fm(y)}$; that is, this proportion does not
depend on the particular elements and, hence, is called the fuzziness measure
of the hedge h and is denoted by $\mu(h)$.
We summarize some properties of the fuzziness measure of linguistic terms and
hedges in the following proposition:
Proposition 1.1: [7] Let fm and $\mu$ be defined as in Definition 1.3. Then:
(i) $fm(c^-) + fm(c^+) = 1$ and $\sum_{h \in H} fm(hx) = fm(x)$;
(ii) $\sum_{j=-q}^{-1} \mu(h_j) = \alpha$ and $\sum_{j=1}^{p} \mu(h_j) = \beta$,
where $\alpha, \beta > 0$ and $\alpha + \beta = 1$;
(iii) $\sum_{x \in X_k} fm(x) = 1$, where $X_k$ is the set of all terms in
$X = H(G)$ of length k;
(iv) $fm(hx) = \mu(h) \cdot fm(x)$, and $fm(\Sigma x) = fm(\Phi x) = 0$ for all
$x \in X$;
(v) Given $fm(c^-)$, $fm(c^+)$ and $\mu(h)$, $h \in H$, then for
$x = h_n \ldots h_1 c$, $c \in \{c^-, c^+\}$, one can easily compute fm(x) as
follows: $fm(x) = \mu(h_n) \cdots \mu(h_1) fm(c)$.
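Properties (iii) and (v) can be checked numerically. The following sketch
assumes a small hedge algebra with two hedges, "Little" and "Very", and two
generators, "low" and "high"; these words and the parameter values are
hypothetical and serve only to illustrate the product formula.

```python
from itertools import product
from math import isclose, prod

# Assumed fuzziness measures of the hedges; they must sum to 1.
MU = {"Little": 0.4, "Very": 0.6}
# Assumed fuzziness measures of the generators c- and c+; they must sum to 1.
FM_GEN = {"low": 0.5, "high": 0.5}


def fm(term):
    """fm(h_n ... h_1 c) = mu(h_n) * ... * mu(h_1) * fm(c), property (v).

    A term is a tuple of hedges followed by a generator,
    e.g. ("Very", "Little", "high").
    """
    *hedges, generator = term
    return prod(MU[h] for h in hedges) * FM_GEN[generator]


def terms_of_length(k):
    """All terms with k - 1 hedges applied to a generator (length k)."""
    return [hs + (c,) for hs in product(MU, repeat=k - 1) for c in FM_GEN]


# Property (iii): the fuzziness measures of all terms of length k sum to 1.
totals = {k: sum(fm(t) for t in terms_of_length(k)) for k in (1, 2, 3)}
assert all(isclose(total, 1.0) for total in totals.values())
```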
1.4.4. Fuzziness interval
Definition 1.4 [7]: The fuzziness interval of a term $x \in X$, denoted by
$\mathfrak{I}_{fm}(x)$, is a subinterval of [0, 1],
$\mathfrak{I}_{fm}(x) \in Itv([0, 1])$, whose length is equal to the fuzziness
measure of x, $|\mathfrak{I}_{fm}(x)| = fm(x)$.
1.4.5. Quantifying semantics of linguistic values.
Definition 1.5 [7]: Let $AX^* = (X, G, H, \le)$ be a linear HA. We define:
1) The function $sign(k, h) \in \{-1, 1\}$ is said to be the relative sign
function of k with respect to h if
$sign(k, h) = +1 \iff ((x \le hx \Rightarrow hx \le khx) \wedge (x \ge hx \Rightarrow hx \ge khx))$, and
$sign(k, h) = -1 \iff ((x \le hx \Rightarrow hx \ge khx \ge x) \wedge (x \ge hx \Rightarrow hx \le khx \le x))$.
2) The function $Sign: X \to \{-1, 0, 1\}$ is said to be the sign function of
words: if $x = h_n \ldots h_1 c$, $c \in G$, is a canonical representation, i.e.
$h_j h_{j-1} \ldots h_1 c \ne h_{j-1} \ldots h_1 c$ for every $j = 1, \ldots, n$,
with $h_0 = Id$ the identity, i.e. $h_0 c = c$, then
$Sign(x) = Sign(h_n h_{n-1} \ldots h_1 c) = sign(h_n, h_{n-1}) \times \cdots \times sign(h_2, h_1) \times sign(h_1) \times sign(c)$.
Based on the definition of the sign function, we have a criterion for comparing
hx and x.
Proposition 1.2 [7]. For any h and x, if Sign(hx) = +1 then hx > x; if
Sign(hx) = -1 then hx < x; and if Sign(hx) = 0 then hx = x.
From the above proposition we have:
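$0 \le H(x) \le 1$, and $H(x) \le H(y)$ means that $x' \le y'$ for all $x' \in H(x)$ and $y' \in H(y)$ (1.2)
$Sign(h_p x) = +1 \Rightarrow H(h_{-q} x) \le \ldots \le H(h_{-1} x) \le x \le H(h_1 x) \le \ldots \le H(h_p x)$ (1.3)
$Sign(h_p x) = -1 \Rightarrow H(h_{-q} x) \ge \ldots \ge H(h_{-1} x) \ge x \ge H(h_1 x) \ge \ldots \ge H(h_p x)$ (1.4)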
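The sign function of Definition 1.5 and the comparison rule of Proposition 1.2
can be sketched as follows. The hedges "Little" and "Very", the generators
"low" and "high" and the sign tables are assumptions made for this example
(a commonly used two-hedge configuration), not values given in the thesis; the
case Sign = 0 does not arise in this small sketch.

```python
# Assumed signs of the generators: c- = "low" is negative, c+ = "high" is positive.
GEN_SIGN = {"low": -1, "high": +1}
# Assumed signs of the hedges: "Very" strengthens a generator, "Little" weakens it.
HEDGE_SIGN = {"Little": -1, "Very": +1}
# Assumed relative signs sign(k, h): "Very" strengthens any hedge it is applied
# to, "Little" weakens it.
REL_SIGN = {
    ("Very", "Very"): +1,
    ("Very", "Little"): +1,
    ("Little", "Very"): -1,
    ("Little", "Little"): -1,
}


def sign_of_term(term):
    """Sign(h_n ... h_1 c) = sign(h_n, h_{n-1}) * ... * sign(h_2, h_1) * sign(h_1) * sign(c).

    A term is a tuple of hedges, outermost first, followed by a generator.
    """
    *hedges, generator = term
    sign = GEN_SIGN[generator]
    if hedges:
        sign *= HEDGE_SIGN[hedges[-1]]            # sign(h_1), the innermost hedge
        for outer, inner in zip(hedges, hedges[1:]):
            sign *= REL_SIGN[(outer, inner)]      # sign(h_{j+1}, h_j)
    return sign


def compare_with_base(hedge, term):
    """Proposition 1.2: '>' if hx > x and '<' if hx < x."""
    return ">" if sign_of_term((hedge,) + tuple(term)) > 0 else "<"


relation = compare_with_base("Very", ("Little", "high"))
```

With these tables, "Very Little high" has sign -1, so it lies below
"Little high", which matches the intuition that "Very" pushes "Little high"
further away from "high".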
Definition 1.6 [7]: Let AX be a free linear ComHA and fm be a fuzziness
measure on X. Then a mapping $\upsilon: X \to [0, 1]$ is said to be induced by
fm if it is defined recursively as follows:
(i) $\upsilon(W) = \theta = fm(c^-)$,
$\upsilon(c^-) = \theta - \alpha fm(c^-) = \beta fm(c^-)$,
$\upsilon(c^+) = \theta + \alpha fm(c^+)$;
(ii) $\upsilon(h_j x) = \upsilon(x) + Sign(h_j x) \left\{ \sum_{i=sign(j)}^{j} fm(h_i x) - \omega(h_j x) fm(h_j x) \right\}$, (1.5)
for all j with $-q \le j \le p$ and $j \ne 0$, where
$\omega(h_j x) = \frac{1}{2} \left[ 1 + Sign(h_j x) Sign(h_p h_j x) (\beta - \alpha) \right] \in \{\alpha, \beta\}$.
It has been proven that the mapping defined in this way satisfies the
requirements of a semantically quantifying mapping and that the values of the
terms of AX are dense in the interval [0, 1].
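The recursion of Definition 1.6 can be sketched in Python for a small hedge
algebra. The hedges "Little" ($h_{-1}$) and "Very" ($h_1$), the generators
"low" and "high", the sign tables and the parameter values are all assumptions
made for this example; the code follows formula (1.5) under those assumptions
and is not the implementation used in the thesis.

```python
NEG_HEDGES = ["Little"]                 # h_-1, ..., h_-q (assumed)
POS_HEDGES = ["Very"]                   # h_1, ..., h_p (assumed)
MU = {"Little": 0.4, "Very": 0.6}       # assumed hedge fuzziness measures
FM_GEN = {"low": 0.5, "high": 0.5}      # assumed fm(c-), fm(c+)
GEN_SIGN = {"low": -1, "high": +1}
HEDGE_SIGN = {"Little": -1, "Very": +1}
REL_SIGN = {("Very", "Very"): +1, ("Very", "Little"): +1,
            ("Little", "Very"): -1, ("Little", "Little"): -1}

ALPHA = sum(MU[h] for h in NEG_HEDGES)  # total measure of negative hedges
BETA = sum(MU[h] for h in POS_HEDGES)   # total measure of positive hedges
THETA = FM_GEN["low"]                   # upsilon(W) = fm(c-)


def fm(term):
    """fm(h_n ... h_1 c) = mu(h_n) ... mu(h_1) fm(c)."""
    *hedges, c = term
    value = FM_GEN[c]
    for h in hedges:
        value *= MU[h]
    return value


def sign(term):
    """Sign function of Definition 1.5 (hedges are listed outermost first)."""
    *hedges, c = term
    s = GEN_SIGN[c]
    if hedges:
        s *= HEDGE_SIGN[hedges[-1]]
        for outer, inner in zip(hedges, hedges[1:]):
            s *= REL_SIGN[(outer, inner)]
    return s


def sqm(term):
    """Quantitative semantics upsilon(term) following formula (1.5)."""
    *hedges, c = term
    if not hedges:                       # case (i): the generators
        if c == "low":
            return THETA - ALPHA * FM_GEN["low"]
        return THETA + ALPHA * FM_GEN["high"]
    h_j, x = hedges[0], tuple(hedges[1:]) + (c,)
    group = POS_HEDGES if h_j in POS_HEDGES else NEG_HEDGES
    # Sum of fm(h_i x) for i running from sign(j) to j.
    partial = sum(fm((h,) + x) for h in group[: group.index(h_j) + 1])
    h_p = POS_HEDGES[-1]
    omega = 0.5 * (1 + sign(term) * sign((h_p,) + tuple(term)) * (BETA - ALPHA))
    return sqm(x) + sign(term) * (partial - omega * fm(term))


words = [("Very", "low"), ("low",), ("Little", "low"),
         ("Little", "high"), ("high",), ("Very", "high")]
values = [sqm(w) for w in words]
```

The list `words` is written in the intended semantic order, and the computed
values increase along it, so the mapping preserves the order of these words.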
1.5 Conclusion of chapter 1
In this chapter, we summarized the basic knowledge that serves as a basis
for the research: fuzzy set theory, fuzzy rule based systems and their
applications, and the theory of hedge algebras.
CHAPTER 2. INTERPRETABILITY OF LINGUISTIC COGNITIVE
FRAMEWORK IN LINGUISTIC FUZZY SYSTEMS
In this chapter, we present a schema that solves the interpretability
problem of the computational representation of the linguistic frame of
cognition and propose additional semantic constraints on interpretation
mappings. We then survey the representation of the granular polymorphism
structure generated from the semantics of the word domain and show that these
representations satisfy the relevant constraints. The results of this chapter
are based on work [2] in the author's list of scientific works related to the
thesis.
2.1. The interpretability of LRBSs on the word level
Nguyen and colleagues [8] proposed a new approach to the interpretability of
LRBSs (linguistic rule base systems), which leads to the investigation of the
order-based semantics of the LRBS components. The basis of the new approach is
that the word-domain of a variable 𝒳, denoted by Dom(𝒳), is modeled by an
order-based structure induced by the inherent meaning of the words, called a
hedge algebra (HA).
[8] C.H. Nguyen, V.Th. Hoang, V.L. Nguyen (2015), "A discussion on interpretability of linguistic
rule base systems and its application to solve regression problems", Knowledge-Based Syst., vol. 88,
pp. 107-133.
The essence of computational interpretation is that the semantics of words,
which cannot be computed with directly, must be converted into computable
objects, and this transformation must "preserve the semantics" of the words.
This requires us to investigate and propose the necessary constraints on
semantic interpretation.
We use the concept of the LFoCs of variables, understood as the word
vocabularies used to describe real-world entities. Studying the
interpretability of a comput-representation of an LFoC therefore amounts to
examining how much of the semantic information of the words of the LFoC a
desired interpretation can convey or represent.
2.1.1. Schema for solving the problem of interpretability of the
computational representation of a linguistic frame of cognition
For ease of understanding, we first schematize the process of solving the
interpretability of the comput-representation of the LFoCs of LRBSs, as
represented in Fig. 2.1, in which I1 is an interpretation that assigns an
appropriate HA-element of 𝒜𝒳 to every word and I2 assigns an object of a
comput-structure 𝔖 to each HA-element of 𝒜𝒳.
2.1.2. General constraints on the computational interpretation of the
words of variables
The authors of [8] proposed initial constraints on the interpretations
described in Figure 2.1, so that the semantics of an LFoC is maintained in the
context of the entire word domain, instead of constraints imposed only on fuzzy
sets.
Constraint 2.1 [8] (Essential role of the word): The inherent semantics of
the words of a variable appearing in an f-rule base (FRB) must, in principle,
be explicitly taken into account or must form a formalized basis for
determining the comput-semantics of the words, including the fuzzy-set-based
semantics, in order to handle the comput-semantics of the FRB.
Figure 2.1. A schema of a computational interpretation I of an LFoC, with I = I2 ∘ I1.
[Figure: three blocks linked by the interpretations I1 and I2.
Block 1, syntactical expressions of an LFoC and its formal properties (the low
level, i.e. the word level): words (syntactical strings); the formalized LFoC
(a set of formalized words) and its relationship structure (semantic
order-based relation of words, generality-specificity relation, etc.).
Block 2, the HA AX modeling the word-domain D containing the LFoC:
HA-expressions (string representations of the words in D); LFoCs and their
relationship structure.
Block 3, the desired computational objects of a comput. math. structure
(number, fuzzy set, interval, ...): the objects of the comput. structure CS and
the relationships between them; the set of comput-objects representing the
LFoC.]
Constraint 2.2 [8] (Formalization of word quantification): The
comput-semantics of words, including f-set semantics, should be produced on the
basis of an adequate formalization of the word-domains of the variables.
Moreover, it should be possible to produce them by a procedure, developed from
this formalization, that computes the semantics of words automatically.
Constraint 2.3 [8] (Interval-interpretation of the words and G-S relation):
Given a variable 𝒳 whose word-domain is Dom(𝒳), denote by Intv the set of all
intervals of U(𝒳). An interval-interpretation 𝒜: Dom(𝒳) → Intv, declared to be
an interval-semantics of 𝒳, should preserve the generality-specificity (G-S)
relationships between the words, i.e. for any two words x and hx of 𝒳, where h
is a hedge, we should have 𝒜(hx) ⊆ 𝒜(x).
Constraint 2.4 [8] (Interpretation as order isomorphism): To study the
order-based semantics of ling-rules, the comput-interpretation of the words of
𝒳, ℑ: Dom(𝒳) → C(𝒳), must preserve the word semantics, i.e. ∀x, y ∈ Dom(𝒳),
x ≠ y & x ≤ y ⇒ ℑ(x) ≠ ℑ(y) & ℑ(x) ≼ ℑ(y), where ≼ is an order relation on
ℑ(Dom(𝒳)). That is, ℑ should be an order isomorphism.
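Constraints 2.3 and 2.4 can be turned into simple checks on a candidate
interpretation. The sketch below assumes a linearly ordered list of words, a
numeric interpretation and interval end points that are purely illustrative;
the function names and all values are hypothetical.

```python
def preserves_order(words_in_order, interpretation):
    """Constraint 2.4 for a linearly ordered word list.

    `words_in_order` lists distinct words in increasing semantic order; the
    numeric interpretation must then be strictly increasing along the list.
    """
    values = [interpretation[w] for w in words_in_order]
    return all(a < b for a, b in zip(values, values[1:]))


def preserves_gs(intervals, hedged_pairs):
    """Constraint 2.3: the interval of hx must lie inside the interval of x."""
    return all(
        intervals[x][0] <= intervals[hx][0] and intervals[hx][1] <= intervals[x][1]
        for hx, x in hedged_pairs
    )


# Hypothetical word-domain fragment and interpretations.
order = ["Very low", "low", "Little low", "Little high", "high", "Very high"]
numeric = {"Very low": 0.18, "low": 0.30, "Little low": 0.38,
           "Little high": 0.62, "high": 0.70, "Very high": 0.82}
intervals = {"high": (0.5, 1.0), "Little high": (0.5, 0.7), "Very high": (0.7, 1.0)}
pairs = [("Little high", "high"), ("Very high", "high")]

order_ok = preserves_order(order, numeric)
gs_ok = preserves_gs(intervals, pairs)
```

An interpretation that swapped the values of "low" and "Little low" would fail
the first check, and an interval for "Very high" that extended below 0.5 would
fail the second.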
2.1.3. Additional constraints on the computational representations of
linguistic frames of cognition
To study LRBS interpretability at the low level, we propose the following
additional constraint on the semantic core of the words of the LFoCs used for
the designed LRBSs.
Definition 2.1. An LFoC 𝔉 of a variable 𝒳 (in a u