2007 International Conference on Computational Intelligence and Security Workshops
Knowledge Mining for Web Business Intelligence platform and its sequence knowledge model
SHEN Jie Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
WEI Liuhua Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
HE Kun Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
XU Fayan Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
BI Lei Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
SUN Rongshuang Yangzhou University, Yangzhou City, Jiangsu Province, China
[email protected]
Abstract Ever-changing market information makes traditional ways of collecting and using information unfit for enterprises' business requirements. This paper puts forward the Knowledge Mining for Web Business Intelligence (KB4WBI) platform, which realizes online Web knowledge acquisition and semantic knowledge management. Since Web business information is evidently time-sensitive and context-related, great emphasis is placed on a Web sequence knowledge representation model based on ontology evolution. Compared with current methods, the platform considers both the real-time characteristics and the semantic attributes of Web knowledge, improves knowledge precision and utility, and lays the basis for Web business intelligence.
1. Introduction
The open Web environment has gradually evolved into a vast knowledge storehouse that contains all sorts of information, and it has become an important knowledge source for people to get information. For enterprises, using the Internet effectively to obtain, spread and process information, and to make it support decision making, will improve their operation mode and management pattern and assist activities such as product sales, customer relationship management and supply chain management. Research on and construction of Web knowledge management will help enterprises pay close attention to changes in markets at home and abroad and grasp market information quickly so as to adjust their strategy and tactics accordingly. At present, people working in this area have fully realized the importance of Web information collection in business, and much research has been done on information collection and classification, such as information retrieval and intelligence collection. Faced with the extremely rich business information of various trades that the Web contains, current research is mainly dedicated to processing Web information and transforming it into valuable
Fund project: National Natural Science Fund (60673060); Natural Science Fund of Jiangsu Province (BK2005046).
0-7695-3073-7/07 $25.00 © 2007 IEEE DOI 10.1109/CIS.Workshops.2007.137
Authorized licensed use limited to: IEEE Xplore. Downloaded on February 23,2011 at 08:14:37 UTC from IEEE Xplore. Restrictions apply.
knowledge in specialized domains, providing a more reliable guarantee for effective decision making and for decision analysis that uses large amounts of structured and unstructured Web information. However, because the Web environment is open and dynamic, the information it carries is strongly time-sensitive and context-related. It is therefore necessary to provide means of right-time Web information acquisition and knowledge mining, as well as a knowledge representation model that contains time attributes and context information. From the angle of knowledge application, formal semantics is the premise for realizing an inference mechanism. The inference engine, based on the knowledge warehouse, makes implied knowledge explicit. Web-based information acquisition and knowledge mining are the basic patterns of business information collection; semantics-based knowledge organization, including knowledge editing, knowledge processing, knowledge storage, knowledge organization and knowledge inference, is the fundamental guarantee for maximizing knowledge utility; and the use of knowledge is the ultimate goal of knowledge management, which also realizes right-time intelligent e-commerce and decision support.
This paper puts forward the KB4WBI platform to address the problem that current BI systems cannot fully exploit potentially valuable commercial information. The paper elaborates on the structure and functional modules of the platform. The knowledge maintenance problem of the KB4WBI platform is addressed with respect to time, which also ushers in the problem of ontology evolution. A specific evolution scheme is put forward: a knowledge representation model based on the concept of time, together with the related theory of time.
2. Related work
Knowledge management is, as the name suggests, people's systematic and effective management of knowledge resources for the purpose of sharing knowledge, creating new knowledge and adding value to knowledge. The essence of knowledge management lies in sharing implied and explicit knowledge and in the interaction between them, whereby it achieves the goal of knowledge creation. American scholars such as Peter Drucker [1], Paul Strassmann [2] and Peter Senge [3] built a set of theories related to knowledge management, and Chris Argyris [4], Christopher Bartlett [5] and Dorothy Leonard-Barton [6] at Harvard University made significant advances in testing different approaches to knowledge management. Knapp, Coopers and Lybrand [7] put forward the "Big Six" knowledge management structure; there were also "the knowledge management structure of Harvard" by the Harvard professors Hansen, Nohria and Tierney, Arthur Andersen's knowledge management structure, Microsoft's knowledge management structure and the Earl structure. The aim of knowledge management is to create, accumulate and apply knowledge so that new life and value come out of it. But as Bill Gates warned in his Digital Nerve, knowledge management is a tool rather than an end. Its ultimate aim is to improve the wisdom of the organization or the intelligence of the enterprise.
The concept of business intelligence (BI) [8] was coined by Gartner Group in 1996. Business intelligence is a solution that integrates information collection, consolidation, analysis and access, and it covers a wide spectrum of areas. It builds on information management tools such as ERP. It is an intelligent management tool based on information technology infrastructure: it can perform all kinds of real-time analysis of the enterprise data provided by management tools such as ERP, CRM and SCM, and produce reports that help managers get a clear picture of the current situation of the enterprise and the market so as to make the right decisions.
As for products that implement business intelligence technology, software vendors such as IBM, Oracle, Cognos, SAS, NCR and Brio have launched relevant products through in-house R&D or acquisition. However, current BI systems can only analyze and process an enterprise's internal information, and the corresponding Web information mining technology is limited to analyzing user visit patterns and page links and to page search. As a result, potentially valuable commercial information is poorly used. Some business systems with Web analysis functions, SAS for example, can only process data such as the text of customers' e-mail, so their use is also limited to customer relationship management (CRM).
3. The basic structure of KB4WBI platform
With the rapid development of the Internet, the data and information on Web pages increase at an amazing speed. Here we put forward the Knowledge Mining for Web Business Intelligence (KB4WBI) platform, which aims to extract useful knowledge and information more effectively from the huge information resource of Web pages in order to build a knowledge base, whose content is submitted to the user after a series of analysis and processing steps. KB4WBI is composed of three major function modules: the Web data source module, the knowledge base module and the user module. The Web data source includes text data, video, file data, database data, Web page data, image data and e-mail data on the Web, and so on. The knowledge base module is created from the Web data source, which is processed by the Web Spider, Textual Analysis and MAS, which is provided by the user module. Domain experts and knowledge engineers further create the TBox submodule within the knowledge base module. The TBox is a collection of terms, that is, knowledge expressed at the semantic and ontological level. It describes the general attributes of concepts and relations. According to the inclusion relations among them, a large number of concepts form the hierarchical structure of the TBox. The ABox is the instantiation of the TBox (the process of filling the ABox with values): a set of axioms that describe specific individuals, including concept assertions and role assertions. The ABox maps an individual to a concept in the TBox; this mapping from an individual to a term is precisely the reflection of ontology interpretation. We also handle the knowledge base module with description logics and ontology, and with the calculus of communicating systems (CCS). Description logics define sound semantics and support three activities: ontology design, by testing the consistency of classes (especially for complicated ontologies or ontologies designed by many people); ontology integration, by expressing the relations between ontologies and building complete class hierarchies; and ontology deployment, by testing whether a set of facts is consistent with the ontology and answering queries about the ontology. The calculus of communicating systems (CCS) adds dynamic knowledge to the knowledge base.
The last module is the user module, which includes the following six submodules: knowledge design and recommendation, expert system, knowledge service, case-based inference, decision support system and knowledge inference. The knowledge design and recommendation submodule infers on the basis of the user module, and the knowledge service submodule can share knowledge over the network through the Simple Object Access Protocol (SOAP) or Universal Description, Discovery and Integration (UDDI). Finally, users can access the decision support system submodule through a certain interface.
The KB4WBI platform is dedicated to the knowledge maintenance problem. For example, each item of knowledge has a time attribute. The KB4WBI platform supports right-time service, which means that knowledge has to be modified constantly. When the platform is applied to databases of various schemas, or to new tasks or domains, or when one knowledge representation language is translated into another, the ontology has to undergo a series of evolutions. Ontology evolution is precisely the self-adaptive alteration made in response to such changes in order to keep the ontology consistent.
Figure 1. The basic structure of the KB4WBI platform
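The TBox/ABox split described in this section can be illustrated with a minimal sketch. It is not the platform's implementation: the class names `TBox` and `ABox`, their methods, and the example concepts, role and individuals (`Product`, `Laptop`, `soldBy`, `item42`, `vendorA`) are hypothetical and chosen only for illustration. The sketch assumes that the TBox stores concept inclusion links and that the ABox stores concept and role assertions about individuals.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TBox:
    """Terminology: concept names and their direct inclusion links (hypothetical)."""
    parents: dict[str, set[str]] = field(default_factory=dict)

    def add_concept(self, name: str, *supers: str) -> None:
        # Register the concept and every concept it is directly included in.
        self.parents.setdefault(name, set()).update(supers)
        for sup in supers:
            self.parents.setdefault(sup, set())

    def subsumers(self, name: str) -> set[str]:
        """Return `name` plus every concept above it in the hierarchy."""
        seen: set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen


@dataclass
class ABox:
    """Assertions: individuals mapped to TBox concepts, plus role assertions."""
    tbox: TBox
    concept_assertions: set[tuple[str, str]] = field(default_factory=set)
    role_assertions: set[tuple[str, str, str]] = field(default_factory=set)

    def assert_concept(self, individual: str, concept: str) -> None:
        # An individual can only be mapped to a concept the TBox defines.
        if concept not in self.tbox.parents:
            raise KeyError(f"unknown concept: {concept}")
        self.concept_assertions.add((individual, concept))

    def assert_role(self, role: str, subject: str, obj: str) -> None:
        self.role_assertions.add((role, subject, obj))

    def is_instance(self, individual: str, concept: str) -> bool:
        # True if some asserted concept of the individual is included in `concept`.
        return any(
            ind == individual and concept in self.tbox.subsumers(asserted)
            for ind, asserted in self.concept_assertions
        )


tbox = TBox()
tbox.add_concept("Product")
tbox.add_concept("Laptop", "Product")

abox = ABox(tbox)
abox.assert_concept("item42", "Laptop")
abox.assert_role("soldBy", "item42", "vendorA")
print(abox.is_instance("item42", "Product"))  # True, via Laptop included in Product
```

Keeping the hierarchy in the TBox and the individuals in the ABox means that an instance check only needs the asserted concept and the inclusion links, which mirrors the mapping from an individual to a term described above.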
4. Dynamic Knowledge Representation Model Based on Ontology Evolution
The ontology management of the KB4WBI platform is hierarchical. Monotonicity, the most primitive and fundamental property, plays a major role in the whole ontology management system. First of all, the versions of an ontology are divided into two kinds: the main version and the minor versions. The main version corresponds to a conceptual ontology, while the minor versions correspond to further refinement and improvement of that conceptual ontology. The development of an ontology can therefore start from the main version and end with the minor ones. This main/minor model exploits monotonicity: it guarantees that existing properties cannot be removed, and it supports the gradual evolution of the ontology conceptualization. We use meta-ontologies to store the differences between the main ontology and the minor ontologies and to track the evolution of the ontologies. We denote the main ontology as O, the sequence of evolution requests (ER) as {R1, R2, ..., Rn}, the sequence of evolution functions (EF) as {F1(O), F2(O), ..., Fn(O)}, the sequence of meta-ontologies (MO) as {Metaonto1, Metaonto2, ..., Metaonton}, and the sequence of minor ontologies as {O1', O2', ..., On'}. The evolution course is shown in Figure 2.
Figure 2. Ontology evolution
As a general rule, an ontology includes a set of class or concept definitions, property definitions and related axioms. Classes, properties and axioms are interrelated and form a model of part of the world. A change constitutes a new version of the ontology and therefore creates a version relation between the concept and property definitions in the original and new ontology versions. We denote an ontology alteration as a tuple consisting of name, the name of the alteration; the alteration parameters, which are combinations of ontology concepts, attributes, relations and relation types; and a precondition, which must be met before the alteration can be executed. We describe the ontology in terms of classes, so that implementation details are hidden and the development of ontology evolution is facilitated. The relations between ontology concepts can then be transformed into relations between classes. In the ontology management system, a concept of one ontology may depend on a concept defined in another ontology in two evident ways: through an inheritance relation or through a reference relation. Ontology evolution is labeled by time: the addition of evolution attributes is denoted by the time property of the concept.
The open and dynamic characteristics of the Web environment determine the time sensitivity and context-related nature of Web knowledge. KB4WBI adopts ontology as its semantic support, so the context attribute can be incorporated into the definition of a concept and handled as a general attribute of that definition. Therefore, the general form of any Term_i can be refined as follows:
$$\mathrm{Term}_i \equiv t_j \wedge A_1 \wedge A_2 \wedge \cdots \wedge A_n \wedge \forall R_1.C_1 \wedge \forall R_2.C_2 \wedge \cdots \wedge \forall R_m.C_m$$
where $A_1, \ldots, A_n$ and $C_1, \ldots, C_m$ are atomic concepts, $R_1, \ldots, R_m$ are atomic roles, and $t_j$ is one moment in one-dimensional time. This constitutes the knowledge representation model with the time label.
Introduce the atomic concept time(t1) and the binary relation before(t1,t2). Define the concepts related to time as below:
BeforeEq(t1,t2) ≡ time(t1) and time(t2) and (before(t1,t2) or t1=t2)
TimeBetween(t,t1,t2) ≡ time(t1) and time(t2) and time(t) and before(t1,t) and before(t,t2)
TimeBetweenEq(t,t1,t2) ≡ time(t1) and time(t2) and time(t) and BeforeEq(t1,t) and BeforeEq(t,t2)
Introduce the axioms of time as below:
Axiom 1 before(t1,t2) => time(t1) and time(t2)
Axiom 2 time(t1) and time(t2) => t1=t2 or before(t1,t2) or before(t2,t1)
Axiom 3 before(t1,t2) and true => false
Product Development [J]. Strategic Management Journal, 1992(12): 111~125
Axiom 4 before(t1,t2) and before(t2,t3) => before(t1,t3)
The definitions of concepts and axioms, based on description logics, lay the basis for formal inference. A formal inference engine can be built on these definitions and axioms. It makes implied knowledge explicit, mainly through satisfiability testing and the tableau algorithm, and detects knowledge inconsistency.
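The time predicates and axioms of this section can be exercised with a small sketch. It rests on several assumptions that are not in the paper: time points are modeled as `datetime.date` values, before is taken to be strict date order, and Axiom 3 is read as asymmetry (before(t1,t2) and before(t2,t1) cannot both hold). All function names are hypothetical, and the sketch is a brute-force check over sample points, not the tableau-based inference engine the paper plans.

```python
from __future__ import annotations

from datetime import date
from itertools import product


def is_time(t: object) -> bool:
    """time(t): in this sketch a time point is exactly a datetime.date (assumption)."""
    return type(t) is date


def before(t1: object, t2: object) -> bool:
    """before(t1, t2): strict precedence, defined only between time points."""
    return is_time(t1) and is_time(t2) and t1 < t2


def before_eq(t1: object, t2: object) -> bool:
    """BeforeEq(t1, t2): t1 precedes t2 or is the same moment."""
    return is_time(t1) and is_time(t2) and (before(t1, t2) or t1 == t2)


def time_between(t: object, t1: object, t2: object) -> bool:
    """TimeBetween(t, t1, t2): t lies strictly between t1 and t2."""
    return is_time(t) and before(t1, t) and before(t, t2)


def time_between_eq(t: object, t1: object, t2: object) -> bool:
    """TimeBetweenEq(t, t1, t2): t lies between t1 and t2, endpoints included."""
    return is_time(t) and before_eq(t1, t) and before_eq(t, t2)


def check_axioms(points: list) -> bool:
    """Brute-force check of the time axioms over a finite sample of points."""
    for a, b in product(points, repeat=2):
        # Axiom 1: before holds only between time points.
        if before(a, b) and not (is_time(a) and is_time(b)):
            return False
        # Axiom 2: any two time points are equal or ordered one way or the other.
        if is_time(a) and is_time(b) and not (a == b or before(a, b) or before(b, a)):
            return False
        # Axiom 3, under the assumed asymmetry reading.
        if before(a, b) and before(b, a):
            return False
    for a, b, c in product(points, repeat=3):
        # Axiom 4: transitivity.
        if before(a, b) and before(b, c) and not before(a, c):
            return False
    return True


sample = [date(2007, 1, 1), date(2007, 6, 1), date(2007, 12, 15)]
print(check_axioms(sample))                             # True
print(time_between(sample[1], sample[0], sample[2]))    # True
print(time_between(sample[0], sample[0], sample[2]))    # False: endpoint excluded
print(time_between_eq(sample[0], sample[0], sample[2])) # True: endpoint included
```

Under these assumptions the four axioms make before a strict total order on time points, which is what allows a time label on a concept to be compared against an interval with TimeBetween or TimeBetweenEq.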
[7] Coopers & Lybrand. The reinvention of the corporate information model. Professional Communication, IEEE, 2002, Vol. 39, No. 1, March 1996.
[8] Daniel S. Soper. A Framework for Automated Web Business Intelligence Systems. In: Proceedings of the 38th Hawaii International Conference on System Sciences, IEEE, 2005.
5. Conclusion
Traditional information collection and the methods of using it cannot meet the requirements of enterprise business. To resolve this problem, this paper puts forward the Knowledge Mining for Web Business Intelligence (KB4WBI) platform, with emphasis on two problems: online Web knowledge acquisition and the management of semantic knowledge. Because Web commercial information is evidently time-sensitive and context-related, the paper also puts forward a Web knowledge representation model based on ontology evolution, and defines the atomic concepts, relations and axioms of the time ontology. Future work will focus on constructing the formal inference engine, making knowledge in the database explicit, and testing the consistency of knowledge with the time concept.
6. References
[1] Peter Drucker. The Essential Drucker: The Best of Sixty Years of Peter Drucker's Essential Writings on Management [M]. Harper Business, 2003.
[2] Paul Strassmann. What's the key to implementing knowledge management. Knowledge Management Magazine, April 1999.
[3] Peter M. Senge. The Leader's New Work: Building Learning Organizations [J]. Sloan Management Review, Vol. 32, Fall 1990(1): pp. 7~23.
[4] Chris Argyris. Teaching Smart People How to Learn [J]. Harvard Business Review, 1991, (May-June): 99-109.
[5] Ning Gu, Guowen Wu, et al. Extracting Web Table Information in Cooperative Learning Activities Based on Abstract Semantic Model [J]. Proceedings of the Sixth International Conference on Computer Supported Cooperative Work in Design, July 12-14, 2001, pp. 492-497.
[6] Dorothy Leonard-Barton. Core Capabilities and Core Rigidities: A Paradox in Managing New