A Light-Weight Web Application Model Based on Semantic Wiki/version


A Light-Weight Web Application Model Based on Semantic Wiki (Working Draft, May 17, 2009 21:08)

Jie Bao, Li Ding, Rui Huang
Tetherless World Constellation, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
{baojie,dingl,huangr3}@cs.rpi.edu

Paul R. Smart
School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom.
ps02v@ecs.soton.ac.uk

Dave Braines, Gareth Jones
Emerging Technology Services, IBM United Kingdom Ltd, Hursley Park, Winchester, Hampshire, SO21 2JN, United Kingdom.
dave_braines@uk.ibm.com, garethj@uk.ibm.com

Abstract—Wiki systems owe their popularity to how easily they let users collectively update an information repository stored as wiki pages. Recent semantic wikis add, on top of the easy data publishing of conventional wikis, the ability to add and query simple semantic annotations (e.g., categories and typed links). This makes semantic wikis well suited to light-weight data modeling, processing and presentation tasks. We argue that these merits enable semantic wikis to support a novel client-transparent, light-weight web application development model wherein users can cooperatively contribute to the representation, manipulation and visualization of shared information. In this paper, we describe a generic framework for such a web application model, and present two proof-of-concept prototypes, namely RPI Map and CNL (Controlled Natural Language) Wiki, based on Semantic MediaWiki (SMW).

Index Terms—Semantic Wiki, Semantic Web, Web Application Model

I. INTRODUCTION

The success of social web applications (often called “web 2.0”¹) is exhibited by the active and massive content publishing on the Web by networked Web users.
Unlike the first generation of web applications, where content is often static and provided solely by website owners, social web applications share a few characteristics that enable an easier yet more powerful data publishing process, such as:

• Simple publishing: a user can publish content in the browser via simple interfaces such as forms or a WYSIWYG editor. The user is not required to have advanced knowledge of server configuration or of a content publishing language (e.g., HTML).
• Social interaction: content publishing is no longer one person’s own business: users can collaboratively compose the same Wikipedia article, or share their status and opinions with their friends on, e.g., Twitter and Facebook.

¹ “Web 2.0” may also refer to other aspects of the second generation of web development and design. In this paper, we use it to refer to the social aspects of web applications.

Web-based content publishing is promoted by the network effect, where the value of a service provided by a user rises sharply as more people benefit from the service [9]. In particular, wiki systems have proven to be a family of highly successful social web applications. “The Wiki Way”, as described in [14], centers on a philosophy that content (as wiki pages) should be collaboratively written using simple markup languages in Web browsers, and that an ongoing process of creation and collaboration is expected to improve the quality and quantity of the content. Prominent wiki sites like Wikipedia exhibit the characteristics of wikis as an ideal form of content management system.

Social web applications (and in particular wikis) often support associating simple metadata with user-submitted content, such as tags and RSS feeds. Such metadata have been shown to be useful in many circumstances. However, social web applications generally do not provide means to preserve sophisticated structure or semantics of the content.
That leads to a bottleneck in building a web ecosystem of data: the success of data publishing on the social web is not matched by equally effective means for data consumption (e.g., to search, customize, filter and reuse data, and to infer new facts from known facts). In the hope of overcoming such limitations, there have been increasing efforts to enhance social web applications with semantic web technologies. Those efforts aim to combine the strengths of the two approaches: the data flexibility and portability offered by the semantic web, and the scalability and authorship advantages of the social web. In particular, this emerging trend ushers in semantic wiki systems, e.g., Semantic MediaWiki (SMW) [12], IkeWiki [16] and SweetWiki [4], that extend conventional wikis with the ability to add simple semantic annotations, such as categories and typed links, to wiki content. As a consequence, semantic wikis enable content in a wiki to be structured and accessed using simple query languages.

Fig. 1. A comparison of several Web application models (Conventional Model, AJAX Model, Wiki-based Model and SemWiki-based Model).

The ability to annotate and query data allows semantic wikis to offer better means for both data publishing and data consumption. In particular, semantic wikis are capable of performing some light-weight data modeling, computation and presentation tasks that are traditionally only enabled by server-side programs.
Thus, these merits enable semantic wikis to support a new web development model that promises to break the data consumption bottleneck for many light-weight application scenarios. At a high level, such a development paradigm possesses several characteristics that extend those of the social web:

• Rich data modeling: semantic annotation allows content in an application to be structured and explicit about its semantics; in that sense, content on the web is not only documents but also data, and semantic wikis can act as a light-weight semantic database for the modeling, storage and retrieval of those data.
• In-browser data processing: users can contribute not only data, but also data processing scripts written in a simple language in the browser; this helps improve the transparency and flexibility of application development.
• Social interaction: as data modeling and processing are enabled in the browser, participation in improving the data as well as the application itself is open to the user community.
• Lower learning curve: just as wiki script simplifies web page authoring, semantic wiki script simplifies many data modeling and processing tasks and thus frees developers, in many light-weight yet non-trivial scenarios, from the burden of mastering the whole set of (semantic) web development tools and languages.

In this paper, we describe a generic framework for such a web application model, and present two proof-of-concept prototypes, namely RPI Map and CNL (controlled natural language) Wiki, based on Semantic MediaWiki (SMW), to demonstrate our approach. The contributions of the paper include:

• A generic light-weight web application development model that possesses the aforementioned characteristics (Section II);
• Working prototypes that embody the proposed model using the SMW platform (Sections III and IV).
In particular, we show that templates in SMW are useful in performing several common data modeling and processing tasks; we also demonstrate the ability of SMW in data mash-up.

While our demonstration is limited to SMW-based implementations, the proposed model is not necessarily limited to SMW or wiki-based implementations. We believe that, as more and more web 2.0 applications are provided with semantic extensions (e.g., Drupal), many results discussed in this paper may be extended to other platforms as well, to embrace stronger social interaction, semantic-awareness and flexibility in web application development.

II. SEMANTIC WIKI BASED WEB APPLICATION MODEL

In this section, we introduce a semantic wiki based Web application model in the broader context of Web application model evolution. A unique feature of this model is collective scripting on the social web, i.e., users can collectively (and often collaboratively) contribute content as well as scripts that consume (e.g., search, filter, aggregate) that content in the browser. In particular, we choose SMW to exemplify how this model can be used in collaborative, light-weight web application development.

A. Comparison of Web Application Models

The advance of Web technology drives the evolution of Web application models. Starting from just being able to browse web pages, Web users are now able to control content publishing on the Web with the help of Web 2.0 technologies. Wikis, blogs (e.g., Drupal and WordPress) and similar online content management systems further provide an extensible computing infrastructure that supports inline scripts and/or customizable plugins to facilitate collaborative web application development. Recent advances in the social semantic web, such as SMW, now enable users to collaboratively control structured data management. In Figure 1, we compare several web application models, including classical models and the emerging models.
In the Conventional Model, a Web application, e.g., an online shopping system, is composed of three clearly separated major components, namely the web browser, the web server, and the backend database and/or file system. Users of such an application are often given only limited control over its content from the web browser, such as browsing or search. The representation, computation and presentation components are primarily hosted on the server side and are controlled by webmasters only.

Other models have extended the Conventional Model with extra client-side control of data or computation. The AJAX Model [6], which adds the AJAX engine as a mediator between the browser and the server, is gaining popularity due to its powerful client-side computing ability, which improves user experience in both data transfer (e.g., asynchronous data retrieval from the server without interfering with page display) and data presentation. For example, a powerful word processing system (e.g., Google Docs) can be built within a browser while the data is actually stored on the Web. It is notable that users can now install client-side plugins in their web browsers to better process online data and interact with web services².

The Wiki-based Model enables end users to directly control some data content and presentation on the server side. For example, many Wikipedia articles are collaboratively maintained by users, and complex wiki templates are frequently used to enable advanced presentation (e.g., to render a calendar). A user may also call extensions of a wiki platform (e.g., “parser functions” in MediaWiki, the system used by Wikipedia) to perform certain computation tasks such as string processing, mathematical computation and visualization. It is also notable that a wiki page may embed certain external script languages (e.g., JavaScript) for advanced tasks.
In addition to the AJAX and the Wiki-based models, which increase the user’s control over data processing, the SemWiki (Semantic Wiki)-based model grants users additional control over structured data management and consumption. For example, in Wikipedia it is not yet possible to assert a structured annotation for a person’s page (even though an infobox template on the page may already store some structured data for human reading), or to execute a query such as “all European countries that have female government leaders”. Semantic wikis address those limitations by extending wikis with the ability to add and query structured annotations using a relatively simple modeling script (as compared to ontology languages such as RDF or OWL) and query language (as compared to semantic web query languages such as SPARQL). As a result, users are now equipped with increasing power to control data in the application. In particular, the SemWiki model enables a relatively comprehensive in-browser scripting environment, such that light-weight Web applications can be built collectively with maximal transparency of computational logic (all computational scripts are included in wiki pages) and minimal or even no required knowledge of web server configuration. In what follows, we elaborate on the components of the SemWiki-based model and several design patterns of this model.

² A famous example of such a client-side script enabler is the Firefox extension “GreaseMonkey” (https://addons.mozilla.org/en-US/firefox/addon/748).

B. Data Modeling

While most Content Management Systems (CMS) store content either in relational databases or in file systems, semantic wikis are often built upon RDF triple stores for storing structured data³. Thus, data in a semantic wiki does not always need to be stored with a pre-defined schema, as an RDBMS would require (although it is also possible to do so). That conforms to the open nature of the Web and enables significant flexibility in data modeling.
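The schema-free storage described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not reflect how SMW or any particular triple store is implemented; the class `TripleStore`, its methods, and the sample pages and properties are all hypothetical.

```python
class TripleStore:
    """Toy in-memory triple store (hypothetical; not SMW's storage layer)."""

    def __init__(self):
        # Maps a subject (e.g. a wiki page title) to its (property, value) pairs.
        self._by_subject = {}

    def add(self, subject, prop, value):
        """Assert one triple; no schema has to be declared beforehand."""
        self._by_subject.setdefault(subject, []).append((prop, value))

    def values(self, subject, prop):
        """Return all values of `prop` asserted for `subject`."""
        pairs = self._by_subject.get(subject, [])
        return [value for p, value in pairs if p == prop]

    def subjects_with(self, prop):
        """Return, sorted, every subject that has at least one `prop` value."""
        return sorted(
            subject
            for subject, pairs in self._by_subject.items()
            if any(p == prop for p, _ in pairs)
        )


store = TripleStore()
store.add("Building A", "latitude", 42.73)
store.add("Building A", "longitude", -73.68)
store.add("Building B", "latitude", 42.74)
# An extra attribute on one page only; no table has to be altered.
store.add("Building B", "opened", 1999)

print(store.subjects_with("latitude"))
print(store.values("Building B", "opened"))
```

Because each fact is an independent triple, two pages of the same kind may carry different sets of attributes, which is the flexibility a fixed relational schema would not give without a schema change.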
It is possible, for instance, to use semantic wikis to model data in an object-oriented manner. We may regard certain types of wiki pages as instances of an object type, and use RDF triples to describe attributes of the object. For example, we may specify that a person object must have the attributes “homepage” and “affiliation”, and an instance of person “Li Ding” may have a “homepage” value of “http://www.cs.rpi.edu/~dingl” and an “affiliation” value that refers to another object identified by “RPI”. In the following sections, we will further show that in SMW it is also possible to organize data using relational modeling. Please note that the modeling specification is usually also presented as semantic wiki pages, and can thus be accessed, updated or even deleted in the same way as other wiki pages.

Some semantic wikis (like SMW) not only preserve the semantic structure of data, but also provide a light-weight query ability (with a role similar to that of SELECT queries in SQL). For example, in SMW it is possible to pose a query {{#show: Li Ding |?affiliation}} to find all affiliations of Li Ding. As another example, the SMW query {{#ask: [[Category:Person]] |?name}} can find all known instances of Person with their names.

C. Data Processing

Several MediaWiki extensions provide scripting functionality similar to the basic constructs of a programming language. In the following sections we will further show that, combined with templates, semantic annotations and semantic queries, those extensions enable a wide range of light-weight data processing abilities. Some of the most useful extensions include:

• Variable: General variables are supported by the variable extension⁴, so that users can name a long expression as a variable and reuse it later in a wiki page. A special type of wiki page called a “template” also allows the use of variables.
• Datatypes: For example, the string type is supported by StringFunctions⁵, which provides some common string functions such as string length and concatenation; the array extension⁶ provides array operations (e.g., search, sort, and split) and set operations (e.g., union, intersect, and diff).
• Control Flow: The ParserFunctions Extension⁷ offers many useful parser functions, including: (i) expression calculation that evaluates, e.g., mathematical expressions like “(1+2)” and logical expressions like “(true and false)”; and (ii) conditional statements such as an IF-THEN-ELSE conditional flow. The Loop Extension⁸ supports loop structures such as WHILE and DO-WHILE.

³ Some RDF triple stores are built on top of relational databases.
⁴ http://www.mediawiki.org/wiki/Extension:VariablesExtension
⁵ http://www.mediawiki.org/wiki/Extension:StringFunctions

D. User Interface

In SMW, many elements of an application’s user interface can be constructed using scripts. For example, the Semantic Forms Extension⁹ offers a form-based editing interface for users to edit template-based data. Templates and queries also allow us to control the look-and-feel of the user interface and to present the data with various visual elements (e.g., table, picture and tree). In addition, in SMW users can also inject JavaScript code into a wiki page; the code can be provided either as server-side scripts or in some special wiki pages. By combining the data management and data processing features with JavaScript, we are able to design interactive, visual interfaces for the manipulation of semantically enriched data.

E. Strength and Limitations

By providing data modeling, processing and presentation (via a user interface) abilities, semantic wikis provide a transparent platform for light-weight web application development.
In particular, such a development model enjoys several advantages:

• Flexibility: As the data and scripts are both stored as wiki pages, users can always read and update them directly in the browser. Thus, the improvement of both the data and the application (as constructed with scripts) becomes a dynamic, highly portable, and easily accessible process.
• Socialization: Semantic wikis inherit the Web 2.0 features natively supported by wikis, in particular the support for social user participation, e.g., user login, collaborative editing, and revision history. This may encourage large-scale, collaborative interactions between users.
• Inference Ability: The availability of semantically enriched content in semantic wikis makes it possible to do some inference with data, and thus allows potentially better means for the consumption of data (e.g., search and query).

⁶ http://www.mediawiki.org/wiki/Extension:ArrayExtension
⁷ http://www.mediawiki.org/wiki/Help:Extension:ParserFunctions
⁸ http://www.mediawiki.org/wiki/Extension:Loops
⁹ http://www.mediawiki.org/wiki/Extension:Semantic_Forms

On the other hand, it should be noted that the semantic wiki based model, while favorable in many light-weight development scenarios, also has some limitations for other use cases:

• Efficiency: Semantic wikis often use a triple store for data storage. The state of the art of triple stores has not yet reached the same level of maturity and scalability as that of relational databases. This may cause efficiency problems for applications with a very large number of wiki pages. In addition, the overhead of parsing and rendering structured data in semantic wiki pages often results in delays in response. Performance tuning for commercial deployment is thus often crucial.
• Modeling Ability: The native modeling support of semantic wikis is usually limited to a subset of RDF or OWL.
The page-centric structure of knowledge organization in semantic wikis also makes the modeling of complex knowledge structures and data structures difficult. Thus, building an application that requires very complex data structures or logic with the semantic wiki based model would be rather challenging. This will be further discussed in Section V.
• Safety: As wikis in general are designed to be an open collaborative environment, safety control is usually not natively supported, or is supported only to a limited extent. For web applications requiring stronger access control to avoid malicious changes to the content of the application, additional efforts are required to ensure data and application safety¹⁰.

In the next two sections, we will introduce two concrete examples of web application development based on SMW, namely RPI Map and CNL Wiki. They illustrate, with different usage emphases, how SMW enables light-weight data modeling, data processing and user interface building with an open, extensible architecture.

III. CASE STUDY: RPI MAP

This section introduces RPI Map (http://map.rpi.edu) as an SMW-based web application that embodies the general methodology we described in the last section. RPI Map is a campus map application for the Rensselaer Polytechnic Institute (RPI) community. It integrates and visualizes location-based information, such as buildings, events and classes, on an interactive map using the Google Map API¹¹. At its core is a Semantic MediaWiki along with a set of mediators that perform data mash-up from multiple external data sources. In what follows, we will describe in detail how SMW helps in building RPI Map.

A. Data Modeling

Template as Schema. In RPI Map, templates play an important role in data organization, as they serve as a “virtual” schema for the data involved.
The main types of data on RPI Map include locations (and their subtypes, such as buildings and parking lots), people, events, courses, campus shuttle routes and real-time shuttle positions. Many of these data are published by various individual entities across RPI. For example, event information is published as an RSS feed of the institutional calendar, course information is available from the RPI catalog as a text table, and people information is provided as downloadable vCard files from the RPI directory.

Fig. 2. Data Schema of RPI Map

¹⁰ While the safety of a semantic wiki based system is not yet a completely solved problem, we are optimistic due to the success of large-scale wiki systems such as Wikipedia. In Wikipedia, a multi-level user privilege control mechanism has proven effective in avoiding vandalism. Several access control extensions to MediaWiki have been widely used. There is also increasing attention to, and development of, access control extensions in the SMW community.
¹¹ http://code.google.com/apis/maps/

To integrate those data in RPI Map, for each external data source there is a mediator (implemented as server-side scripts¹²) that transforms the original data into a form that can be consumed by the wiki platform. Those data are then linked by various semantic queries based on the location information (e.g., the building name) they contain.

We use templates as the general output format of these mediators. Each template corresponds to a type of data in the system and describes a set of attributes that one such data instance must possess. For example, Template:LocationInfo defines a template with the parameters of a location, e.g., its name, latitude, longitude and aliases. Together, those templates define a schema to organize data in the application. The set of “schema” templates used in RPI Map is shown in Fig. 2.
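As a rough illustration of a mediator emitting template-based output, the sketch below renders one location record as a MediaWiki template call. It is only a sketch in Python under stated assumptions: the helper names, the exact parameter names (`name`, `latitude`, `longitude`, `aliases`), the choice of required fields and the sample record are hypothetical, and the actual RPI Map mediators are not shown in this paper.

```python
# Attributes assumed (for this sketch only) to be mandatory for a location.
REQUIRED_LOCATION_FIELDS = ("name", "latitude", "longitude")


def render_template_call(template, fields):
    """Render a dict as a wikitext template call: {{Template |key=value ...}}."""
    lines = ["{{" + template]
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            # Multi-valued attributes are flattened into a comma-separated list.
            value = ", ".join(str(item) for item in value)
        lines.append(f"|{key}={value}")
    lines.append("}}")
    return "\n".join(lines)


def location_to_wikitext(record):
    """Check the required attributes, then emit a LocationInfo template call."""
    missing = [key for key in REQUIRED_LOCATION_FIELDS if key not in record]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    # Attributes beyond the required ones are passed through unchanged.
    return render_template_call("LocationInfo", record)


example_record = {
    "name": "Example Hall",
    "latitude": 42.73,
    "longitude": -73.68,
    "aliases": ["EH", "Example"],
}
print(location_to_wikitext(example_record))
```

The template plays the role of a schema only in the sense that the mediator checks for the attributes it expects; anything else in the record is still written to the page.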
Please note that while RPI Map uses schema-like templates for data modeling, those templates should not be understood as relational schemas in the database sense. Those templates provide, on the basis of the triple-based data representation infrastructure of SMW, a higher-level abstraction of some related triples. RPI Map does not require all data to fit a rigidly defined schema, nor does it prevent an instance of a template from carrying extra data beyond what the template describes. This brings additional flexibility in accommodating data from heterogeneous data sources.

¹² It is also possible to use client-side scripts to do data importing, so that users may add other types of data to the system.

B. Data Computation

Stored Query. Many queries are used repeatedly in different components of RPI Map. For example, one commonly used query is to map a location based on its aliases.

Fig. 3. RPI Map Main Interface

Such a query is stored as a template page Template:Alias, which contains the query […]

The template is embedded in other pages that need such a query. Thus, templates can also play a role similar to that of stored procedures in a relational database. As those templates can be edited in the browser by users (with some protection mechanism when necessary), they are more transparent and easier to access than stored procedures (which are normally hidden behind a server-side DBMS).

Data Cleansing. SMW also helps in cleaning up corrupt or inaccurate data in the course of integrating data into RPI Map. For example, when transforming people information (in the vCard format) into the wiki, the same location (e.g., a person’s office address) may be referred to by several different names across different branches of the university. In addition, new variations of a location’s name may be discovered when new data is added (e.g., from the event RSS feed on a daily basis). A special name recognition template, which relies on a fuzzy string similarity comparison parser function, was designed to identify the closest known location or alias.

C. User Interface

Query-based Map Generation. Each map page on RPI Map is based on some semantic query. For example, the “Today Event” page relies on a query of the form¹³:

{{#map_objects: […] }}

where map_objects is a parser function that will automatically generate a map via Google Map API from the result of the “ask” semantic query. The query asks for “all the locations (potentially in their alias forms) of today’s events and their latitude/longitude (some other attributes omitted)”. Please note that the example also demonstrates the use of variables in constructing complex queries.

¹³ For ease of presentation, the query is simplified from the actual query.

Integration with JavaScript. In RPI Map, JavaScript is intensively used together with wiki scripts.
Some simple but useful examples are:

• Popping up a new window with information related to a location, e.g., its full name, picture and services provided;
• Giving control of labeled markers and customizing icons;
• Validating user-input location information to make sure that it is acceptable before it is submitted to the server;
• Displaying geographic information in an existing data format, e.g., KML or GeoRSS.

The main page of RPI Map is shown in Fig. 3. Part of the RPI Map source code for data representation and navigation has been released as the Tetherless Map Extension to MediaWiki¹⁴.

IV. CASE STUDY: CNL WIKI

CNL Wiki¹⁵ is another application we developed that conforms to the architecture described in Section II. CNL Wiki is motivated by the goal of providing an end-user friendly interface for collaborative ontology building. We use SMW as the application platform, exploiting its inherent collaborative nature, high portability and accessibility. In addition, we utilize Controlled Natural Language (CNL) to provide some support for ontology development in OWL [3], with the hope of improving the comprehensibility of generated knowledge statements to end users. In this section, we describe in detail how SMW enables the representation of strongly structured data (i.e., OWL knowledge bases), data computation on such data (e.g., CNL sentence generation), and user interface generation.

A. Data Modeling

In this subsection, we introduce how SMW templates can be used in modeling structured data and generating semantic data.

Modeling Structured Data. In order to accommodate ontology construction in OWL within SMW, we need to address a number of expressivity constraints associated with SMW. Currently, SMW does not provide full support for OWL modeling formalisms, and this introduces a mismatch between the kind of knowledge statements that can be represented in OWL and the knowledge statements that can be created in SMW.
14 http://www.mediawiki.org/wiki/Extension:Tetherless Map
15 http://tw.rpi.edu/proj/cnl

In order to address this limitation, we developed a meta-model extension to SMW, called SMW-mOWL (where “m” stands for meta-model). SMW-mOWL represents an OWL ontology using a set of wiki pages, each of which encodes some ontology elements (i.e., classes, properties, individuals and axioms) as wiki template instances. For example, suppose we have an OWL statement saying that “every father is a person that has a child who is also a person”, which can be given in the OWL Abstract Syntax (OWL-AS) as:

Class(Father partial Person restriction (hasChild someValuesFrom(Person)))

This statement can be broken down into several template instances and represented as SMW pages. For example, on the page “Category:Father”, the above OWL-AS statement is represented with three template instances: Template:NamedClass, Template:NamesClassRelation and Template:SomeValuesFrom. Thus, each category page represents a single OWL class along with some axioms about the class. “Template:NamedClass” describes annotations to the class, such as comments and natural language labels. “Template:NamesClassRelation” describes a relationship between two classes (in this case, a class inclusion relationship). “Template:SomeValuesFrom” represents a restriction that the class in question must satisfy. For a complete description of SMW-mOWL, including the treatment of complex statements and its limitations, please refer to the technical report [2].

Semantic Data Generation. The use of a template-based mechanism for SMW-mOWL also allows us to store the knowledge model in the SMW database (tuple store). For example, an instance of Template:SomeValuesFrom is persisted in the wiki as an instance of the ternary property owl:someValuesFrom, of which the first element is the class on whose page the template instance resides, the second element is the “on property” parameter, and the third element is the “on class” parameter.
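As a rough illustration of this persistence step, the following Python sketch models a template instance as a small record and flattens it into a ternary tuple (host class, “on property”, “on class”). The names `TemplateInstance` and `to_tuple`, the parameter keys, and the dictionary used as a tuple store are assumptions made for the example; they are not part of SMW or SMW-mOWL.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateInstance:
    """A template call found on a wiki page (hypothetical model, not an SMW API)."""
    page: str        # page hosting the instance, e.g. "Category:Father"
    template: str    # template name, e.g. "SomeValuesFrom"
    params: dict = field(default_factory=dict)


def to_tuple(inst: TemplateInstance) -> tuple:
    """Flatten a SomeValuesFrom instance into (class, on property, on class)."""
    if inst.template != "SomeValuesFrom":
        raise ValueError(f"unsupported template: {inst.template}")
    # Drop the "Category:" namespace prefix to obtain the class name.
    host_class = inst.page.split(":", 1)[-1]
    return (host_class, inst.params["on property"], inst.params["on class"])


# "Every father is a person that has a child who is also a person."
father = TemplateInstance(
    page="Category:Father",
    template="SomeValuesFrom",
    params={"on property": "hasChild", "on class": "Person"},
)

# Toy tuple store keyed by property name (assumption: stands in for the SMW database).
store = {"owl:someValuesFrom": [to_tuple(father)]}
print(store["owl:someValuesFrom"])
```

In the wiki itself this flattening is done by the template definition rather than by external code; the sketch only makes the shape of the stored data explicit.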
Such persisted data in the database can be consumed further by other scripts, e.g., for CNL generation (described in the next subsection), or by external tools, e.g., a SPARQL query engine.

B. Data Computation

Once the SMW-mOWL meta-model is persisted in the database, the query language of SMW (SMW-QL) can be used to retrieve specific information from the model, which can then be consumed in cascaded processing by other wiki scripts. In this subsection, we describe two such usage patterns.

Templates as Functions. To query information stored in the SMW tuple store, we use a set of templates that implement queries and some additional processing. In that sense, templates play a role similar to that of functions in a conventional programming language. For example, Template:CNL.Rabbit.getLabel takes a page name as input (denoted as {{page}}) and performs the following steps:
  • Query whether the class is an anonymous class using […]. The query result is stored as a variable: it is empty (false) iff the class is an anonymous class.
  • If it is an anonymous class, call Template:CNL.Rabbit.Anon to construct its label in the Rabbit CNL. Otherwise, return its label by calling an SMW-QL query: […]

CNL Generation. Query results from the SMW database are further parsed and processed by a set of CNL generation templates. Currently, we support two English CNLs, namely Rabbit [7] and Attempto Controlled English (ACE) [11]. For example, the template Template:CNL.Rabbit.getSomeRestrictionAssertion generates Rabbit CNL sentences about the “someValuesFrom” restrictions of a class “{{page}}”. It performs the following tasks:
  • Use Template:CNL.Rabbit.getLabel to get the natural language label of the class in question (i.e., the input parameter “{{page}}”).
  • Use a query to fetch all “someValuesFrom” restrictions related to “{{page}}”: {{ask: [[


]] |?owl:someValuesFrom |mainlabel=-|format=list|link=none}}
  • For each such restriction, parse its “on property” and “on class” values, use Template:CNL.Rabbit.getLabel to get their natural language labels, and generate a Rabbit sentence using the Rabbit grammar. For instance, for the example in the last subsection, we will have “Every Father has child Person.”

A meta-model template like Template:NamesClass may call a CNL generation template, such as Template:CNL.Ace.Concept (which in turn calls other templates to construct all CNL sentences about a specific class). Thus, users get a CNL description of a knowledge statement whenever the statement is constructed, whether by form-based editing or by importing an external ontology. Fig. 4 shows such a CNL generation result, in the Rabbit CNL, for a property in an ontology.

Fig. 4. A property represented in the Rabbit CNL

C. User Interface

Furthermore, structured data representation in SMW also allows user interface construction. This is again facilitated by (semantic) templates.

Controlling Page Layout. As in conventional wikis, templates in semantic wikis play important roles in controlling page layout. For example, Template:Property, in addition to its role in meta-modeling and semantic data generation, also controls the look-and-feel of a property page, such as:
  • Content organization (e.g., as a table), color scheme, font size and other visual elements of a page;
  • Linking to the editing interface;
  • CNL statements in the selected CNLs, each in a separate table section.

Unlike conventional templates, a template in SMW is able to use semantic queries, so that content from other pages can also be displayed on the page in question. In addition, by separating the text content and the semantic content of a page, SMW is able to partially reuse a page’s content and does not need to keep the original layout of content taken from other pages. Those features are not available through conventional page inclusion in MediaWiki.
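The verbalization steps described in subsection B (label lookup, restriction query, sentence assembly) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `LABELS` and `SOME_VALUES_FROM` tables stand in for the SMW tuple store, the function names are hypothetical, and the fixed sentence pattern only reproduces the example sentence from the text rather than the full Rabbit grammar.

```python
# Toy data standing in for the SMW database (all values here are illustrative).
LABELS = {"Father": "Father", "Person": "Person", "hasChild": "has child"}
SOME_VALUES_FROM = [("Father", "hasChild", "Person")]


def get_label(page: str) -> str:
    """Return the natural-language label of a named entity.

    Anonymous classes, which the wiki handles with a separate template,
    are out of scope here; unknown names fall back to the page name.
    """
    return LABELS.get(page, page)


def some_restriction_sentences(page: str) -> list:
    """Verbalize every someValuesFrom restriction stored for `page`."""
    subject = get_label(page)
    sentences = []
    for cls, on_property, on_class in SOME_VALUES_FROM:
        if cls == page:
            # Fixed pattern (assumption): "Every <class> <property> <filler>."
            sentences.append(
                f"Every {subject} {get_label(on_property)} {get_label(on_class)}."
            )
    return sentences


print(some_restriction_sentences("Father"))
```

Each function here plays the role that a template plays in the wiki: `get_label` corresponds to the label-lookup template and `some_restriction_sentences` to the restriction-assertion template that calls it.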
Light-weight GUI. Using semantic queries, structured information spread across multiple pages can be aggregated on one page with a graphical representation. One such practice on the CNL Wiki is the query-based class hierarchy tree. The template Template:GUI.Tree defines a recursive query that fetches class inclusion relations starting from a root class in a specified ontology. For example, the following script creates a tree presentation of all subclasses of Animal in the “Rabbit Ontology”:

Template:GUI.Tree

This practice could be extended to other GUI elements as well, such as toolbars, lists and menu bars.

Form Generation. By using the “Semantic Forms” extension of SMW, template instances can be edited through a form-based interface. The generation of such forms can be automated from the template definition. Thus, having the template-based OWL meta-model immediately provides us with a light-weight OWL ontology editor within the SMW environment. Figure 5 illustrates how several forms, each corresponding to a template, can be aggregated on a single wiki page in order to support ontology authoring. Each form comprises some controls (textboxes, checkboxes, radio buttons, and so on) that support various editing operations. Auto-completion in semantic forms (which may in turn involve some queries) allows sentences to be edited using existing entities in the ontology. For example, in the textbox “relation to other classes”, classes (categories) that match the user’s partial input show up for the user to choose from.

Fig. 5. Form-based Editor for CNL Wiki

V. DISCUSSION AND RELATED WORK

A. Collaborative Web Application Development

A recent review [10] of trends in Web application development lists several popular web development models and shows how they benefit from advances in technology and the evolution of user behavior.
The review shows that collective intelligence can benefit not only content creation but also application development. Our semantic wiki based model clearly exemplifies this trend through its emphasis on general-purpose in-browser scripting, which enables users to contribute to the representational, computational and visualization capabilities of the target system. Thus, an application can be extended collaboratively through the activities of multiple individuals, e.g., by adding new information sources via the creation of client-side mediators, creating new datatypes and associated templates, and making data available to other systems through the creation of new export formats.

A number of semantic wikis (e.g., AceWiki [13] and IkeWiki [16]) and Semantic Web platforms (e.g., HyperDE [15], [17], social semantic desktop [5], and social webtop [1]) have been used to support web application development following an approach similar to the one adopted in the proposed application model. Those efforts share the characteristic that they all allow social publishing of semantically enriched data. Our approach goes one step further in that users can contribute not only semantically enriched data but also data consumption scripts, both through a simplified, easy-to-use, browser-based publishing process, in order to build specific web applications creatively.

TBA: some related work on collaborative web programming, whether off-browser (e.g., CVS) or in-browser, e.g., Bespin (http://labs.mozilla.com/projects/bespin/) or SRI’s Source Mix.

B. Users Roles (Sketch Draft)

TBA: To discuss the roles of users: the users who design templates, etc. (more like developers) and the users who mainly use forms to add data. As in Wikipedia, the former group will be a small portion of all users.
Yet, it is notable that this differs from the traditional programming paradigm, where developers and users are two distinct groups. In the semantic wiki based model, the boundary between the two groups is not absolute, and a user may change roles more easily. That could lower the threshold for participating in development.

C. Data Modeling (Sketch Draft)

Comparison:

  • [Wiki] To compare the simplicity and ease of the traditional wiki approach with the complexity introduced by the semantic capabilities, specifically the semantic markup.

  • [RDB] To compare with database modeling (database schemas tend to be too crude and too slow to evolve for many web applications).

  • [Heavy-weight model] To discuss the simplicity of the semwiki model relative to heavy-weight semantic application development.

  • [RDF] To discuss the limitations of current SMW modeling w.r.t. general RDF modeling: 1) the representation is page-centric, so everything has to have a page, which makes it limited or very troublesome to say something about an external URL; 2) triples can only be made on their subjects’ pages, which may require unnecessary traversal when editing; 3) there is no built-in inference (e.g., RDFS entailment); inference is only via query.

Others:

  • To discuss the tradeoff between flexibility and ease of use for form-based user interfaces.

  • To discuss the meta-modeling approach for modeling (e.g., SMW-mOWL and rules).

VI. CONCLUSIONS

In this paper, we present a light-weight web application model based on semantic wikis. The model utilizes the data modeling, processing and presentation abilities of semantic wikis, which enable better flexibility, socialization and inference ability in building a web application compared with conventional models. We illustrate our approach with two proof-of-concept applications, RPI Map and CNL Wiki, based on Semantic MediaWiki (SMW). Using the two examples, we show that semantic queries and templates are useful building components in realizing many of the data modeling, processing and presentation abilities of semantic wikis.

While our demonstrations are based on SMW, the proposed model is not necessarily limited to SMW or wiki-based implementations. For example, many of the ideas discussed in the paper may also be applicable to Drupal-based applications, which have seen increasing activity in extending the system with semantic support16. Another example that fits this paradigm on another platform is Freebase Parallax17, an application that allows users to generate queries over the Freebase knowledge base by facet-based browsing and to store the results in a reusable form (e.g., an embeddable map). We anticipate that more similar applications will emerge for other types of online structured data.

Our future work will focus on enhancing the prototype systems described above. The extensible architecture of the two applications allows them to evolve with user-contributed scripts. For example, in CNL Wiki, we plan to add further CNL verbalization support through new sets of CNL templates, and ontology repository management through a set of ontology templates. The ultimate goal is to better demonstrate how to create and update an application using a semantic wiki, and thus to encourage the adoption of the proposed semantic wiki based development model.

16 http://groups.drupal.org/semantic-web
17 http://mqlx.com/ david/parallax/

VII. ACKNOWLEDGMENTS

We thank Jin Guang Zheng for part of the RPI Map implementation and Zhenning Shangguan for part of the CNL Wiki implementation.

REFERENCES

[1] J. Bao, L. Ding, D. L. McGuinness, and J. A. Hendler. Towards social webtops using semantic wiki. In International Semantic Web Conference (ISWC), Poster Track, 2008.
[2] J. Bao, P. R. Smart, D. Braines, G. Jones, and N. R. Shadbolt. A controlled natural language interface for semantic media wiki. Tetherless World Constellation (RPI) Technical Report TW-2009-05, 2009.
[3] S. Bechhofer, F. van Harmelen, J. Hendler, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider, and L. A. Stein. OWL web ontology language reference. http://www.w3.org/TR/owl-ref/, February 2004.
[4] M. Buffa, F. L. Gandon, G. Ereteo, P. Sander, and C. Faron. SweetWiki: A semantic wiki. J. Web Sem., 6(1):84–97, 2008.
[5] S. Decker and M. R. Frank. The networked semantic desktop. In C. Bussler, S. Decker, D. Schwabe, and O. Pastor, editors, WWW Workshop on Application Design, Development and Implementation Issues in the Semantic Web, volume 105 of CEUR Workshop Proceedings. CEUR-WS.org, 2004.
[6] J. J. Garrett. Ajax: A new approach to web applications. adaptivepath.com, February 2005. [Online; accessed 18.03.2008].
[7] G. Hart, M. Johnson, and C. Dolbear. Rabbit: Developing a control natural language for authoring ontologies. In ESWC, pages 348–360, 2008.
[8] J. Hendler. Web 3.0 emerging. Computer, 42(1):111–113, 2009.
[9] J. A. Hendler and J. Golbeck. Metcalfe’s law, web 2.0, and the semantic web. J. Web Sem., 6(1):14–20, 2008.
[10] M. Jazayeri. Some trends in web application development. In FOSE, pages 199–213, 2007.
[11] K. Kaljurand and N. E. Fuchs. Bidirectional mapping between OWL DL and Attempto Controlled English. In PPSWR, pages 179–189, 2006.
[12] M. Krötzsch, D. Vrandečić, M. Völkel, H. Haller, and R. Studer. Semantic Wikipedia. J. Web Sem., 5(4):251–261, 2007.
[13] T. Kuhn. AceWiki: A natural and expressive semantic wiki. In Semantic Web User Interaction Workshop at CHI 2008, 2008.
[14] B. Leuf and W. Cunningham. The Wiki way: quick collaboration on the Web. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2001.
[15] D. A. Nunes and D. Schwabe. Rapid prototyping of web applications combining domain specific languages and model driven design. In ICWE, pages 153–160, 2006.
[16] S. Schaffert. IkeWiki: A semantic wiki for collaborative knowledge management. In WETICE, pages 388–396, 2006.
[17] D. Schwabe and M. R. da Silva. Unifying semantic wikis and semantic web applications. In International Semantic Web Conference (Posters & Demos), 2008.
