Process Ontologies Summary Spring 2010

This page has been moved to the new drupal site. You should be editing it at http://tw.rpi.edu/web/project/PEO/Process_Ontologies_Summary_Spring_2010.


Introduction

Our objective is to define a theory of scientific processes and natural phenomena (SNP) and to develop a vocabulary around it for the declarative specification of processes. By analogy, we want to do for processes what RDF, OWL (along with OIL and DAML) and rules have done for domain knowledge representation. Before RDF, OWL and rules, data was tightly coupled to the schema used to store it in a database, often a relational one that was expensive to change, and the domain logic was mostly hard-coded in procedural programs, which made knowledge maintenance and integration difficult. Even today, many systems have their domain logic described in UML, an object-oriented graphical method of representing knowledge, and implemented in procedural code. OWL and rules are declarative knowledge modeling languages. They give much more flexibility because general-purpose tools, independent of the domain, can be used for creation, maintenance, and reasoning. The process logic for SNP, however, is still mostly coded in procedural programming languages. We want to come up with a declarative mechanism, based on strong logical foundations, through which domain-specific processes can be represented with relative ease and with which any general-purpose process reasoner can reason.

Languages already exist for specifying some kinds of non-SNP processes: industrial processes and (software) services. Industrial processes are often specified as workflows, that is, sequences of activities interconnected through input and output dependencies. In contrast, services often have a state-based specification. A service is typically defined in a service description language such as OWL-S, DAML-S or WSDL as a pair of conditions over states: pre-conditions and post-conditions. A service may be used when its pre-conditions can be satisfied, and executing it causes changes such that, at its completion, the post-conditions are satisfied in the new state. Multiple workflows can be joined to generate a new workflow. Similarly, multiple services can be composed into a complex service. <TODO: Describe automatic methods for combining workflows>. Services, in contrast, are usually combined using popular AI planning methods.

Of the two approaches, workflow-like and state-based description, the latter is the richer form of knowledge (process logic) encoding. In other words, more information can be encoded using the state-based approach, which gives greater scope for reasoning with the encoded knowledge. However, there are at least four motivations for looking beyond the state-of-the-art methods used for web services:

  1. Skepticism about current methods regarding:
    1. the ability to scale planning techniques to the size of SNP data
    2. the restrictions on expressiveness imposed by planning techniques.
  2. Interaction: Multiple services may be executed concurrently and may also be synchronized, but they are non-interactive, i.e. there has been no need to model side effects of services or other kinds of interactions between multiple services. <TODO: Explain how the presence of one SNP may affect the impact of another. There is a lot of randomness w.r.t. SNPs. They may be executed separated in time/space and therefore be unaffected by other processes, but when co-located and occurring concurrently they may dampen each other>
  3. Our objective, for now, is the same as that of computing a composite service: given start and end points, we want a net process that can perform or explain the transition from the start to the end point. The fundamental difference is that one service can be picked over another based on usefulness, which cannot be done for SNPs. For an SNP, when a pre-condition is met the process must happen, and we do not have the choice of not picking it. This becomes especially relevant when SNPs affect each other while simultaneously active.
  4. Pre- and post-conditions are not enough, i.e. for SNPs it is not sufficient to define only the effects of executing the process. We also want to define how the process changes the state, so that state changes can be represented by scientific means such as equations and differentials. Planning techniques can still be used then, but the impact on performance must be investigated.
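
Points 2 and 3 can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the process names, the state variables and all numbers are hypothetical, and effects are modeled as additive deltas, which is one possible choice and not something the ontology prescribes. Every process whose pre-condition holds fires (there is no choice, unlike service selection), and concurrent effects are combined so that they can dampen each other.

```python
def step(processes, state):
    """Fire every process whose pre-condition holds in `state`.

    Each effect is computed from the same starting state and returned as
    additive deltas, so concurrent processes can reinforce or dampen
    each other.
    """
    new_state = dict(state)
    fired = []
    for name, precondition, effect in processes:
        if precondition(state):
            fired.append(name)
            for variable, delta in effect(state).items():
                new_state[variable] = new_state.get(variable, 0) + delta
    return new_state, fired


# Hypothetical toy processes; the quantities are made up for illustration.
processes = [
    ("acid_formation", lambda s: s["so2"] > 0, lambda s: {"h2so4": 5, "so2": -5}),
    ("washout", lambda s: s["raining"], lambda s: {"h2so4": -2}),
]

new_state, fired = step(processes, {"so2": 5, "h2so4": 0, "raining": True})
print(fired)      # both processes fire because both pre-conditions hold
print(new_state)  # the washout dampens the net gain in h2so4
```

A planner for services would instead choose which of the enabled actions to apply; here nothing is chosen.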

Use Case: Rise in Temperature Due to Volcanic Eruption

A volcanic eruption throws out large volumes of sulphur dioxide (SO2), which the wind carries to far-off places. In the presence of water molecules, sulphur dioxide combines with water to form sulphuric acid (H2SO4). The formation of H2SO4 changes the density of the atmosphere at that location, which in turn changes the temperature there.

We identified three simple processes here:

  • Process 1: Formation of sulphuric acid from H2O and SO2. For this we consider (the unbalanced) chemical reaction H2O + O2 + SO2 → H2SO4.
  • Process 2: Increase in temperature because of increase in the amount of H2SO4 in the atmosphere.
  • Process 3: Volcanic eruption triggered by an earthquake.

Methodology

We decided to take a state-based approach to modeling processes. A process is defined by (1) a pre-condition: initial-state; initial-event, (2) a post-condition: final-state; final-event, and (3) a connecting-function, the function that connects the initial state/event with the final state/event by establishing a relationship between the variables in the pre-condition and those in the post-condition. In other words, the connecting-function describes the transition from the initial state to the final state when the states/events contain variables with complex relationships.
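
The three-part definition can be sketched in Python as follows. This is a minimal illustration, not the ontology itself: the class and field names are assumptions, events are left out for brevity, and the warming coefficient of 0.5 degrees per mole is a made-up number used only to have something to compute.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]  # variable name -> observed value at one time instant


@dataclass
class Process:
    """A process as a pre-condition plus a connecting function.

    The post-condition (final state) is whatever the connecting function
    derives from a state that satisfies the pre-condition.
    """
    name: str
    precondition: Callable[[State], bool]
    connecting_function: Callable[[State], State]

    def final_state(self, initial: State) -> State:
        if not self.precondition(initial):
            raise ValueError(f"pre-condition of {self.name} not satisfied")
        final = dict(initial)  # variables the process does not touch carry over
        final.update(self.connecting_function(initial))
        return final


# Hypothetical illustration only: 0.5 degrees per mole is a made-up
# coefficient, not a physical value.
warming = Process(
    name="warming",
    precondition=lambda s: s.get("h2so4_moles", 0.0) > 0.0,
    connecting_function=lambda s: {
        "temperature": s["temperature"] + 0.5 * s["h2so4_moles"]
    },
)

result = warming.final_state({"h2so4_moles": 10.0, "temperature": 15.0})
print(result)
```

Keeping the connecting function separate from the pre-condition mirrors the idea that the same state description can be paired with different scientific relationships.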

The distinction between a state and an event is that a state is defined by observations at a specific time instant, possibly scoped to a geo-spatial region, whereas an event is an activity that happens over a time interval. The observations and events have properties relevant to the process.

The notion of time is important: not just the time of the state or the interval of the event, but also the time the process takes to finish. Processes proceed at varying rates, and accounting for the effect of a process without due attention to its rate may not be justified. Several questions need careful investigation: at what granularity time should be represented, whether process durations should be exact numbers or ranges, and whether, although a state is associated with one time instant, we should in practice allow delta deviations around it.
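
Two of these questions can be made concrete with a short Python sketch. It assumes timestamps are numbers of seconds (for example Unix times); the function names and the default tolerance of 60 seconds are hypothetical choices for illustration, not decisions made in this work.

```python
def nearby_times(t1: float, t2: float, delta: float = 60.0) -> bool:
    """True when two timestamps differ by at most `delta` seconds.

    This is one way to allow "delta deviations" around a state's instant.
    """
    return abs(t1 - t2) <= delta


def completion_window(start: float, min_duration: float, max_duration: float):
    """Earliest and latest completion time when the duration is a range."""
    if min_duration > max_duration:
        raise ValueError("min_duration must not exceed max_duration")
    return (start + min_duration, start + max_duration)


print(nearby_times(1000, 1030))         # within the tolerance
print(nearby_times(1000, 1100))         # outside the tolerance
print(completion_window(1000, 50, 80))  # duration given as a range
```

An exact duration is the special case where both bounds are equal.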


Ontology

  • Last semester, I encoded the process files in RDFS.
  • We created 2 process files.
    • Process: specifies the classes and properties needed to define a process.
    • ChemicalProcess: uses the Process ontology we designed to encode a chemical process.

Process Ontology

The process ontology is shown in the diagram. Classes are represented by ovals and properties by arrows: the class an arrow points to is the range of the property, and the class the arrow comes from is its domain.


Chemical Process

This example is built on top of the process ontology. In this example, we show that:

  • using the Process ontology, we can encode a scientific process
    • given the initial state, the final state, and all necessary data, we can describe what happens during the process
  • we can reason using the ontology
    • for example, infer the status of the final state

In this example we represent a chemical process in the atmosphere that involves a volcanic eruption. At the beginning, the ChemicalProcess has an initial state called AtmosphereState1. In this state, the atmosphere is observed to contain H2O at “SomeLoc” at “InitTime” (left-hand side of the diagram). At “InitTime” and location “SomeLoc”, a volcanic eruption occurs, which adds 10 moles of SO2 to the atmosphere. The eruption ends at time “EndTime” (lower side of the diagram). Further, we know that we can use the chemical equation H2O + SO2 + O2 -> H2SO4 to predict what happens after the eruption and the chemical reaction caused by it (upper side of the diagram).

Finally, from the information encoded using the ontology, we can infer that the process has a final state at time “EndTime” and location “SomeLoc” in which the atmosphere contains 10 moles of H2SO4.


Open issue: how does the system know that the chemical “H2O” observed in state “AtmosphereState1” and the “SO2” added by the volcanic eruption are the same things as the equation variables “H2O” and “SO2”?


Process Representation

[NOTE] : The ontology we use here is currently not aligned with the one described above. The final intent is to have only one ontology.

There are two things that we are looking at in this work: process representation and process interpretation. That is, we need a vocabulary or graphical elements to describe processes unambiguously, and a mechanism to interpret and reason with those representations, either through procedural code or, preferably, through a well-studied theory. We have defined a vocabulary, discussed above, for representing processes, and it should not be difficult to define graphical notations that correspond to this vocabulary. We chose to interpret the representations using logic rules. Note that while the two aspects, representation and reasoning/interpretation, must be related, they need not be tightly tied, i.e. the two may be developed independently. A case in point is the RDF query language SPARQL. SPARQL semantics is defined in relational algebra terms but can also be interpreted using non-recursive datalog rules; it was the relational view that influenced the syntactic choices in SPARQL (SPARQL is intended to be close to SQL, and SQL can be interpreted as non-recursive datalog although it is implemented as set operations). Likewise, while we chose rules as the underlying theory for process descriptions, we should be able to use other theories while keeping the basic process ontology the same.

Process Description in N3: H2SO4 Production

:X a proc:Variable .
:Y a proc:Variable .
:Z a proc:Variable .
:F a proc:Variable .

:sulphuricacidprod a proc:Process ;
	a proc:Incremental ; #in that it increases the amount of sulphuric acid
	proc:hasInputState [
		proc:hasObservationsInList
		( [ atm:composition
			[ atm:compound chem:sulpherdioxide ;
		    	  atm:quantity :X ] ]
		  [ atm:composition
			[ atm:compound chem:water ;
		    	  atm:quantity :Y ] ]
		  [ atm:composition
			[ atm:compound chem:oxygen ;
		    	  atm:quantity :Z ] ]
		) ] ;
	proc:hasOutputState
		[ proc:hasObservation
			[ atm:composition
				# the output is the sulphuric acid produced, with quantity F
				[ atm:compound chem:sulphuricacid ;
				  atm:quantity :F ] ] ] ;
	#proc:SulphuricAcidProduction intends to specify, F = min(X, Y, Z)
	#assuming the equation is SO2 + H2O + O2  = H2SO4, and that 
	#X, Y and Z are quantities in moles.
	proc:variablesRelatedByFunction proc:SulphuricAcidProduction .
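
The comments in the description above state the intended relationship F = min(X, Y, Z). A minimal Python sketch of that connecting function is given below; the function name is an assumption, and the 1:1:1 mole ratio is the simplification stated in the N3 comments, not balanced stoichiometry.

```python
def sulphuric_acid_produced(so2: float, h2o: float, o2: float) -> float:
    """Moles of H2SO4 under the 1:1:1 assumption, i.e. F = min(X, Y, Z).

    The scarcest input limits how much product can form.
    """
    for quantity in (so2, h2o, o2):
        if quantity < 0:
            raise ValueError("quantities in moles must be non-negative")
    return min(so2, h2o, o2)


# 10 moles of SO2 with plenty of water and oxygen: SO2 is the limiting input.
print(sulphuric_acid_produced(10, 50, 30))
```

A reasoner that resolves proc:SulphuricAcidProduction would need access to a function of this kind, whether built in or supplied by the domain.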

Process Encoding in AIR: H2SO4 Production

We have not yet decided on the exact semantics of the rule language or the level of expressiveness required. For example, we may need Answer Set Semantics, which is based on stable model semantics and, among other things, allows for defeasible reasoning, i.e. rules can be prioritized and default rules may be specified. A default rule holds in general unless some other rule defeats it under special circumstances. A naive but illustrative default rule: if you are in Troy and it is 11 am, at any time of the year, then it is sunny in Troy. This rule fails if it is cloudy, and the defeating rule may look like: if you are in Troy and Troy is cloudy, then it is dark in Troy. In our first attempts we tried representing the process description as a rule in the forward-chaining rule engine AIR. It must be said that negation in AIR is not very expressive.
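
The Troy example can be mimicked in a few lines of Python. This sketch only illustrates rule prioritization with a default and a defeating rule; it is not an implementation of answer set or stable model semantics, and the dictionary keys and priority numbers are hypothetical.

```python
def conclude(facts, rules):
    """Return the conclusion of the highest-priority rule whose condition holds."""
    applicable = [rule for rule in rules if rule["condition"](facts)]
    if not applicable:
        return None
    return max(applicable, key=lambda rule: rule["priority"])["conclusion"]


rules = [
    {   # default rule: in Troy at 11 am it is sunny
        "priority": 1,
        "condition": lambda f: f.get("place") == "Troy" and f.get("hour") == 11,
        "conclusion": "sunny",
    },
    {   # defeating rule: in Troy under clouds it is dark
        "priority": 2,
        "condition": lambda f: f.get("place") == "Troy" and f.get("cloudy", False),
        "conclusion": "dark",
    },
]

print(conclude({"place": "Troy", "hour": 11}, rules))
print(conclude({"place": "Troy", "hour": 11, "cloudy": True}, rules))
```

The AIR rules that follow are plain forward-chaining rules and do not rely on such priorities.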

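@prefix : <#> .
@prefix air: <http://dig.csail.mit.edu/TAMI/2007/amord/air#> .
@prefix math: <http://www.w3.org/2000/10/swap/math#> .
@prefix proc: <http://tw.rpi.edu/ontologies/process#> .
@prefix atm: <http://tw.rpi.edu/ontologies/process/atmosphere#> .
@prefix chem: <http://tw.rpi.edu/ontologies/process/chemical#> .
@prefix fproc: <http://tw.rpi.edu/process-specification/functions#> .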

@forAll :PROCESS .
@forAll :STATE, :EVENT .
@forAll :TIME, :POINT, :qSO2, :qH2O, :qO2, :qH2SO4, :qH2SO4_add, :qH2SO4_final, :TIME_COMP, :OBS_FINAL .

@forAll :STATE, :OBS .
:observationstaterelationship a air:Belief-rule ;
	air:if { @forSome :T, :P, :T1, :P1 .
		 :STATE a proc:State ;
			proc:hasTime :T ;
			proc:hasLocation :P .
		 :OBS a proc:Observation ;
			proc:hasTime :T1 ;
			proc:hasLocation :P1 .
		 (:T :T1) fproc:nearbytimes :True .
		 # NOTE THAT THE FUNCTION called must itself be function of physical process under consideration.
		 # It can be hardcoded, i.e. taken care of by the procedural code translating RDF description of process to AIR
		 (:P :P1) fproc:nearbypoints :True . } ;
	air:then { :STATE proc:hasObservation :OBS } .
		 
:sulphuricacidprod a air:Belief-rule ;
        air:if {@forSome :OBS1, :OBS2, :OBS3, :OBS4, :OBS5, :RATE, :TIME_add, :INTERVAL .
                 :STATE a proc:State ;
                        proc:hasObservation :OBS1, :OBS2, :OBS3 .
                 :OBS1 atm:composition
                 	[ atm:compound chem:sulpherdioxide ;
                          atm:quantity :qSO2 ] .
                 :OBS2 atm:composition
                        [ atm:compound chem:water ;
                          atm:quantity :qH2O ] .
                 :OBS3 atm:composition
                        [ atm:compound chem:oxygen ;
                          atm:quantity :qO2 ] .
                 :OBS4 atm:composition
                        [ atm:compound chem:sulphuricacid ;
                          atm:quantity :qH2SO4 ] .
                 (:qSO2 :qH2O :qO2) fproc:sulphuricacidprod :qH2SO4_add .
                 proc:SulphuricAcidProduction :hasRate :RATE .
                 (:qH2SO4_add :RATE) math:product :TIME_add .
		 (:qH2SO4 :qH2SO4_add) math:sum :qH2SO4_final .
                 (:TIME :TIME_add) math:sum :TIME_COMP .
# In AIR we can not create blank nodes, for preventing infinite reasoning
# Therefore we assume that there will be an observation of all times and we can retrieve URIs of those observations
# We also note Unix Times for easy manipulations
                 :OBS5 a proc:Observation ;
                        proc:hasTime :TIME_COMP .
                        #(:TIME :TIME4) fproc:nearbytimes :True .
		 :EVENT proc:hasInterval :INTERVAL .
		 :INTERVAL fproc:spansinterval (:TIME :TIME_COMP) . } ;
	air:then [
		air:assert { :OBS5 atm:composition  
		#AIR doesn't support assertions with blank nodes. the code must be modified to support that.
		#We may be asking for a serious modification here.
                             	[ atm:compound chem:sulphuricacid ;
                                  atm:quantity :qH2SO4_final ] . } ] ,
		 [
		air:assert { :STATE proc:hasObservation
				[ atm:additiontoatmosphere
					[ atm:compound chem:sulphuricacid ;
					  atm:quantity :qH2SO4_add ] ] } ] .
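
To make the arithmetic in the :sulphuricacidprod rule easier to follow, here is a Python sketch that traces the same computations for one set of inputs. It is not how AIR evaluates the rule. The function name is hypothetical, fproc:sulphuricacidprod is assumed to be min over the three quantities, and :RATE is read as time per mole so that the product yields a duration, as the rule's use of math:product suggests.

```python
def apply_sulphuric_acid_rule(q_so2, q_h2o, q_o2, q_h2so4, rate, time):
    """Trace the bindings computed in the if-part of :sulphuricacidprod."""
    q_h2so4_add = min(q_so2, q_h2o, q_o2)   # fproc:sulphuricacidprod (assumed to be min)
    time_add = q_h2so4_add * rate           # math:product, rate read as time per mole
    q_h2so4_final = q_h2so4 + q_h2so4_add   # math:sum
    time_comp = time + time_add             # math:sum
    return {
        "qH2SO4_add": q_h2so4_add,
        "qH2SO4_final": q_h2so4_final,
        "TIME_COMP": time_comp,
    }


bindings = apply_sulphuric_acid_rule(
    q_so2=10, q_h2o=50, q_o2=30, q_h2so4=0, rate=2, time=1000
)
print(bindings)
```

The two assertions in the then-part correspond to qH2SO4_final (attached to the observation at TIME_COMP) and qH2SO4_add (the addition to the atmosphere recorded on the state).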

Related Work

References
