Summary Abox Ankesh

From Semantic Portal Wiki

Jump to: navigation, search

Presentation given at CSCI 6966 Advanced Semantic Web (Fall 2008)#Lesson 7

refresh


Questions

ID Question Name Answer
Ankesh Summary Abox Gregory Shangguan 1 In Section 3.2 we read "Our goal is to define criteria which are efficient to evaluate using simple queries against relational databases, while balancing the tradeoff between filtering precision and cost..." What do you think is the reason for the author to say this? Seems to me that RDB plays certain roles in the whole ABox reduction procedure. Is this assumption correct? Zhenning Shangguan You are right. The authors are suggesting a method for speedy reasoning (consistency checking in the paper) over large Aboxes stored in RDB. At the same time I would agree with you that the title of the paper doesn't say anything about the work being specific to RDB.
Ankesh Summary Abox Gregory Shangguan 3 In Section 3.4, the paper also writes: "...Finally, retrieving the image in A of an inconsistent partition Ap' is a simple database operation..." A simple DB operation seems a little profound to me, which it seems like it's pretty easy for the author... So, could you please explain how can we achieve this by using a simple DB operation? Zhenning Shangguan It would have to be done exhaustively, i.e. all individuals would have to be considered one by one. Check the table that contains images (f-values) of individuals and roles. Now if these images are part of the partition keep the individuals in original Abox. All these operations can be carried out using join and selection.
Ankesh Summary Abox Gregory Todd Williams 1 Section 4 (and in particular Table 3b) shows that all four of the LUBM test data resulted in an inconsistent filtered Abox, requiring consistency checking images in the original Abox. Do you have any insights as to why the summarization and filtering lead to this problem for all the LUBM data, and not for any of the other Aboxes? Do you think the summarization/filtering algorithms can be improved to avoid this second round of consistency checks? Gregory Todd Williams LUBM is synthetically generated, and as larger data-sets increase the number of individuals increase, but their relationship remains similar. Therefore all LUBM data-sets share a common behavior.
  • I have no idea how inconsistency is reached in summary Abox. I looked at LUBM ontology, and there appears to be no axioms for deducing inconsistencies. No two classes are described as disjoint, there is no all-values restriction, (no inverse-functional or functional props), no cardinality restrictions, no two individuals are explicitly stated as being different. So I think we can deduce a graduatestudent1 to be Professor or Publication as-well, without getting a clash. So it is very surprising that there is even a clash.
  • improving summarization/ filtering- no idea as of now. Filtering transitive role assertions would be worth investigating. But bottom-line to all improvements would be that they should be easily implementable using standard SQL queries.
Ankesh Summary Abox Gregory Todd Williams 2 Section 3.4 describes associating with each role R that is part of a cardinality restriction the maximum number of R-neighbors that any individual a has in A. Does this require running the tableau algorithm, or is this simply a way of saying Gregory Todd Williams I think this computation is done prior to application of tableaux algorithm. It simply refers to cardinality of R-successors initially in the Abox. This value would be used to calculate upper bound using the formula in section 3.2. Precisely it would give the cardinality of P(a). Filtering is done before application of tableaux algorithm. However, I'm not sure if taking the maximum works even when R is transitive.
Ankesh Summary Abox Shangguan 2 In Section 3.4, the paper writes: "Typically, filtering A′ creates distinct partitions, and we apply the tableau algorithm to each partition separately. If all of the partitions are consistent, then we are done. Otherwise, we need to check A..." (actually, it will be "we need to check the image of the inconsistent partition, say image(Ap'), in A" according to the rest of the paragraph). IMHO, the reason that we have to double-check image(Ap') in the original A is that the inconsistency in Ap' might be an artificial inconsistency introduced in the process of summarization. So, my question would be: if we can keep track of whether a specific inconsistency is caused artificially or not, does it mean that we don't have to re-check image(Ap') in A? Zhenning Shangguan I would think so. But how are you going to keep track whether an inconsistency was artificially created? One way, that paper has discussed, is to verify its image in A. Is there other way? I don't think I have understood your question. Please ask again after the presentation.
Fokoue2006summary question 1 by lebo The summary of an Abox takes advantage of redundant assertions w.r.t consistency checking by collapsing individuals that are members of the same concept sets and are /not/ explicitly asserted to be different from each other. The authors note, "Any explicit assertions that two individuals are different from each other are maintained in the summary Abox."
  1. Would an exhaustive enumeration of "differentFrom" for each instance in the Abox render the summary method useless?
Tim Lebo You are right. We would not be able to summarize in that case. In-fact I would bring this up for discussion during my presentation, more so from the point of view of non-existence of unique name assumption.
Summary Abox Ankesh Jesse Weaver Section 3.2 states: "We make the simple observation that if a role R and its inverse R− are not part of any universal or maximum cardinality restrictions, then R(a, b) can never be used to alter the original Abox or detect a clash, so it can be ignored." Looking at the LUBM ontology (http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl), there are no universal or maximum cardinality restrictions. This being the case, it would seem that after filtering, LUBM summary data should be reduced to all single-individual partitions. However, according to Table 3a, that is not the case. Have I overlooked something? Jesse Weaver
Summary Abox Ankesh Joshua Taylor 1 2 Summary Abox defines summary Aboxes, and proves Theorem 2 about summary Aboxes. For example, any Abox A is a summary Abox for itself (where the mapping f is simply the identify function). There is mention of a canonical function f satisfying the converses of (1) and (3), and satisfying (4–6), but I don't see any description of a procedure to compute this f. Can anything be said about the complexity (time, space, lower-bounds) needed to compute this canonical summary? Joshua A. Taylor Given that authors intend to keep it simple, tractable while using RDB, complexity of computing the mapping may not be too high. I wouldn't be able to clearly define an f, but I can try describe the algorithm that can produce the mapping (using simple SQL queries).
  • If in the second last step of the algorithm described below we do not care if concept sets are subset (i.e. only same concept set matters), then we can estimate complexity based on above algorithm. It is surely polynomial, and something like n*n, excluding the time for concept set comparison- depends on RDB.
Summary Abox Ankesh Joshua Taylor 2 2 Summary Abox describes the properties of a canonical function f satisfying the converses of (1) and (3), and satisfying (4–6), and producing a summary Abox. Condition (6) is that "f(a) ≠ f(b) ∈ A’ implies a is the only individual in A mapped to f(a) (same for b)." This approach produces something very different than what model finders for, for example, first-order logic typically produce. The techniques that such model finders employ typically increase the domain size gradually, trying to find interpretations. Thus a model finder presented with ab and cd would probably find an interpretation with two domain of size two that mapped one name from each inequality to one domain element, and the other names to the other domain element (e.g., x = a = d, and y = b = c). More importantly, the results of such model finders, I think, could be considered summary Aboxes, though they would not be canonical summary Aboxes. How many of the techniques described in this paper could be applied to non-canonical summary Aboxes? (I recognize that in 3 Abox Filtering the authors write, "we described filtering techniques on the original Abox first," but this is just "for the purpose of exposition.") Joshua A. Taylor I would request you to repeat the first portion of your questions. I'm afraid I couldn't follow your description completely. w.r.t your question, I think, summary technique and filtering are independent in every sense. They explicitly describe how the filtering techniques (or tableaux algorithm) can be applied to summary Aboxes as well.
Williams Khandelwal Summaries Joshua Taylor 1 This question is for two presentations, and maybe it can be discussed once both presentations have been given. The two papers both look at ways of summarizing a graph structure in order to quicken certain types of queries. I wonder if the presenters, or anyone else, has any thoughts about whether the techniques described in each of these papers might be applied to the problems described in the other, or whether some hybrid approach might be useful for some tasks. Joshua A. Taylor


Attendees

Tim Lebo

Personal tools
Semantic Web Community
Tetherless World constellation
maintenance