Meeting Notes October 14

From Semantic Portal Wiki

Action Items

  • Updates
    • Josh and Shangguan status reports on data-store research
      • Josh still working on his
      • [Shangguan not present]
    • Jin added tutorial page on using SPARQL queries with Google Visualizer
  • Todos
    • Office/Phone Listing on internal pages
      • Everyone should update this list with their contact info
    • Wiki page for ISWC2009 attendance, including starting and ending dates.

Notes

  • Need Wiki Page of people going to ISWC2009
  • Greg and Jesse holding ISWC2009 practice presentations tomorrow (Oct-15-2009)
    • Link to Schedule.
    • Presenters (Location: Winslow 1140):
      • 2:00pm-2:30pm, Greg, "Scalable RDF query processing on clusters and supercomputers"
      • 2:30pm-3:00pm, Jesse, "Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples"
      • 3:00pm, Jesse or Greg, "Scalable Reduction of Large Datasets to Interesting Subsets"
      • [Pending schedules for Shangguan and Jim M.]
    • Revisions tentatively scheduled for Tuesday of next week, around 4:00pm
  • New TW Logos are now available
    • Link to Logos
    • Li says that they are not yet approved by RPI but should be soon
    • Greg asks why the logos do not have the "Tetherless World" text
      • Jesse asked [Jim?] and was told that it's OK to put the text on top of the globe.
      • Jesse suggests placing a text box above the globe if using PowerPoint.
      • Evan suggests tentatively using the logo on the wiki page.
        • Link to logo. Note, it is only abbreviated (TWC).
  • Medha is getting ready to present. Possibly delay in the hope that Jim or Deborah will show up.
  • Jim is absent -- may be surveying someplace somewhere; Deborah is on her way; Peter will not be present.
  • Patrick suggests posting presentations to the Meeting page for conference callers to view.
    • Dominic (Meeting Coordinator) acknowledges this and will start sending interested parties e-mail attachments (or links) beginning next week.

Presentations

  • Medha presents her research (Disclaimer: The presentation notes might not exactly reflect what was presented due to transcription errors):
    • Research has been in progress for over a year.
    • Key motivations:
      • Problems with scalable storing of RDF graphs (especially on disk)
      • Problems with efficient querying over these graphs (especially for 1+ billion triples)
    • Solutions:
      • Possible heuristic for partitioning datasets
      • Dimensionality of RDF data is fixed for a given query (along SPO columns). Making use of this fact can enable scalability through joining along unique subject (S), predicate (P), and object (O) instances. Actually joins can be emulated without materialization of sub-datasets. (?Transcription Error)
      • Uses bitmapping technique relying on distinct SP, OP, PS, OP, [...] (?Transcription Error)
      • Expected usefulness of technique:
        • Positive results because memory consumption can be controlled
    • Filtering distinct object-predicates [...] (?Transcription Error)
    • Practical to compress data because of the good compression ratio, but not good for querying when the dataset is too large (?Transcription Error)
    • The bitmap is compressed with GAP compression.
    • Encoding all information in the bitmap makes searching difficult
    • Advantages of The Medha Algorithm:
      • Efficient compression of data
    • Not efficient/practical to load huge datasets into memory (e.g., 1 billion triples), but partial loading will be good.
    • If a bit-vector is too condensed then [...GAP-compression fails...] (?Transcription Error)
      • Encoding with GAP compression [...] (?Transcription Error)
      • Example:
        • [0] 1 1 1 1 1 1 1 (takes 32 bytes) (?Transcription Error)
        • [...[other compression uses 4bytes]...] (?Transcription Error)
    • Li asks about looking into [?image-compression] [?techniques/tool] like gzip (?Transcription Error)
      • Medha clarifies that different encoding makes it impossible due to nature of Join-queries (?Transcription Error)
    • Future work:
      • Better handling of SPO queries
      • Structure of bitmap allows for easy jumps [?but poor with larger datasets] (?Transcription Error)
    • Questions:
      • Dominic:
        • What programming language is this [algorithm] implemented in?
          • Medha says C/C++
      • Greg:
        • Cautions Medha about name mangling -- suggests using "bound-terms" instead of "bound-variables"
          • Medha acknowledges
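
For readers unfamiliar with the terms used in the presentation notes, the following is a minimal sketch of gap (run-length) encoding of a bit-vector, plus a bitwise AND of two uncompressed bit-vectors as one assumed way a join on a shared term might be emulated without materializing intermediate results. It is not Medha's implementation: the function names gap_encode, gap_decode, and bitmap_and are hypothetical, and the byte sizes mentioned in the notes are not reproduced here.

```python
from itertools import groupby


def gap_encode(bits):
    """Encode a 0/1 sequence as (first_bit, run_lengths).

    Hypothetical helper for illustration: each run of identical bits
    is replaced by its length, so long runs shrink to one integer.
    """
    bits = list(bits)
    if not bits:
        return 0, []
    runs = [len(list(group)) for _, group in groupby(bits)]
    return bits[0], runs


def gap_decode(first_bit, runs):
    """Rebuild the original bit list from (first_bit, run_lengths)."""
    bits = []
    bit = first_bit
    for length in runs:
        bits.extend([bit] * length)
        bit ^= 1  # consecutive runs alternate between 0 and 1
    return bits


def bitmap_and(a, b):
    """Bitwise AND of two equal-length bit lists.

    Assumption: position i in both vectors refers to the same term
    (e.g. the same subject ID), so a 1 in the result marks a term
    that satisfies both patterns.
    """
    return [x & y for x, y in zip(a, b)]


# One long run compresses well: 8 bits become 2 run lengths.
first, runs = gap_encode([0, 1, 1, 1, 1, 1, 1, 1])
print(first, runs)

# An alternating vector does not: 8 bits become 8 run lengths.
first_alt, runs_alt = gap_encode([0, 1] * 4)
print(first_alt, runs_alt)

# Terms matching both (hypothetical) patterns.
print(bitmap_and([1, 0, 1, 1], [1, 1, 0, 1]))
```

The contrast between the two encoded vectors shows why the benefit of gap compression depends on how the set bits are distributed: the fewer and longer the runs, the smaller the encoding.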

Attendees

  • Jie
  • Eric
  • Ankesh
  • Jesse
  • Greg
  • Jaio
  • Alvaro
  • Medha
  • Dominic
  • Josh
  • James
  • Li
  • Jin
  • Gio
  • Rui
  • Evan
  • Patrick
  • Xixi
  • Xian
  • Yongmei
  • Jim M. (via Teleconferencing)
  • Stephan (via Teleconferencing)