Multi-Word TagCloud on Web N-gram Now

April 29th, 2010

Check out the tag cloud below. Can you see why it is interesting? Compare the two tag clouds generated from the same text (a corpus made up of the titles of about 2000 data.gov datasets) and see how they differ.

Novel Multi-word TagCloud (produced from 2000 US gov dataset titles)

Conventional Single-word TagCloud

Highlights

  • Meaningful Visualization. As you may see from the captions, the first one is a “Multi-word TagCloud” while the other is a conventional single-word TagCloud. The former joins individual words into popular multi-word phrases. With it, I get a better overview of what data has been published at data.gov.
  • Automated Process. The Multi-word TagCloud was not created by human users; it was generated automatically by a computer program, powered by the Microsoft Web N-gram service. We can generate such a tag cloud for any existing text document.
  • Cloud+Crowd. More broadly, this demo shows the value of the crowd and the cloud that I mentioned in my earlier blog post: big data can now be tackled with the crowd (text from the entire Web) and the cloud (the high-performance computational Web N-gram service).
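
The post does not show how words get joined into phrases, so here is only a rough sketch of the general idea, not the actual demo code. It assumes n-gram counts taken from the titles themselves as a stand-in for the phrase statistics that the Web N-gram service would supply. The function names, the `min_count` threshold, and the sample titles are all made up for illustration.

```python
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


def ngram_counts(token_lists, max_n):
    """Count every n-gram (as a tuple) of length 1..max_n in the corpus."""
    counts = Counter()
    for tokens in token_lists:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def multiword_tags(titles, max_n=3, min_count=2):
    """Greedily segment each title into phrases and count them as tags.

    Assumption: a phrase is "popular" if it occurs at least min_count
    times in this corpus. A real system could ask an external n-gram
    service for that judgment instead.
    """
    token_lists = [tokenize(t) for t in titles]
    counts = ngram_counts(token_lists, max_n)
    tags = Counter()
    for tokens in token_lists:
        i = 0
        while i < len(tokens):
            # Try the longest phrase first; fall back to a single word.
            for n in range(min(max_n, len(tokens) - i), 0, -1):
                gram = tuple(tokens[i:i + n])
                if n == 1 or counts[gram] >= min_count:
                    tags[" ".join(gram)] += 1
                    i += n
                    break
    return tags


# Invented sample titles, not taken from data.gov.
sample_titles = [
    "Toxics Release Inventory 2005",
    "Toxics Release Inventory 2006",
    "Air Quality Data",
    "Air Quality Report",
]

for tag, weight in multiword_tags(sample_titles).most_common():
    print(weight, tag)
```

The weight of each tag would then drive its font size in the cloud. A single-word cloud corresponds to `max_n=1`; this sketch also skips stop-word filtering, which a real tag cloud would probably want.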

Behind the Scenes

WWW2010 is really inspiring, making me a productive “engineer” although I came as a researcher. Today I picked up Microsoft Visual Studio and wrote my first C# program. I was an excellent C++ programmer back in my college days (I wrote tons of code using Visual C++ 4.0 more than 10 years ago). However, today is not about me being a programmer, but about announcing something really cool! I would also like to thank Evelyne and Paul, researchers from Microsoft Research, for their great support. My demo on data.gov data is powered by the Microsoft Web N-gram Service.

Cheers,

Li Ding @ RPI,  April 29, 2010

  1. April 29th, 2010 at 14:19 | #1

    Very cool. Will you be providing some sort of API that we can use to generate multi-word tag clouds for our data sets?

    • April 29th, 2010 at 15:22 | #2

      I would be happy to. I’m currently programming in C#, so you would probably need to send me the dataset and I can tweak it on my laptop. Once I get it working in PHP, I may build an API for that computation. Thanks.

      Li

  2. John Bowyer
    April 29th, 2010 at 22:59 | #3

    Will you be publishing the source code? I’m very interested in learning this.

  3. July 22nd, 2010 at 04:35 | #5

    Interesting. This is exactly like the work I published a few months earlier. See the online demo at http://ontology.csse.uwa.edu.au/research/ and the paper “An Ontology-based Interface for Improving Information Exploration”.

  4. June 3rd, 2011 at 22:21 | #6

    Thank you for your information about Multi-Word TagCloud. This is very useful to readers as additional knowledge is very meaningful.

  5. August 25th, 2011 at 03:17 | #7

    li :
    I would be happy to. I’m currently programming in C#, so you would probably need to send me the dataset and I can tweak it on my laptop. Once I get it working in PHP, I may build an API for that computation. Thanks.
    Li

    Li,

    Did you get a chance to make a REST API out of this? If so, how can I access it?

    Thanks,
    Saqib

  1. February 27th, 2013 at 01:54 | #1