Archive for the ‘tetherless world’ Category

AGU Fall Meeting 2018

January 30th, 2019

The American Geophysical Union (AGU) Fall Meeting 2018 was the first conference I attended that was of such magnitude in every respect: attendees, arrangements, content and information. It is an overwhelming experience for a first-timer, but totally worth it. The biggest takeaway is the sheer amount of knowledge and information on offer; how much you absorb depends on your own abilities, but getting the most out of it is what everyone will aim for.

There were five or six types of events held throughout the day on all five days. The ones that stood out for me were the poster sessions, e-lightning talks, oral sessions and the centennial plenary sessions.

The poster sessions let me see at a glance the research going on in various fields all over the world. No matter how hard I tried, I could not cover all the sections of the poster hall that piqued my interest. The e-lightning talks were a good way to strike up a conversation on the topic of the talks and get a discussion going among the attendees. Because they were structured as group discussions, I felt there was more interaction than at the other venues.

The oral sessions were a great place to learn how people are exploring their areas of interest and which methods and approaches they use. However, I felt it is hard for a presenter to cover everything important and relevant in the given time. The time constraints exist for a very valid reason, but a presenter might lose out on leads if the audience does not fully grasp the concept. Not all presenters were up to the mark; I could feel a stark difference between the TWC presenters (who knew how to get all the right points across) and the rest.

The centennial plenary sessions were special this year because AGU is celebrating its centennial. These sessions highlighted the best of research practices, innovations, achievements and studies. The time slots were very short, but the work spoke for itself.

The Exhibit Hall had all the companies and organisations that are in the field or related to it. Google, NASA and AGU held sessions, talks and events here as well. While Google and NASA focussed on showcasing the ‘Geo-’ aspect of their work, AGU focussed on the data aspect too, which was refreshing. They had sessions about data from the domain scientists’ point of view. This comes across as fundamental or elementary knowledge to us at TWC, but the way they are trying to enable domain scientists to communicate better with data scientists is commendable. AGU is also working on an initiative called “Make data ‘FAIR’ (Findable, Accessible, Interoperable, Reusable) again”, which is once again trying to spread awareness amongst the domain scientists. The exhibit hall is also a nice place to interact with industry, universities and organisations that have research programs for doctoral students and postdocs.

In retrospect, I think planning REALLY far ahead is a good idea, so that you know what to ditch and what not to miss. A ‘must attend’ list would have helped with the decision-making process. A group discussion at one of our meetings before AGU, where everyone shares what they find important, could be a good idea. Being just an audience member is great and one gets to learn a lot, but contributing to this event would be even better. This event was amazing and has given me a good idea of how to prepare the next time I attend.

 

Author: Categories: tetherless world Tags:

AGU Conference: Know Before You Go

January 29th, 2019

If this is your first American Geophysical Union (AGU) conference, be ready! Below are a few pointers for future first-timers.

The conference I attended was hosted in Washington, D.C. at the Walter E. Washington Convention Center during the week of December 10th, 2018. It brought together over 25,000 people. Until this conference, I had not experienced the pleasure and the power of so many like-minds in one space. The experience, while exhausting, was exhilarating!

One of the top universal concerns at the AGU Conference is scheduling. You should know that I was not naïve to the opportunities and scheduling difficulties prior to 2018, my first year of attendance. I had spent the last several months organizing an application development team that successfully created a faceted browsing app with calendaring for this particular conference using live data. Believe me when I say, “Schedule before you go”. Engage domain scientists and past participants about sessions, presentations, events, and posters that are a must-see. There is so much to learn at the conference. Do not miss the important stuff. The possibilities are endless, and you will need the expertise of those prior attendees. Plan breaks for yourself. Use those breaks to wander the poster hall, exhibit hall, or the vendor displays.

Key Elements in Scheduling Your Week

  • Do not front load your week. You need time to explore.
    • Be prepared to alter your existing schedule, as a result.
  • Plan on being exhausted.
  • Eat to fuel your body and your mind.
    • Relax, but not too much.
  • Plan on networking. To do that, you need to be sharp!
    • The opportunities to network will exceed your wildest expectations.
  • Take business cards – your own, and from people you meet.

Finally, take some time to see the city that holds the conference. There are many experiences to be had that will add to your education.

The Sessions

So. Many. Sessions!

There are e-lightning talks. There are oral sessions.  There are poster sessions. There are town hall sessions. There are scientific workshops. There are tutorial talks. There are keynotes. Wow!

The e-lightning talks are exciting. There is a lot of opportunity to interact in this presentation mode. The e-lightning talks are held in the Poster Hall. A small section provides chairs for about 15 to 20 attendees, with plenty of standing-room-only space. This informal session leads to great discussion amongst attendees. Be sure to put one of these in your schedule!

Oral sessions are what you would expect: people working on the topic sit in chairs at the front of the room, each gives a brief talk and then, time permitting, there is a Q&A session at the end. Remember, these panels are filled with knowledge. For the oral sessions you schedule, read the papers before attending. More importantly, have some questions prepared.

//Steps onto soapbox//

  1. If you are female, know the facts! (Nature International Journal of Science, 2018)
  2. Females are less likely to ask a question if a male asked a prior question.
  3. Get up there!
  4. Grab the mic!
  5. Ask the question anyway.
  6. Do NOT wait to speak with the presenters until afterwards. They are feeling just as overwhelmed as you are by all of the opportunities available to them at this conference.
  7. Please read the article referenced in item 1. The link is provided at the end of this post.

//Steps down from soapbox//

The poster sessions are a great way to unwind by getting in some walking. There are e-posters, which are presented on screens provided by AGU or the venue. There are the usual posters as well. The highlights of attending a poster session, besides the opportunity to stretch your legs, include the chance to practice meeting new people, ask in-depth questions on topics of interest, talk to the people doing the research, and check out the data being used for the research. You will want to have a notepad with you for the poster sessions. Don’t just take notes; take business cards! Remember, what makes poster sessions special is that they show the latest research that has not yet become a published paper. The person doing the research is quite likely the presenter of the poster.

All those special sessions (the town halls, the scientific workshops, the tutorial talks, and the keynotes) are the ones to ask prior attendees and experts about: which of them are the must-sees? Get them in your schedule. Pay attention. Take notes. Read the papers behind the sessions; if not the papers, the abstracts as a minimum. Have your questions ready before you go!

Timing

This is really important. Do NOT arrive without your time at this conference well planned. To do that you are going to need to spend several weeks preparing; reading papers, studying schedules, writing questions, and more. In order to have a really successful, time-well-spent type of experience, you are going to need to begin preparing for this immense conference by November 1st.

Oh, how I wish I had listened to all the people that told me this!

Put an hour per day in your calendar, from November 1st until AGU Conference Week, to study and prepare for this conference. I promise you will not regret the time you spent preparing.

The biggest thing to remember and the one thing that all attendees must do is:

Have a great time!

 

 

Works Cited

Nature International Journal of Science. (2018, October 17). Why fewer women than men ask questions at conferences. Retrieved from Nature International Journal of Science Career Brief: https://www.nature.com/articles/d41586-018-07049-x

 


TWC at AGU FM 2018

January 22nd, 2019

In 2018, AGU celebrated its centennial year. TWC had a good showing at this AGU, with 8 members attending and presenting on a number of projects.

We arrived in DC on Saturday night to attend the DCO Virtual Reality workshop organized by Louise Kellogg and the DCO Engagement Team, where researchers from the greater DCO community came together to present, discuss and understand how the use of VR can facilitate and improve both research and teaching. Oliver Kreylos and Louise Kellogg spent several sessions presenting the results of the DCO VR project, which involved recreating some of the visualizations commonly used at TWC, i.e., the mineral networks. For a preview of using the VR environment, check out these three tweets. Visualizing mineral networks in a VR environment has yielded some promising results: we observed interesting patterns in the networks, which need to be explored and validated in the near future.

With a successful pre-AGU workshop behind us, we geared up for the main event. First thing Monday morning was the “Predictive Analytics” poster session, which Shaunna Morrison, Fang Huang, and Marshall Ma helped me convene. The session, while low on submitted abstracts, was full of very interesting applications of analytics methods in various earth and space science domains.

Fang Huang also co-convened a VGP session on Tuesday, titled “Data Science and Geochemistry”. It was a very popular session, with 38 abstracts. It is very encouraging to see divisions other than ESSI hold Data Science sessions. This session also highlighted the work of many of TWC’s collaborators from the DTDI project. Kathy Fontaine convened an e-lightning session on data policy. This new format was very successful in drawing a large crowd to the event and enabled a great discussion on the topic. The day ended with Fang’s talk, presenting our findings from the network analysis of samples from the Cerro Negro volcano.

Over the next two days, many of TWC’s collaborators presented, but no one from TWC presented until Friday. Friday, though, was the busiest day for all of us from TWC. Starting with Peter Fox’s talk in the morning, Mark Parsons, Ahmed Eleish, Kathy Fontaine and Brenda Thomson all presented their work during the day. Oh yeah… and I presented too! My poster on the creation of the “Global Earth Mineral Inventory” got good feedback. Last but definitely not least, Peter represented the ESSI division during the AGU centennial plenary, where he talked about the future of Big Data and Artificial Intelligence in the Earth Sciences. The video of the entire plenary can be found here.

Overall, AGU18 was great. Beyond the talks mentioned above, multiple productive meetings and potential collaborations emerged from meeting various scientists and talking to them about their work. It was an incredible learning experience for me and the other students (for whom this was the first AGU).

As for other posters and talks I found interesting, I tweeted a lot about them during AGU. Fortunately, I also made a list of some interesting posters.


WebSci ’17 Tutorial Note– Analyzing Geolocated Data with Twitter

September 22nd, 2017

Speaker:

Prof. Bruno Gonçalves, New York University

(http://www.bgoncalves.com/)

Schedule

09:00 -10:20 theory session

10:30 -12:00 practical session

Theory Session:

GPS-enabled smartphone: provides precise geographic locations

January 2017 global digital snapshot

Social MEowDia Explained- different behaviors on different social media

Twitter:

Anatomy of a tweet: short text (Twitter started as a messaging system), hashtags, number of times shared, timestamp, location (comes from your GPS), and background info (metadata)

Metadata:

Text-content, User, Geo, URL, etc.

Geolocated Tweets:

Follows a user’s geo info over time

GPS Coordinates vs World Population

Smartphone ownership—highest among adults, higher education/ income levels (results from survey)

Market Penetration: larger user group in higher GDP countries

Age Distribution

Demographics: ICWSM’11 375(2011)

Language and Geography: different languages show different distributions among geographic location, for example, Spanish and English distributions in NYC

Multilayer Network:

Retweet- information layers

|

Mention

|

Follower- social layers

Link Function–ICWSM’ 11, 89 (2011)

Cluster—retweets ~= agreement; mention ~= discussion

Retweets and mention have very different meanings

The Strength of Ties: chains of ties

Interviews to find out how individuals found out about opportunities

Mostly from acquaintance or friend of friends

The study argued that the degree of overlap of two individuals’ social networks varies directly with the strength of their tie to one another.

Neighborhood Overlap
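
One common way to put a number on neighborhood overlap for a tie between A and B is to divide the neighbors they share by all the neighbors either of them has (leaving A and B themselves out). The sketch below assumes that definition and uses a made-up toy graph; the names are purely illustrative and are not taken from the tutorial material.

```python
def neighborhood_overlap(graph, a, b):
    """Shared neighbours of a and b divided by all their neighbours (a and b excluded)."""
    neighbours_a = graph[a] - {a, b}
    neighbours_b = graph[b] - {a, b}
    union = neighbours_a | neighbours_b
    if not union:
        return 0.0
    return len(neighbours_a & neighbours_b) / len(union)


# Hypothetical undirected friendship graph: node -> set of neighbours.
toy_graph = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "bob", "eve"},
    "eve": {"dan"},
}

strong_tie = neighborhood_overlap(toy_graph, "ann", "bob")  # both sit in the same circle
weak_tie = neighborhood_overlap(toy_graph, "dan", "eve")    # a bridge-like tie
```

In this toy graph the ann/bob tie has full overlap while the dan/eve tie has none, which is the pattern the strength-of-ties argument predicts for strong versus weak ties.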

Network Structures: arrows-retweets; cluster-different friendship communities; dots- users; people/user serves as a bridge between communities.

Links: internal, between groups, intermediary, etc.

Groups

Geography

Retweet- information layers

|

Mention

|

 Follower- social layers

|

Geographic location

Twitter follower distance

Locality: measures the percentage of a user’s friends who live in the same country.
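
As a rough illustration of the locality measure, here is a minimal sketch that assumes each friend has already been resolved to a country code; how that resolution is done (profile location, geotagged tweets, etc.) is outside these notes, and the function name is hypothetical.

```python
def locality(user_country, friend_countries):
    """Fraction of a user's friends located in the same country as the user."""
    if not friend_countries:
        return 0.0
    same = sum(1 for country in friend_countries if country == user_country)
    return same / len(friend_countries)


# Hypothetical user in Brazil with four friends.
example = locality("BR", ["BR", "BR", "US", "PT"])
```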

Co-occurrences and social ties

Geotagged Flickr Photos

Divide the world into a grid and count the number of cells in which two individuals were both present within a given time interval

Measure: photos shared from the same grid cell within a period of time, as an indicator of the likelihood of becoming friends
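
The grid idea can be sketched in a few lines. This is only an illustration under my own assumptions (square cells measured in degrees, timestamps in seconds, a one-day window); the cell size and interval used in the original study are not recorded in these notes.

```python
import math
from collections import defaultdict


def cell_of(lon, lat, cell_size_deg):
    """Map a lon/lat point to the integer index of its grid cell."""
    return (math.floor(lon / cell_size_deg), math.floor(lat / cell_size_deg))


def co_occurrence_cells(visits_a, visits_b, cell_size_deg=1.0, max_gap_s=86400):
    """Count distinct cells where both people appeared within max_gap_s seconds."""
    times_b = defaultdict(list)
    for t, lon, lat in visits_b:
        times_b[cell_of(lon, lat, cell_size_deg)].append(t)
    shared = set()
    for t, lon, lat in visits_a:
        cell = cell_of(lon, lat, cell_size_deg)
        if any(abs(t - tb) <= max_gap_s for tb in times_b.get(cell, [])):
            shared.add(cell)
    return len(shared)


# Hypothetical photo logs: (unix_time_s, lon, lat).
alice = [(0, -73.99, 40.73), (100000, 2.35, 48.86), (500000, -0.12, 51.50)]
bob = [(3600, -73.20, 40.10), (900000, 2.10, 48.20), (520000, -0.90, 51.90)]

shared_cells = co_occurrence_cells(alice, bob)
```

Here the two logs share a cell within a day in two places; the third shared cell does not count because the visits are more than a day apart.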

Mobility: school/work—home—vacation—move to different city/country

Airline Flights: in Europe within 24h

Commuting: train, subway, bus, etc.

Realistic Epidemic Spreading

Human Mobility: Statistical Model

Privacy (Sci Rep 3, 1376(2013))

How many indicators do we need to identify a unique person?

Mobility and Social Network (PLoS One 9, E92196 (2014))

Geo-Social Properties- Matrix of social behavior over distance: Probability of a link, reciprocity, Clustering, Triangle disparity

Geo-Social Model:

Starting position of user u

Either visit a random neighbor or jump to a new location

New position of u

Model fitting: probability of visiting old friend vs meeting new friend
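
To make the branching step concrete, here is a loose sketch of a single move: with some probability the user visits the location of a random friend, otherwise they jump to a new location. The uniform jump, the parameter name p_visit and the toy data are all my assumptions; the published model has more detail than these notes capture.

```python
import random


def next_position(user, positions, friends, p_visit, rng,
                  bounds=(-180.0, -90.0, 180.0, 90.0)):
    """One step: visit a random friend's location with probability p_visit, else jump."""
    candidates = friends.get(user, [])
    if candidates and rng.random() < p_visit:
        return positions[rng.choice(candidates)]
    min_lon, min_lat, max_lon, max_lat = bounds
    # Assumption: a "new location" is drawn uniformly from the bounding area.
    return (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))


# Hypothetical users with (lon, lat) positions.
positions = {"u": (0.0, 0.0), "v": (10.0, 10.0), "w": (-5.0, 20.0)}
friends = {"u": ["v", "w"]}
rng = random.Random(42)

visit = next_position("u", positions, friends, p_visit=1.0, rng=rng)
jump = next_position("u", positions, friends, p_visit=0.0, rng=rng)
```

Fitting the model then amounts to choosing p_visit (old friend versus new place) so that simulated behaviour matches the observed data.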

Human Diffusion: how people move around on the map (J. R. Soc. Interface 12, 20150473 (2015))

Residents and Tourists

City Communities

Practical Session:

https://github.com/bmtgoncalves/WebSci17

Environment Requirement: anaconda & python

Registering an Application

API basics

The twitter module provides the OAuth interface; we just need to provide the right credentials.

Best to keep the credentials in a dict and parametrize our calls with the dict key. This way we can switch accounts easily.

.Twitter(auth) takes an OAuth instance as an argument and returns a Twitter object.

Authenticating with the API

In the remainder of this course, the accounts dict will live inside the twitter_accounts.py file.
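
A minimal sketch of what such a file might look like, assuming the Python Twitter Tools package (imported as twitter). The account name "social", the key names and the helper make_client are placeholders of mine, not necessarily what the tutorial repository uses.

```python
# twitter_accounts.py (placeholder values; never commit real credentials)
accounts = {
    "social": {
        "api_key": "API_KEY",
        "api_secret": "API_SECRET",
        "token": "TOKEN",
        "token_secret": "TOKEN_SECRET",
    },
}


def make_client(account_name):
    """Build a Twitter object for the named account (hypothetical helper)."""
    import twitter  # imported lazily so the dict can be used without the package

    app = accounts[account_name]
    auth = twitter.OAuth(app["token"], app["token_secret"],
                         app["api_key"], app["api_secret"])
    return twitter.Twitter(auth=auth)
```

Switching accounts then only means passing a different key, e.g. make_client("social").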

4 basic types of objects: tweets, users, entities, places.

Searching for Tweets

.search.tweets(query, count)  https://dev.twitter.com/docs/api/1.1/get/search/tweets

  • query is the content to search for
  • count is the maximum number of results to return (from most recent tweets)

returns dict with a list of ‘statuses’

Social Connections

.friends.ids() and .followers.ids() return a list of up to 5000 of a user’s friends or followers for a given screen_name or user_id.

The result is a dict containing multiple fields.

User Timeline

.statuses.user_timeline() returns a set of tweets posted by a single user.

Important options:
include_rts = ‘true’ to include retweets

count = 200 is the maximum number of tweets returned in each call

trim_user = ‘true’ to omit the user information

max_id = 1234 to include only tweets with an id lower than or equal to 1234

Each call returns at most 200 tweets; with multiple calls you can get up to a user’s 3200 most recent tweets
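
The paging pattern can be sketched without touching the network. The loop below assumes max_id is treated as inclusive (as in the v1.1 API documentation), hence the minus one when moving to the next page. fake_user_timeline is a stand-in I made up so the example runs offline; in practice a wrapper around .statuses.user_timeline() would take its place.

```python
def fetch_all(fetch_page, page_size=200, limit=3200):
    """Collect a timeline by repeatedly asking for tweets older than the oldest seen."""
    tweets = []
    max_id = None
    while len(tweets) < limit:
        page = fetch_page(count=page_size, max_id=max_id)
        if not page:
            break
        tweets.extend(page)
        # max_id is inclusive, so step just below the oldest id already collected.
        max_id = min(t["id"] for t in page) - 1
    return tweets[:limit]


# Stand-in for the real API call: 450 fake tweets, newest first.
FAKE_TIMELINE = [{"id": i} for i in range(450, 0, -1)]


def fake_user_timeline(count, max_id=None):
    eligible = [t for t in FAKE_TIMELINE if max_id is None or t["id"] <= max_id]
    return eligible[:count]


collected = fetch_all(fake_user_timeline)
```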

Social Interaction

Data processing extended from user timeline

NetworkX–networkx_demo.py

High-productivity software for complex networks

Comes with Anaconda

Simple Python interface

Four different types of graphs

  • Graph—undirected graph
  • DiGraph—directed graph
  • MultiGraph—multi-edged graph
  • MultiDiGraph—multi-edged directed graph

Similar interface for all graphs

Nodes can be any type of python object

Growing graph—add nodes, edges, etc.

Graph Properties

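  • .nodes() returns the graph’s nodes
  • .edges() returns the graph’s edges
  • .degree() returns the degree of each node; .in_degree()/.out_degree() for DiGraph
  • networkx.is_connected(G) (a module-level function, not a method)
  • networkx.is_weakly_connected(G)/networkx.is_strongly_connected(G)
  • networkx.connected_components(G)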

Snowball Sampling–snowball.py

Commonly used in Social Science and Computer Science

  • Start with a single node
  • Get friends list
  • For each friend get the friend list
  • Repeat for a fixed number of layers or until enough

Generates a connected graph (a single connected component)
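
The steps above can be sketched as a breadth-first crawl. This is not the tutorial’s snowball.py; get_friends is a stand-in for whatever returns a user’s friend list (for example a wrapper around .friends.ids()), and the toy data is invented so the example runs offline.

```python
from collections import deque


def snowball(seed, get_friends, max_layers=2, max_nodes=1000):
    """Breadth-first crawl outwards from seed; returns a list of (user, friend) edges."""
    seen = {seed}
    edges = []
    queue = deque([(seed, 0)])
    while queue:
        user, layer = queue.popleft()
        if layer >= max_layers:
            continue  # deepest layer: do not expand these users further
        for friend in get_friends(user):
            if friend not in seen:
                if len(seen) >= max_nodes:
                    continue  # "enough": stop admitting new users
                seen.add(friend)
                queue.append((friend, layer + 1))
            edges.append((user, friend))
    return edges


# Hypothetical friend lists standing in for API calls.
FRIENDS = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["e"], "e": []}

edges = snowball("a", lambda user: FRIENDS.get(user, []), max_layers=2)
```

Because every sampled user is reached through a chain of friend links from the seed, the resulting edge list always describes one connected piece of the network.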

Streaming Geocoded data–twitter_location.py

The streaming API provides real-time data, subject to filters

Use a TwitterStream object instead of a Twitter object

  • .statuses.filter(track = q) will return tweets that match the query q in real time
  • returns a generator that you can iterate over
  • .statuses.filter(locations = bb) will return tweets that occur within the bounding box bb in real time

bb is a comma-separated pair of lon/lat coordinates: the southwest corner followed by the northeast corner.
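
A small sketch of building the bb string and checking a point against the same box locally. It assumes the order is southwest corner first, longitude before latitude; the New York coordinates are approximate and only illustrative, and both helper names are mine.

```python
def bbox_param(min_lon, min_lat, max_lon, max_lat):
    """Format a box as 'min_lon,min_lat,max_lon,max_lat' (southwest corner first)."""
    return ",".join(str(v) for v in (min_lon, min_lat, max_lon, max_lat))


def in_bbox(lon, lat, bbox):
    """True if the point lies inside (or on the edge of) the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


NYC = (-74.26, 40.48, -73.70, 40.92)  # approximate, for illustration only
bb = bbox_param(*NYC)
```

The string bb is what would be passed as the locations argument, while in_bbox is handy for double-checking the tweets that come back.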

Shapefiles

Open specification developed by ESRI, still the current leader in commercial GIS software

Shapefiles aren’t actually single files, but a set of files sharing the same name with different extensions.

The actual set of files changes depending on the contents, but 3 files are usually present:

  • .shp: also commonly referred to as the shapefile; contains the geometric info
  • .dbf: a simple database containing the feature attribute table
  • .shx: a positional index of the geometries in the .shp file

QGIS

Pyshp–shapefile_load.py

Pyshp defines utility functions to load and manipulate shapefiles programmatically.

The shapefile module handles the most common operations:

  • .Reader(filename) returns a reader object
  • reader.records()/iterRecords()
  • reader.shapes()/iterShapes()
  • reader.shapeRecords()/iterShapeRecords()

shape objects contain several fields:
bbox: lower-left and upper-right x,y coordinates (lon/lat)

Simple shapefile plot–plot_shapefile.py

Shapely–shapefile_shape_properties.py

Shapely defines geometric objects under shapely.geometry

                   Point, Polygon, MultiPolygon, shape()

And common operations

                   .crosses, .contains, etc.

shape objects provide useful fields to query a shape’s properties:

                    .centroid, .area, .bounds, etc.

Filter Points with a shapefile–shapefile_filter.py
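
To show the idea behind filtering points with a polygon, here is a pure-Python ray-casting test standing in for a geometry library’s contains check. It is not the tutorial’s shapefile_filter.py; the square and the points are invented, and a real shapefile polygon would supply the ring of vertices instead.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given as a list of (x, y)?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical region and candidate points.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
points = [(1.0, 1.0), (5.0, 2.0), (3.5, 3.9), (-1.0, 2.0)]

kept = [p for p in points if point_in_polygon(p[0], p[1], square)]
```

Checking the polygon’s bbox first is a cheap way to discard most points before running the full test.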

Twitter Places–shapefile_filter_places.py

Twitter defines a “coordinates” field in tweets

There is also a place field that we glossed over

The place object also contains geographic info, but at a coarser resolution than the coordinates field

Each place has a unique place_id, a bounding_box and some geographical information such as country and full_name.

Places can be of several different types: admin, city, neighborhood, poi

Place Attributes: Key, street_address, phone, post_code, region, iso3, twitter, URL, app:id, etc.
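
A short sketch of how the two fields can be combined: prefer the exact coordinates when a tweet has them, otherwise fall back to the centre of the place bounding box. It assumes the v1.1 tweet JSON layout (coordinates as [lon, lat], bounding_box as a polygon ring); the sample tweets are made up, and averaging the corners is just one simple choice of representative point.

```python
def tweet_location(tweet):
    """Return (lon, lat, source) for a tweet dict, or None if it has no geography."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]
        return lon, lat, "coordinates"
    place = tweet.get("place")
    if place and place.get("bounding_box"):
        ring = place["bounding_box"]["coordinates"][0]
        # Coarser fallback: average the corners of the bounding box.
        lon = sum(p[0] for p in ring) / len(ring)
        lat = sum(p[1] for p in ring) / len(ring)
        return lon, lat, "place"
    return None


# Hypothetical tweets, trimmed to the fields used here.
exact = {"coordinates": {"type": "Point", "coordinates": [-73.99, 40.73]}, "place": None}
coarse = {
    "coordinates": None,
    "place": {
        "place_type": "city",
        "bounding_box": {
            "type": "Polygon",
            "coordinates": [[[-74.0, 40.0], [-72.0, 40.0], [-72.0, 42.0], [-74.0, 42.0]]],
        },
    },
}

exact_loc = tweet_location(exact)
coarse_loc = tweet_location(coarse)
```

Keeping the source label alongside the point makes it easy to treat exact and place-level locations differently later on.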

Filter points and places–plot_shapefile_points.py   

Aggregation–shapefile_filter_aggregate.py


Get Off Your Twitter

August 25th, 2017

Web Science, more so than many other disciplines of Computer Science, has a special focus on its humanist qualities – no surprise in that the Web is ultimately an instrument for human expression and cooperation. Naturally, lots of current research in Web Science centers on people and their patterns of behavior, making social media a potent source of data for this line of work.

 

Accordingly, much time has been devoted to analyzing social networks – perhaps to a fault. Much of the ACM’s Web Science ‘17 conference centered on social media; more specifically, Twitter. While it may sound harsh, the reality is that many of the papers presented at WebSci’17 could be reduced to the following pattern:

  1. There’s Lots of Political Polarization
  2. We Want to Explore the Political Landscape
  3. We Scraped Twitter
  4. We Ran (Sentiment Analysis/Mention Extraction/etc.)
  5. and We Found Out Something Interesting About the Political Landscape

Of the 57 submissions included in the WebSci’17 proceedings, 17 mention ‘Twitter’ or ‘tweet’ in the abstract or title; that’s about 3 out of every 10 submissions, including posters. By comparison, only seven mention Facebook, with some submissions mentioning both.

 

This isn’t to demean the quality or importance of such work; there’s a lot to be gained from using Twitter to understand the current political climate, as well as loosely quantifying cultural dynamics and understanding social networks. However, this isn’t the only topic in Web Science worth exploring, and Twitter certainly shouldn’t be the ultimate arbiter of that discussion. While Twitter provides a potent means for understanding popular sentiment via a well-controlled dataset, it is still only a single service that attracts a certain type of user and is better for pithy sloganeering than it is for deep critical analysis, or any other form of expression that can’t be captured in 140 characters.

 

One of my fellow conference-goers also noticed this trend. During a talk on his submission to WebSci’17, Helge Holzmann, a researcher from Germany working with Web archives, offered a truism that succinctly captures what I’m saying here: that Twitter ought not to be the only data source researchers are using when doing Web Science.

In fact, I would argue that Mr. Holzmann’s focus, Web archives, could provide a much richer basis for testing our cultural hypotheses. While more old school, Web archives capture a much, much larger and more representative span of the Web, from its inception to the dawn of social media, than Twitter could ever hope to.

 

The winner for Best Paper speaks directly to the new possibilities offered by working with more diverse datasets. Applying a deep learning approach to Web archives, the authors examined the evolution of front-end Web design over the past two decades. Admittedly, I wasn’t blown away by their results; they claimed that their model had generated new Web pages in the style of different eras, but didn’t show an example, which was underwhelming. But that’s beside the point; the point is that this is a unique task which couldn’t be accomplished by leaning exclusively on Twitter or any other social media platform.

 

While I remain critical of the hyper-focus of the Web Science community on social media sites – and especially Twitter – as a seed for its work, I do admire the willingness to wade into cultural and other human-centric issues. This is a rare trait in technological disciplines in general, but especially fields of Computer Science; you’re far more likely to read about gains in deep reinforcement learning than you are to read about accommodating cultural differences in Web use (though these don’t necessarily exclude each other). To that point, the need to provide greater accessibility to the Web for disadvantaged groups and to preserve rapidly-disappearing Web content were widely noted, leaving me optimistic for the future of the field as a way of empowering everyone on the Web.

 

Now time to just wean ourselves off Twitter a bit…
