School of Computer Engineering/PDCC/SANDS Research Group @ NTU Singapore | A*Star

mTeam: A Creative Environment For Mobile Knowledge Workers

Overview

This project investigated a full spectrum of research topics and technologies to enable collaboration among knowledge workers: algorithmic foundations and basic research on networked distributed systems, which support upper-layer functionality and ubiquity; social network analysis, to facilitate new collaboration opportunities; and the collaboration tools themselves, used to carry out tasks and manage such collaborations.

Status: This project (funded by A*Star from March 2008 to August 2011) has ended successfully, achieving all its scientific objectives. It was carried out in collaboration with Dr. Adam Wierzbicki from the Polish-Japanese Institute of Information Technology.

Research highlights

Some of the important achievements of this project are described next. The accompanying figure also provides an overview of the different components studied in the project, and the overall mTeam roadmap.


Novel communication primitives for information dissemination: 

We looked into several families of communication and information dissemination problems.

In collaborative applications, updates need to be propagated to all users in a timely and reliable manner. In ad-hoc settings, this may be non-trivial. We proposed a multicast overlay whose topology is chosen to optimize the latency, cost and stability of the overlay [C3,J5], taking into account the heterogeneity of network usage and resources among the participants. The notion of such an optimized overlay is novel despite the large body of work on other kinds of multicast overlays. Such multi-objective optimization of the multicast overlay topology becomes essential when the overlay is used to disseminate updates for collaborative applications in a heterogeneous environment. The implementation of this multi-objective optimized multicast overlay (MoMo) is available as open source [http://code.google.com/p/momocomm/].
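
To make the idea of multi-objective topology selection concrete, the following is a minimal sketch and not the MoMo algorithm itself. It assumes each candidate overlay topology has already been scored on three objectives to be minimized (latency, cost, instability), and keeps only the Pareto-optimal candidates. The function names and the example scores are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical objective vector: (latency, cost, instability), all minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Dict[str, Objectives]) -> Dict[str, Objectives]:
    """Keep the candidate topologies that no other candidate dominates."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


# Illustrative scores only; they are not measurements from the project.
topologies = {
    "star": (10.0, 9.0, 0.5),
    "chain": (40.0, 3.0, 0.2),
    "balanced": (20.0, 5.0, 0.3),
    "bad": (45.0, 9.0, 0.6),
}
front = pareto_front(topologies)
```

Here "bad" is dropped because "star" is at least as good on every objective, while the other three represent genuine trade-offs that an application would have to choose among.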

Besides updates for explicit collaborative activities within a collaboration session, large-scale systems need basic substrates for information dissemination and notification. Web-based publish/subscribe (RSS feeds) is a standard mechanism, but it requires information consumers to continuously pull content from the source to emulate push behavior. We looked at mechanisms to disseminate updates in a peer-to-peer manner among the information consumers, to alleviate server load and to enhance the scalability of publish/subscribe. Using peers to carry out the dissemination, however, exposes it to different kinds of vulnerabilities (e.g., content pollution, denial-of-service, free-riding). We investigated mechanisms to seamlessly secure peer-to-peer dissemination of RSS feeds [C1,C2,J4].

RSS-based publish/subscribe assumes that information consumers explicitly subscribe to specific information from a specific (single) source. However, one is often not even aware of what relevant information exists in a virtual community (for example, an online social network). We designed a gossip-based mechanism, GoDisco [C15], for many-to-many information dissemination without explicit subscription, using social network links and semantic affinity. It attempts to achieve high recall and precision while keeping the spamming effect minimal.
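
The general flavor of affinity-filtered gossip over social links can be sketched as follows. This is a simplified illustration and not the GoDisco protocol: it assumes each user has a set of interest tags, measures affinity with Jaccard similarity, and forwards a message to at most `fanout` not-yet-informed friends whose affinity reaches a threshold. All names, parameters and the threshold value are hypothetical.

```python
import random
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def selective_gossip(
    friends: Dict[str, List[str]],
    interests: Dict[str, Set[str]],
    source: str,
    topics: Set[str],
    fanout: int = 2,
    threshold: float = 0.25,
    seed: int = 0,
) -> Set[str]:
    """Return the set of users reached when each informed user forwards the
    message to at most `fanout` sufficiently interested, uninformed friends."""
    rng = random.Random(seed)
    informed = {source}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            candidates = [
                f
                for f in friends.get(node, [])
                if f not in informed
                and jaccard(interests.get(f, set()), topics) >= threshold
            ]
            for peer in rng.sample(candidates, min(fanout, len(candidates))):
                informed.add(peer)
                next_frontier.append(peer)
        frontier = next_frontier
    return informed


friends = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
interests = {"a": {"p2p"}, "b": {"p2p", "storage"}, "c": {"music"}, "d": {"p2p"}}
reached = selective_gossip(friends, interests, "a", {"p2p"}, fanout=5)
```

In this toy network the message reaches "b" and "d" but never "c", whose interests do not overlap with the message topics. Raising the threshold trades recall for precision, which is the tension the paragraph above describes.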

Distributed Hash Tables (DHTs) have emerged as an important primitive for point-to-point communication and distributed indexing. We first developed a DHT called Oscar [J2] that adapts to heterogeneity (of network, workload, etc.), and investigated the different ways such overlays can be bootstrapped [B2]. In FuzzyNet [J3] we explored possibilities to achieve routing guarantees even when the traditionally assumed ring invariant cannot always be achieved (which is typical in large-scale dynamic environments). Finally, we designed a new, secure, address-independent point-to-point communication mechanism inspired by the ideas of DHTs and virtual ring routing (VRR). The security of this novel DHT [C4], which we call SocialCircle, against attacks such as Sybil attacks and free riding is derived from using only trusted social links for each overlay hop.
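
For readers unfamiliar with the ring abstraction that DHTs commonly assume, here is a minimal sketch of key placement on an identifier ring (generic consistent hashing). It is not the design of Oscar, FuzzyNet or SocialCircle, and it omits routing tables entirely: a key is simply owned by the first node clockwise from the key's identifier. The class and function names are hypothetical.

```python
import bisect
import hashlib
from typing import Iterable


def ring_id(name: str, bits: int = 32) -> int:
    """Map a node name or key onto a ring of size 2**bits using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """A static view of the ring: nodes sorted by their identifiers."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self._points = sorted((ring_id(n), n) for n in nodes)

    def successor(self, key: str) -> str:
        """Return the node responsible for `key` (first node clockwise)."""
        ids = [point for point, _ in self._points]
        index = bisect.bisect_left(ids, ring_id(key)) % len(self._points)
        return self._points[index][1]


ring = Ring(["n1", "n2", "n3", "n4"])
owner = ring.successor("report.mm")
```

A useful property of this placement is that removing a node only reassigns the keys that node owned; keys owned by other nodes stay where they are.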

The Polish project partner investigated and implemented a generic communication library (P2PP) which can be used to carry out the actual communication across heterogeneous communication channels, thus providing portability and transparency with respect to the underlying network (wired, wireless/cellular, NAT/firewall issues, etc.). The basic concepts of how such portability can be achieved have been summarized in a survey book chapter [B3].

It is worth pointing out at this juncture that while GoDisco and SocialCircle clearly belong to the networked distributed systems compartment of the project, they nevertheless borrow strongly from social network analysis. This places us among the pioneering research groups working at the interface of distributed information systems and social/collaboration systems, which leads to a new flavor of research, namely "social information systems".


(Quasi-)persistent decentralized storage service:  

As with communication mechanisms, reliable storage is another important basic ingredient of any distributed system. Providing reliable storage becomes particularly challenging without dedicated servers, a scenario which arises under many circumstances, including nomadic and ad-hoc collaboration, but also decentralized online social networks (DOSNs), massively multi-player peer-to-peer online gaming, etc. Redundancy is essential for reliability in p2p/distributed network storage. While the literature on distributed storage systems is rather rich, a systematic way to compare different approaches had been lacking. We surveyed the existing literature [B4] and carried out a systematic comparison of existing techniques, along with proposing novel garbage collection mechanisms [C5] for better network resource utilization. The work also resulted in a general-purpose p2p storage system simulator (p2p3s), which we provide openly [http://code.google.com/p/p2p3s/] so that the research community can test and compare new algorithms of this kind systematically.
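
As a small worked example of why the choice of redundancy scheme matters, the sketch below compares the availability of an object under plain replication and under an (n, k) erasure code. It assumes, as a strong simplification, that each peer is online independently with the same probability p. It is unrelated to the p2p3s code base, and the function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of `replicas` full copies is online."""
    return 1.0 - (1.0 - p) ** replicas


def erasure_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n coded fragments are online, which is
    what an (n, k) erasure code needs to reconstruct the object."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Both configurations below store twice the object size in total
# (2 full copies versus 8 fragments of 1/4 size each).
p_online = 0.7
with_replication = replication_availability(p_online, 2)
with_coding = erasure_availability(p_online, 8, 4)
```

Under the independence assumption, the two functions let one compare schemes at equal storage overhead; real deployments also have to account for repair traffic, which is where maintenance and garbage collection strategies come in.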

State-of-the-art P2P storage systems do not provide consistency guarantees under multiple-reader, multiple-writer scenarios. Such a feature is desirable, and indeed necessary, for supporting diverse collaborative applications. We developed a highly scalable and efficient algorithm called consistency maintenance through virtual servers (CMV) [J1]. In this algorithm, the consistency of each dynamic file is maintained by a Virtual Server (VS). A file update can only be accepted through the VS, which ensures one-copy serializability. The VS of a file is a logical network composed of multiple Replica Peers (RPs) that hold replicas of the file. A mathematical analysis determines the parameter choices that minimize the message overhead of maintaining file consistency. CMV can be used as the synchronization core of our collaboration middleware for real-time as well as asynchronous activities.
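
The role of a single acceptance point for updates can be illustrated with the toy model below. It is a loose sketch under stated assumptions and not the CMV protocol from [J1]: it assumes the virtual server tracks one version counter per file, rejects writes based on a stale version, and pushes each accepted update to every replica in the same order. The class, method and replica names are hypothetical.

```python
from typing import Dict, List, Tuple

Update = Tuple[int, str, str]  # (version, writer, change)


class VirtualServer:
    """Toy ordering point for one file; all replicas see the same update log."""

    def __init__(self, replica_names: List[str]) -> None:
        self.version = 0
        self.replicas: Dict[str, List[Update]] = {name: [] for name in replica_names}

    def submit(self, writer: str, base_version: int, change: str) -> bool:
        """Accept the update only if the writer saw the latest version."""
        if base_version != self.version:
            return False  # stale write: the writer must re-read and retry
        self.version += 1
        for log in self.replicas.values():
            log.append((self.version, writer, change))
        return True


vs = VirtualServer(["rp1", "rp2", "rp3"])
first = vs.submit("alice", 0, "add node 'Budget'")
stale = vs.submit("bob", 0, "rename root")   # bob has not seen alice's update
retry = vs.submit("bob", 1, "rename root")   # succeeds after re-reading
```

Because every accepted update passes through one ordering point, all replica logs are identical, which is the intuition behind one-copy serializability. The optimistic version check used here is an illustrative choice, not something the source specifies.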

For security reasons, as well as for other considerations such as access patterns, a special class of peer-to-peer storage in which individuals use only their friends' storage devices may become essential. We studied the viability of such "restrictive" friend-to-friend storage systems [C18].

We studied a new threshold-based secret sharing mechanism for distributed storage systems [C6], in which users need a means to back up and recover their private keys in a network of untrusted servers. This is important to facilitate anytime, anywhere access to data without having to carry all the encryption/decryption keys in person. Simple threshold-based secret sharing is inadequate in such an environment, since the delegates keeping the secret shares may collude to steal the user's private keys. An adversary may take control of users' machines, infect them with malicious software, and use them for further attacks. This can lead to an epidemic that eventually makes the whole system collapse. To mitigate this problem, we proposed two techniques to improve system security: selecting only the most reliable delegates to keep the shares, and additionally encrypting the shares with passwords. We developed a mechanism to select the most reliable delegates based on an effective trust measure. Specifically, the relationships among the secret owner, the delegate candidates and their related friends are used to estimate the trustworthiness of a delegate. This trust measure minimizes the likelihood of the secret being stolen by an adversary and is shown to be effective against various collusive attacks.
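
For reference, the baseline that the paragraph above calls "simple threshold-based secret sharing" can be sketched with Shamir's scheme over a prime field. This shows only the baseline: the trust-based delegate selection and password encryption of shares proposed in [C6] are not modeled. The field size and function names are choices made for this illustration, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime, large enough for a toy secret

Share = Tuple[int, int]  # (x, f(x))


def split(secret: int, n: int, k: int) -> List[Share]:
    """Split `secret` into n shares such that any k of them recover it."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]


def recover(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split(123456789, n=5, k=3)
recovered = recover(shares[:3])
```

Any three of the five shares reconstruct the secret, which is exactly why colluding delegates are a concern: the scheme itself cannot tell a legitimate recovery from three delegates pooling their shares.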


Security & trust issues:

Security issues arise in diverse forms and have been addressed in their specific contexts. This includes the work on communication primitives [C1,C2,C4] as well as on storage systems [C6].
 

After carrying out a literature survey on existing trust mechanisms [B5], we are currently exploring novel computational trust models (based on stereotypes) and how they can be applied in various social information systems [C7,C10,C11].

Social network analysis for expert identification and team recommendation:

We surveyed the literature on interdisciplinary matchmaking [B1] in order to better understand how a team can be composed based on coverage of diverse expertise needs as well as on the social cohesiveness of the team, derived from the analysis of multi-dimensional social networks.
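
The combination of expertise coverage and social cohesiveness can be illustrated with the brute-force sketch below. It is not the T-RecS algorithm: it assumes a single social network given as a set of undirected ties, measures cohesiveness as the fraction of team-member pairs that are connected, and enumerates all teams of a fixed size, which is only feasible for small inputs. All names and data are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Set, Tuple


def cohesiveness(team: Tuple[str, ...], ties: Set[FrozenSet[str]]) -> float:
    """Fraction of member pairs that share a social tie (1.0 for a solo team)."""
    pairs = list(combinations(sorted(team), 2))
    if not pairs:
        return 1.0
    return sum(1 for pair in pairs if frozenset(pair) in ties) / len(pairs)


def recommend_team(
    skills: Dict[str, Set[str]],
    ties: Set[FrozenSet[str]],
    required: Set[str],
    size: int,
) -> Optional[Tuple[str, ...]]:
    """Among teams of `size` covering all required skills, pick the most cohesive."""
    best, best_score = None, -1.0
    for team in combinations(sorted(skills), size):
        covered = set().union(*(skills[member] for member in team))
        if required <= covered:
            score = cohesiveness(team, ties)
            if score > best_score:
                best, best_score = team, score
    return best


skills = {"ann": {"ml"}, "bob": {"networks"}, "cat": {"networks"}, "dan": {"ml"}}
ties = {frozenset(("ann", "cat"))}
team = recommend_team(skills, ties, {"ml", "networks"}, size=2)
```

Several pairs cover both required skills here, but only "ann" and "cat" already know each other, so that pair is recommended.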

In order to validate the basic ideas of expert identification and team recommendation, access to semantic social network data is essential. We have devoted effort to procuring such data from various (open/semi-open) sources, including online creative communities such as Kompoz (www.kompoz.com), Wikipedia and academic networks.

Analysis of the Wikipedia dataset has given us a basic understanding of how collaboration works in an open environment [C9,J6]. A basic analysis of the academic network of NTU faculty members, identifying their expertise and social networks in order to recommend teams, has been carried out and is demonstrated using T-RecS [I4,D3,C17]. Online collaboration in browsing and searching for credible information with the help of experts has been investigated by developing SocialCOBS/FAST [I5,D1,C16]. These works were enabled by complementary techniques such as [C12,C13].


Applications & software

We pursued artifact-driven research, demonstrating and validating the research ideas with flagship applications which are also useful in their own right. Experience from such artifact-driven research also provided us with essential feedback, which we used to refine the ideas as well as to identify new research opportunities inspired by real-life challenges. Below we describe three such applications we developed. A fourth flagship application, a social library, has also been pursued intermittently [B6,C8].


SharedMind: A collaborative mind-mapping software

Available at http://code.google.com/p/sharedmind/ 

The collaboration middleware (and the corresponding underlying components from networked distributed systems) was validated using an open-source collaborative application that we built by extending an existing open-source single-user mind-mapping application (FreeMind, freemind.sourceforge.net). We chose FreeMind because it is a widely used utility and yet simple enough that we could focus on validating our fundamental contributions, such as the multi-objective optimized multicast overlay to relay updates and real-time and asynchronous consistency maintenance, instead of getting encumbered by the vagaries of a complex application. We provide our collaborative extension as a plugin for FreeMind called SharedMind. The main concepts in realizing SharedMind as a complete collaborative application derived from the basic primitives are explained in [B6], and the application was demonstrated at ICME 2010 [D2]. We extended this work with support for Android-based mobile devices, which was awarded one of three third prizes in the Singapore Android Developer Challenge 2010.


T-RecS: Multidisciplinary Team Recommendation System

Available at http://sands.sce.ntu.edu.sg/T-RecS/  

T-RecS is a vehicle to carry out, validate and evaluate our expert and multidisciplinary team recommendation algorithms, which are driven by multi-dimensional social network analysis. It is also a utility for exploring NTU's researchers according to their expertise and collaboration (social) connections. Subject to access to relevant datasets, we hope to extend it to include information about researchers from all Singapore-based universities and research institutes. Such team exploration can be a handy tool for searching for and identifying a team of experts to carry out complex tasks, and may be useful when forming teams to apply for research grants as well as when looking for help and collaborators from communities with which an individual has no direct acquaintance.


SocialCobs: Collaborative browsing and search communities

Available at http://code.google.com/p/socialcobs/

Quickly finding relevant and reliable information on the web is a non-trivial task. While internet search engines do find web pages matching a set of keywords, they often cannot ensure the relevance or reliability of the pages' content. A promising approach tries to harness internet users in the spirit of Web 2.0, i.e., users contributing to the web. The idea is that users collaboratively search or browse for information, either directly by communicating or indirectly by adding meta information (e.g., tags, ratings, comments) to web pages. COBS [D1] is an ongoing work to develop such features in a website-independent manner using browser add-ons. It brings together the aspects of collaboration and of expert and team recommendation. SocialCobs also provides an opportunity to validate our ongoing research ideas on trust, p2p storage, gossip-based information dissemination, etc. (some of which cannot be validated with the other flagship applications), and itself becomes a source of new social network information.


Publications

Journals: 

[J1] CMV: File Consistency Maintenance Through Virtual Servers in Peer-to-Peer Systems,
     Zhijun Wang, Anwitaman Datta, Sajal K. Das and Mohan Kumar
     Journal of Parallel and Distributed Computing [Elsevier] 69(4): 360-372 (2009)

[J2] Structured Overlay For Heterogeneous Environments: Design and Evaluation of Oscar,
     Sarunas Girdzijauskas, Anwitaman Datta, Karl Aberer
     Transactions on Autonomous and Adaptive Systems [ACM]

[J3] Fuzzynet: Ringless Routing in a Ring-like Structured Overlay,
     Sarunas Girdzijauskas, Wojciech Galuba, Vasilios Darlagiannis, Anwitaman Datta, Karl Aberer
     Peer-to-Peer Networking and Applications Journal [Springer]

[J4] Attack resilient P2P dissemination of RSS feed,
     Xin Liu, Anwitaman Datta
     Peer-to-Peer Networking and Applications Journal [Springer]

[J5] Multi-Objective Optimized Multicasting Overlay for Collaborative Applications,
     Krzysztof Rzadca, Jackson Tan, Anwitaman Datta
     Computer Networks [Elsevier]

[J6] Wikiteams: How do they achieve success?
     Piotr Turek, Adam Wierzbicki, Radoslaw Nielek, Albert Hupa, Anwitaman Datta
     IEEE Potentials, 30(5), 2011

Conferences & workshops: 

[C1] Really Simple Security for P2P Dissemination of Really Simple Syndication, [poster]
     Anwitaman Datta and Liu Xin

     CoopIS 2008, 16th Cooperative Information Systems Conference (part of OnTheMove Federated Conferences)

[C2] Reliable P2P Feed Delivery,
     Anwitaman Datta, Liu Xin

     CCGrid 2009, 9th IEEE International Symposium on Cluster Computing and the Grid

[C3] Multicast Trees for Collaborative Applications,
     Krzysztof Rzadca, Jackson Tan, Anwitaman Datta

     CCGrid 2009, 9th IEEE International Symposium on Cluster Computing and the Grid

[C4] Mapping Social Networks into P2P Directory Service,
     Lukasz Zaczek, Anwitaman Datta

    SocInfo 2009, International Conference on Social Informatics

[C5] Redundancy maintenance and garbage collection strategies in peer-to-peer storage systems,
     Xin Liu, Anwitaman Datta

    SSS 2009, The 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems

[C6] Enabling Secure Secret Sharing in Distributed Online Social Networks,
     Le-Hung Vu, Sonja Buchegger, Anwitaman Datta, and Karl Aberer

    25th Annual Computer Security Applications Conference, (ACSAC'09)

[C7] Using Stereotypes to Identify Risky Transactions in Internet Auctions,
     Xin Liu, Tomasz Kaszuba, Radoslaw Nielek, Anwitaman Datta, Adam Wierzbicki

    IEEE International Conference on Social Computing (SocialCom 2010)

[C8] SoJa: Collaborative Reference Management Using A Decentralized Social Information System,
     Anwitaman Datta

    International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2010)

[C9] Learning About the Quality of Teamwork from Wikiteams,
     Piotr Turek, Adam Wierzbicki, Radoslaw Nielek, Albert Hupa, Anwitaman Datta

   IEEE International Conference on Social Computing (SocialCom 2010)

[C10] MetaTrust: Evaluating Reliability of Transactions Using Linear Discriminant Analysis in Large-scale Distributed Systems, [poster]
     Xin Liu, Gilles Tredan, Anwitaman Datta

    International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011)

[C11] A Trust Prediction Approach Capturing Agents’ Dynamic Behavior,
     Xin Liu, Anwitaman Datta

    The 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011

[C12] Semantic Tag Recommendation Using Concept Model, [poster]
     Chenliang Li, Anwitaman Datta, Aixin Sun

    SIGIR 2011

[C13] A Generalized Method for Word Sense Disambiguation based on Wikipedia,
     Chenliang Li, Aixin Sun, and Anwitaman Datta

    The 33rd European Conference on Information Retrieval (ECIR 2011)

[C14] COBS: Realizing Decentralized Infrastructure for Collaborative Browsing and Search,
     Christian von der Weth and Anwitaman Datta

    International Conference on Advanced Information Networking and Applications (AINA 2011)

[C15] GoDisco: Selective Gossip based Dissemination of Information in Social Community based Overlays,
     Anwitaman Datta, Rajesh Sharma

    12th Intl. Conf. on Distributed Computing and Networking (ICDCN 2011)
    Awarded best paper in the networking track.

[C16] FAST: Friends Augmented Search Techniques – System Design & Data-Management Issues,
     Christian von der Weth, Anwitaman Datta

    The 2011 IEEE/WIC/ACM International Conference on Web Intelligence, (WI 2011)

[C17] Impact of Expertise, Social Cohesiveness and Team Repetition for Academic Team Recommendation, [poster]
     Anthony Ventresque, Jackson Tan Teck Yong and Anwitaman Datta

    3rd International Conference on Social Informatics, (SocInfo 2011)

[C18] An empirical study of availability in friend-to-friend storage systems, [short paper]
     Rajesh Sharma, Anwitaman Datta, Matteo del Amico, Pietro Michiardi

    IEEE International Conference on Peer-to-Peer Computing, (P2P 2011)

Demo papers at conferences: 

[D1]  COBS: A tool for collaborative browsing and search on the web
     Christian von der Weth, Sally Ang, Anwitaman Datta
     IEEE International Conference on Multimedia & Expo (ICME 2010)

[D2]  SharedMind: A tool for collaborative mind-mapping
     Sally Ang, Krzysztof Rzadca, Anwitaman Datta
     IEEE International Conference on Multimedia & Expo (ICME 2010)

[D3]  T-RecS: Team Recommendation System through Expertise and Cohesiveness
     Anwitaman Datta, Jackson Tan Teck Yong, Anthony Ventresque
     International World Wide Web Conference & Expo (WWW 2011)

Book chapters: 

[B1] Interdisciplinary Matchmaking: Choosing Collaborators by Skill, Acquaintance and Trust,
     Albert Hupa, Krzysztof Rzadca, Adam Wierzbicki and Anwitaman Datta
     Computational Social Networks Analysis: Trends, Tools and Research Advances,
     Springer Computer and Communication Networks series. December 2009

[B2] The gamut of bootstrapping mechanisms for structured overlay,
     Anwitaman Datta
     Handbook of Peer-to-Peer Networking, Springer Verlag. January 2010

[B3] Supporting Collaboration and Creativity Through Mobile P2P Computing,
     Adam Wierzbicki, Anwitaman Datta, Lukasz Zaczek and Krzysztof Rzadca
     Handbook of Peer-to-Peer Networking, Springer Verlag. January 2010

[B4] Maintaining redundancy in peer-to-peer storage systems,
     Anwitaman Datta, Di Wu, Liu Xin, Adam Wierzbicki
     Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications, IGI Global

[B5] Trust and Fairness Management in P2P and Grid systems,
     Adam Wierzbicki, Tomasz Kaszuba, Radoslaw Nielek, Anwitaman Datta
     Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications, IGI Global

[B6] Serverless social software for nomadic collaboration,
     Anwitaman Datta, Krzysztof Rzadca, Sally Ang, Goh Chee Hong
     E-Research Collaboration: Frameworks, Tools and Techniques, Springer-Verlag
 

Artefacts/(sub-)systems implementations:

[I1] MoMo (Multiobjective optimized multicast communication primitive for p2p collaboration): http://code.google.com/p/momocomm/

[I2] P2P3S (P2P storage systems simulator): http://code.google.com/p/p2p3s/

[I3] SharedMind (Collaborative mind-mapping): http://code.google.com/p/sharedmind/

[I4] T-RecS (NTU academic team recommendation demo): http://sands.sce.ntu.edu.sg/T-RecS/

[I5] SocialCobs (Collaborative browsing and search): http://code.google.com/p/socialcobs/


People

Contributors:
Mr. Liu Xin (PhD student)
Mr. Rajesh Sharma (PhD student)
Mr. Jackson Tan (project officer)
Ms. Sally Ang (project officer)
Dr. Anthony Ventresque (research fellow)
Dr. Christian von der Weth (research fellow)
Dr. Krzysztof Rzadca (research fellow)

Final year project [students]:
PBDMS: Personal Bibliographic Data Management System [Do Hoang Hai, Goh Chee Hong]
P2P social application for bibliographic content management [Hoon Thien Rong, Adrian Iskandar]
SharedMind: Showcasing P2P nomadic collaboration [Sally Alexia Anggoman Ang]
AndroidMind - Collaboration on the go (on Android) [Ng Wee Yen Jonathan, Goh Shao Xiang]
--- AndroidMind won a 3rd prize in the Singapore Android Developer Challenge 2010
P2P system for RSS feed dissemination [Cui Zhuo, Sonny Budiman Sasaka]
Browser plugin based collaborative online search and social networking [Ritesh Kalra, Chen Wenyao]
Penny saved is penny earned: A peer-to-peer approach to alleviate costs of operating Wikipedia [Sonakshi Kansal]
Android based user location aware mobile social networking application [Toh Chuan Kai]