
SYSTOR 2011
The 4th Annual International Systems and Storage Conference

May 30 - June 1, 2011
Haifa, Israel


Abstracts


Pushing the Boundaries of Distributed Storage Systems
Hank Levy (University of Washington)
Distributed key-value stores have become commonplace both across the Internet and in corporate data centers. This talk will present two recent research projects at UW involving new applications and designs for distributed storage systems. First, I will describe Vanish, a self-deleting data system whose goal is to make data stored in the cloud self-destruct at a user-specified time, without any action on the part of the user and without needing to trust any single third party to perform the deletion. Second, I will describe Comet, an extensible distributed key-value store motivated by our Vanish experience, which allows clients to customize the behavior of the storage system by injecting "active storage objects" into the store.

This is joint work with Roxana Geambasu, Amit Levy, Tadayoshi Kohno, Steven Gribble, and Arvind Krishnamurthy.

Speaker Bio
Hank Levy is Chairman of the Department of Computer Science & Engineering at the University of Washington in Seattle, where he holds the Wissner-Slivka Endowed Chair. Levy's research involves operating systems, distributed systems, computer architecture, and security. His publications have received over a dozen best-paper awards and two test-of-time awards across those fields. Levy is one of the inventors of simultaneous multithreading, which is used in a number of modern microprocessors (e.g., Intel's "Hyperthreading"). He has co-founded two startups: Skytap, a cloud-computing company, and Performant, a Java performance company (acquired by Mercury in 2003). Hank is a Member of the National Academy of Engineering, a Fellow of the ACM, and a Fellow of the IEEE.


Efficient Monitoring of Large Distributed Systems
Daniel Keren (Haifa University)
To ensure the stable and efficient operation of a large distributed system, such as a compute cloud, the parameters governing the system's behaviour must be sampled and monitored. Often this must be done in real time, so that an emerging problem that may cause a malfunction or even a system crash is detected without delay. This is an instance of the distributed monitoring problem (also referred to as "distributed triggers"). Typically, such triggers should raise an alert when some kind of system-wide anomaly, such as a DDoS attack, occurs.

A major difficulty in constructing distributed triggers is that the anomalies they target typically cannot be detected locally. For example, a sudden increase in the communication or computational overhead at a single node may be due to a "legitimate" reason. However, if it occurs in even 20% of the nodes, there may well be a problem. Thus the individual activity profiles at the nodes cannot serve as reliable triggers; the entire system-wide activity should be monitored. Alas, the simplistic solution of centralizing the data and then analyzing it incurs a tremendous communication and computational overhead.

In this talk, I will present a general paradigm for defining optimal local conditions at the nodes, whose violation signals a possible system-wide anomaly. The challenge is to derive conditions that are both correct (an anomaly must result in the violation of at least one local condition) and efficient (as few false alarms as possible).
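
To make the two requirements concrete, here is a minimal sketch of one deliberately simple local condition. It is not the paradigm presented in the talk. It assumes (our assumption, not the abstract's) that the monitored quantity is the sum of one numeric value per node and that an anomaly means the sum exceeds a global threshold. All function names are hypothetical.

```python
from typing import List, Sequence


def local_threshold(global_threshold: float, num_nodes: int) -> float:
    """Split an assumed global sum threshold evenly across the nodes."""
    return global_threshold / num_nodes


def violating_nodes(values: Sequence[float], global_threshold: float) -> List[int]:
    """Return indices of nodes whose own value exceeds the local threshold.

    Each node can evaluate its condition without any communication.
    """
    limit = local_threshold(global_threshold, len(values))
    return [i for i, value in enumerate(values) if value > limit]


def is_global_anomaly(values: Sequence[float], global_threshold: float) -> bool:
    """Ground truth, computable only after centralizing all values."""
    return sum(values) > global_threshold


# Correct: if the sum exceeds T, at least one value must exceed T / n,
# so every anomaly violates at least one local condition.
assert is_global_anomaly([30, 30, 30, 30], 100)
assert violating_nodes([30, 30, 30, 30], 100) == [0, 1, 2, 3]

# Not fully efficient: node 2 violates its local condition although the
# global sum (80) is below the threshold, which is a false alarm.
assert violating_nodes([10, 10, 50, 10], 100) == [2]
assert not is_global_anomaly([10, 10, 50, 10], 100)
```

The even split is correct by a pigeonhole argument but can raise many false alarms when load is skewed. Reducing such false alarms while keeping correctness is the kind of trade-off the abstract describes.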

Joint work with Assaf Schuster, Izchak Sharfman, Guy Sagy, Avishay Livne, Amir Abboud, and David Ben-David.

Speaker Bio
Daniel Keren received a Ph.D. in computer science from the Hebrew University and, after post-doctoral work at Brown University, joined the University of Haifa. Since 2003 he has been working closely with Assaf Schuster's group at the Technion on problems related to monitoring and computation in large distributed systems.


Challenges in Building A Commercial Deduplication Storage System
Kai Li (Princeton University)
In 2001, Data Domain set itself the mission of replacing tape libraries by developing deduplication storage system products for backup data. Since 2003, it has launched several deduplication storage appliances and data replication ecosystems for data centers. These products can reduce storage footprint, WAN bandwidth requirements, and power consumption by an order of magnitude. Deduplication storage systems have now become the new standard for online data protection. In this talk, I will give a retrospective overview of developing a commercial deduplication storage system for data centers, and of the challenges in building and deploying a commercially successful deduplication storage product line.

Speaker Bio
Kai Li is the Paul M. Wythes '55, P'86 and Marcia R. Wythes P'86 Professor at Princeton University, where he has been a faculty member since 1986. Before joining Princeton University, he received his Ph.D. degree from Yale University. His research expertise is in building parallel and distributed systems, deduplication storage systems, and data analysis and search for large datasets. He is an ACM Fellow and an IEEE Fellow. In 2001, he co-founded Data Domain, Inc., serving as the initial CEO, CTO, and Chief Scientist.


IBM Watson and the Jeopardy Challenge
Dafna Sheinwald and David Carmel (IBM Research, Haifa)
Watson is an application of advanced natural language processing, information retrieval, knowledge representation and reasoning, and machine learning technologies to the field of open-domain question answering. Watson is built on IBM's DeepQA technology for hypothesis generation, massive evidence gathering, analysis, and scoring. Watson runs on a cluster of 90 IBM Power 750 servers in 10 racks with a total of 2880 POWER7 processor cores and 16 terabytes of RAM. The POWER7 processor's massively parallel processing capability is an ideal match for Watson's IBM DeepQA software, enabling it to respond in less than 3 seconds.
As a test of its abilities, Watson competed on the television quiz show Jeopardy!
The competition aired in three Jeopardy! episodes dedicated to this IBM Challenge, on February 14–16, 2011, and attracted millions of viewers, some of them at "watch parties and events" across North America.

Watson competed against Ken Jennings, the record holder for the longest championship streak, and Brad Rutter, the current biggest all-time money winner on Jeopardy!

Watson passed the test: it emerged victorious.

In this talk we tell more about the challenge, Watson's architecture and technologies, and IBM Haifa's contributions to them.

For more background information on Watson, see YouTube, e.g., http://www.youtube.com/watch?v=seNkjYyG3gI











































