Reading list Spring 2010

Friday, April 16th, 2010

I have read the first three issues of Communications of ACM of year 2010: January, February, and March. Overall, I have noticed that CACM is aiming at a broader scope, not only CS-topics but also biology and physics. Therefore, nowadays it is more like Science magazine or Nature. But of course in every article there is a computational aspect that connects computer science with another area of knowledge. I found out that cross-disciplinary articles are more engaging than purely technical articles. The nature has lots of secrets that computer science helps reveal.

Jan 2010
Rebuilding for Eternity. Bundler – open source version of Photosynth.
Automated translation of Indian Languages
New Search Challenges and Opportunities
Data in Flight. Implementation of StreamSQL. Stanford streams, MIT Aurora, SQL Stream.
Other people’s data – XIgnite

Last but not least – two articles that discuss Google’s parallel engine – Map-Reduce. I have noticed that CACM contains lots of articles dedicated to Google’s technology, for example there is an article discussing the evolution of Google file system in one of the following issues. At the same time there are no articles from other software giants, for example Microsoft, Apple, or IBM. This is not because those companies do not innovate. Everybody knows that programmers went nuts writing iPhone apps. The reason of Google domination is I believe that amount of sponsor money that it gives to ACM. It is fine, Google has created lots of innovative frameworks but other companies deserve attention as well.

Map Reduce and Parallel DBMSs: Friends or Foes?
MapReduce: A Flexible Data Processing Tool

Feb 2010

The best issue I have ever read! To start with, its cover story is dedicated to new model of computation, quantum algorithms. This topic is not new. When I was an undergraduate student in Russia in late 1990s there was lots of buzz of how quantum algorithms can change the cryptography. With its strong mathematical tradition, Russians were trying to explain quantum algorithms from the number theory point of view. To me it was totally incomprehensible. Or I should say that my mind was more inclined toward an algorithmic perspective of quantum computers. In this article CACM does a great job on explaining the notion of quantum algorithm at the level that was most appropriate to me as a software engineer. It briefly mentions computational complexity challenges and explains how quantum algorithms might help tackle those.

Recent progress in Quantum algorithms

Type Theory comes to age. Aura, Jif for security. Philip Walder
An interview with Michael Rabin

A few billion lines of code later.

Another great article in the same issue! When I was a student (again) but this time in a graduate school in the United States I was lucky to witness the emergence of a new technology – practical bug detection using static analysis. But I will start with a brief introduction on how industrial research is transformed into a widely adopted mature technology.

In my life so far I saw two such events. More experienced people might name a few other cases but here is what I can say. In late 1990s computer graphics has advanced rapidly because of increased processing power. Researchers began experiments with massive amounts of data or images. This is how light field mapping technology was developed simultaneously at several universities as well as at Microsoft and Intel. Its idea is to build a 3D model of an object from a number of images taken with an inexpensive camera. I was lucky to participate in the development of this technology as an undergraduate intern at Intel-Nizhny Novgorod in 2001-2002. However, it was only a research project which was soon abandoned. However, in year 2010 there is a commercialized version of this technology Photosynth that Microsoft has created.

When I joined graduate school in Stony Brook in 2002 application security was a hot research area. Everybody was thinking how to protect the programs against viruses. This is why we have created DIRA – a dynamic protection tool that instrumented programs with additional instructions that made it resilient against buffer overflow attacks. But again, the project was soon abandoned. However, Dawson Engler was able to transform the technology landscape with his static bug finder. In this article he describes his experiences with making commercial tool from a research project.

Software Model Checking takes off
Assessing the Changing US IT R&D Ecosystem

March 2010
Chasing the AIDS virus

Cover story is another must-read article! It explains the mechanics of AIDS virus. I never thought that it can transform itself to avoid the medicine it is exposed to.

Making decisions based on the Preferences of Multiple Agents

This article describes various algorithms of voting with applications to social networks. Very comprehensive discussion.

Engineering the web’s third decade
Orchestrating coordination in pluralistic networks
GFS: Evolution on fast-forward
Global IT management: structuring for scale, responsiveness, and innovation

Reading a few Communications of ACM articles

Friday, January 8th, 2010

During the holidays I have read a number of ACM articles from the issues I received earlier as well as from December 2009 issues that I received a few days ago. The most interesting articles are:

Ready for Web OS? Mentions Sam King, tablet crunch pad.
A Smart Cyberinfrastructure for Research. Microformats, data portability, codeplex, mit breadcrumbs, zune social, livelabs entity extraction.
An Interview with Ping Fu
You Don’t Know a Jack about Software Maintenance
Scratch: Programming for All
Sound Index: Charts For the People, By the People
What Intellectual Property Law Should Learn from Software
The Status of the P versus NP Problem
Just for You. Greg Linden ran personalized news site Findory
The Pathologies of Big Data
CTO Roundtable: Cloud Computing. Animoto on Facebook
Hard-Disk Drives: The Good, the Bad, and the Ugly
Database and Information Retrieval Methods for Knowledge Discovery. MSR Libra, Cimple DBLife, KnowItAll/TextRunner, YAGO WordNet NAGA

Research trend of the year: Parallel Computing

Wednesday, December 30th, 2009

So what were those cool ideas this year? In the last few issues of CACM the topic of parallel computing has received lots of attention. Basically, researchers are saying that lots of time and money have been spent on parallel research but most programmers are still writing single-threaded programs or even if they are multi-threaded they do not scale with the number of processors.

Here are the articles on this topic which I found only in three issues of CACM from September through November 2009:

When I noticed the increased attention to parallel computing I started thinking whether I encountered parallel programming before. When I was an intern at Intel I attended an introductory course to parallel computing during which we were implementing standard algorithms such as sorting on a parallel computer using OpenMP. That was in 2001 or so. Since then I saw OpenMP in the literature every now and then until it suddenly disappeared in 2005. All subsequent articles on parallel computing that I read did not even mention OpenMP as a predecessor of whatever new framework they were dealing with. Thus I felt alleviated when I read an article of an independent writer Face the Inevitable. The experiences of that author are very similar to mine. The author explains the lack of attention to OpenMP with its very specific applications.

A couple of years ago another parallel programming framework was extremely popular but its fate was the same – it felt to oblivion. I mean Google’s MapReduce technology or its open-source version Hadoop. The explanation of its current unpopularity is probably the same – the applications are quite limited.

The authors of the Berkely article at least learned the lessons of the previous frameworks. Their article proposes an application-driven approach. The authors consider a number of potential killer applications of parallel computing. They are using a multi-layered approach. The application writer will need to adopt a number of parallel design patterns. Then the developers of the middle ware will create libraries that implement such design patterns. The target hardware on which these libraries are executed are not specified yet. Possibly, it is a multi-processor computer with homogeneous or heterogeneous processors. The authors propose an FPGA architecture to facilitate flexible experimentation.

Besides the lack of parallel killer app, the ideal parallel hardware is also a moving target. So far, success has been achieved only in special domains. For example, Anton is a biological computer which features long pipelines executing specialized instructions that compute forces of interaction among molecules. This is an exceptional architecture because long pipelines are considered harmful for parallel processors in general. Thus, an ideal parallel computer is something that reseachers have not created yet.

To summarize, after a decade of research on parallel computing it is not clear which paradigm the programmers will accept, which middleware they will use, and on which hardware the programms will get executed. We are entering a new decade with lessons learned from previous failures and lots of ideas on how to design an ideal stack of parallel computing. Thus I think that after 5-10 years we will use parallel programming on the daily basis.

Enjoying my Lifetime membership at ACM

Wednesday, December 30th, 2009

This year I have become ACM Lifetiime Member. The idea is that you pay a certain amount of money and your membership continues as long as you are alive. The membership includes subscription to the print edition of Communications of ACM Magazine, as well as various discounts. For example, I have signed up for an unlimited access to Safari Library for basically half the normal price.

The CACM magazine has been transformed a lot during the last couple of years. To start with, most of its articles are now available online. The design of the magazine has been improved. As earlier, there is a central topic for each issue of the magazine and a number of articles that deal with it. In addition, there are a number of viewpoint articles which describe random issues. The magazine has Research Highlights section in the end which contains a number of research papers, either short or full length. Finally, Virtual Extension of the magazine consists of papers that are available online only. Recent issues of CACM include big overview articles such as Turing lectures or the status of P vs NP problem.

As I am working in the industry CACM seems very interesting to me as it sheds light on computer science from an academic angle. Unfortunately, I do not have time to read every article in every issue of the magazine even though I would like to. For example, I never read Research Highlights not to say Virtual Extenion which I even never looked at. When I was a graduate student we were told to read at least 100 papers every year. I am glad though that my ACM membership allows me to get updated on the main research trends even though I do not get full exposure of the details.

CACM was always good at publishing high-level view articles on a certain issue. For example, if it is computer security then CACM describes policies and human-computer interaction issues, not the inner workings of a particular framework. However, this trend is changing. Nowadays CACM includes lots of practical articles similar to IEEE Computer Magazine. Recently, an article analyzing Conflicker worm has been published in CACM. Another example is an article describing Google Web Toolkit which allows writing client-side applications in Java and deploy them as JavaScript.

To summarize, CACM has transformed itself from a purely academic source of information to dynamic resource which balances cutting-edge industrial reports and innovations from the academic world.

My online library

Thursday, June 18th, 2009

As a part of ACM Professional Membership I subscribed to Safari Library with full access to any number of books at a time, as well as roughcuts and videos. When I was a student at SUNY I was subscribed to a starters edition that gave access to any 10 books in a month. In other words, when you added a book to your bookshelf it had to stay there for at least a month.

Now with a more advanced subscription I was able to look at any book of my choice. I was surprised to find out how many books were added to Safari – around 100 during one month.

Here is a list of books that I already added to my bookshelf. Of course, I only glanced trough most of them, but these are the books that I have read, almost:


  • Web 2.0 Architectures, 1st Edition
    By Duane Nickull; Dion Hinchcliffe; James Governor

  • The Google Way, 1st Edition
    By Bernard Girard

The cost saving opportunity is amazing. Given that a book costs approximately 25 USD the 30 books that I have added would cost 750 USD. But the yearly subscription with full access was only 300 USD thanks to my ACM Professional membership.

Food for thought

Thursday, March 5th, 2009