Tuesday, December 22, 2009

Announcing our Q4 Research Awards



We do a significant amount of in-house research at Google, but we also maintain strong ties with academic institutions globally, pursuing innovative research in core areas relevant to our mission. One way in which we support academic institutions is the Google Research Awards program, aimed at identifying and supporting world-class, full-time faculty pursuing research in areas of mutual interest.

Our University Relations team and core area committees just completed the latest round of research awards, and we're excited to announce them today. We had a record number of submissions, resulting in 76 awards across 17 different areas. Over $4 million was awarded — the most we have ever funded in a round.

The areas that received the highest level of funding for this round were systems and infrastructure, machine learning, multimedia, human computer interaction, and security. These five areas represent important areas of collaboration with university researchers. We're also excited to be developing more connections internationally. In this round, over 20 percent of the funding was awarded to universities outside the U.S.

Some exciting examples from this round of awards:

Ondrej Chum, Czech Technical University, Large Scale Visual Link Discovery. This project addresses automatic discovery of visual links between image parts in huge image collections. Visual links associate parts of images that share even a relatively small, but distinctive, piece of visual information.

Bernd Gartner, ETH Zurich, Linear Time Kernel Methods and Matrix Factorizations. This project aims to derive faster approximation algorithms for kernel methods as well as matrix approximation problems and leverage these two promising paradigms for better performance on large scale data.

Dawson Engler, Stanford University, High Coverage, Deep Checking of Linux Device Drivers using KLEE + Under-Constrained Symbolic Execution. This project extends KLEE, a recently built tool that automatically generates test cases that execute most statements in real programs, so that it allows automatic, deep checking of Linux device drivers.

Jeffrey G. Gray, University of Alabama at Birmingham, Improving the Education and Career Opportunities of the Physically Disabled through Speech-Aware Development Environments. This project will investigate the science and engineering of tool construction to allow those with restricted limb mobility to access integrated development environments (IDEs), which will support programming by voice.

Xiaohui (Helen) Gu, North Carolina State University, Predictive Elastic Load Management for Cloud Computing Infrastructures. This project proposes to use fine-grained resource signatures with signal processing techniques to improve resource utilization by reducing the number of physical hosts required to run all applications.

Jason Hong and John Zimmerman, Carnegie Mellon University, Context-Aware Mobile Mash-ups. This project seeks to build tools for non-programmers to create location and context-aware mashups of data for mobile devices that can present time- and place-appropriate information.

S V N Vishwanathan, Purdue University, Training Binary Classifiers using the Quantum Adiabatic Algorithm. The goal of this project is to harness the power of quantum algorithms in machine learning. The advantage of the new quantum methods will materialize even more once new adiabatic quantum processors become available.

Emmett Witchel and Vitaly Shmatikov, University of Texas at Austin, Private and Secure MapReduce. This project proposes to build a practical system for large-scale distributed computation that provides rigorous privacy and security guarantees to the individual data owners whose information has been used in the computation.

Click here to see a full list of this round’s award recipients. More information on our research award program can be found on our website.

Tuesday, December 15, 2009

Teaching a Computer to Understand Japanese



On December 7th, we launched our new Japanese voice search system (音声検索), which has been available for various flavors of English since last year and for Mandarin Chinese for the past two months. The initial Japanese system works on the Android platform and also through the Google Mobile App on the iPhone, as announced in a Japanese blog post along with a general explanation of how to get started. For developers who want to make use of the speech recognition backend for their own Android applications, there is a public API (recognizer intent API) described here.

Although speech recognition has had a long history in Japan, creating a system that can handle a problem as difficult as voice search is still a considerable challenge. Today, most speech recognition systems are large statistical systems that must learn two models from sets of examples, an acoustic model and a language model. The acoustic model represents (statistically) the fundamental sounds of the language, and the language model statistically represents the words, phrases, and sentences of the language. The acoustic model for Japanese voice search was trained using a large amount of recorded Japanese speech, with the associated transcriptions of the words spoken. The language model for Japanese voice search was trained on Japanese search queries.
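
As a rough illustration of the language-model half of that recipe, here is a minimal sketch (our own toy example with made-up queries, not Google's production system) that estimates bigram probabilities from a handful of logged queries; a real voice search language model is built the same general way, just from vastly more data.

```python
from collections import Counter

# Toy corpus standing in for logged search queries (hypothetical examples).
queries = [
    "tokyo station ramen",
    "tokyo weather",
    "ramen recipe",
    "tokyo station hotel",
]

unigrams, bigrams = Counter(), Counter()
for q in queries:
    words = ["<s>"] + q.split() + ["</s>"]
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def bigram_prob(prev, word, alpha=0.1):
    """P(word | prev) with simple add-alpha smoothing."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab_size)

# The recognizer prefers word sequences the language model finds likely.
print(bigram_prob("tokyo", "station"))   # relatively high
print(bigram_prob("tokyo", "recipe"))    # relatively low
```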

While speech recognition systems are surprisingly similar across different languages, there are some problems that are more specific to Japanese. Some of the challenges we faced while developing Japanese voice search included:

  • Spaces in Japanese text
    As we looked at popular search queries in Japan, we saw that written Japanese usually doesn't have spaces, but sometimes it does. For example, if a user searches for Ramen noodles near Tokyo station they will often type: "東京駅 ラーメン" with a space in between Tokyo station and Ramen -- therefore, we would like to display it in this way as well. Getting the spaces right is difficult and we continue working to improve it.
  • Japanese word boundaries
    Word boundaries in Japanese are often not clear and subject to interpretation as most of the time there are not spaces between words. This also makes the definition of the vocabulary (the words that can be recognized theoretically) extremely large. We deal with this problem by finding likely word boundaries using a statistical system which also helps us limit the vocabulary.
  • Japanese text is written in 4 different writing systems
    Japanese text as written today uses Kanji, Hiragana, Katakana & Romaji, often mixed in the same sentence and sometimes in the same word, depending on the definition of a word. Try these example queries to see some interesting cases: "価格.com", "マーボー豆腐", "東京都渋谷区桜丘町26-1". We try to display the output in a way that is most user-friendly, which often means to display it as you would write it down.
  • Japanese has lots of basic characters and many have several pronunciations depending on context
    To be able to recognize a word you need to know its pronunciation. Western languages in general use only ASCII or a slightly extended set of characters which is relatively small (less than 100 total in most cases). For Japanese the number of basic characters is the union of all basic characters from the four writing systems mentioned above, which is several thousands in total. Finding the correct pronunciations for all words in the very large voice search vocabulary is difficult and is often done using a combination of human effort and automatic statistical systems. This is even more difficult in the Japanese case as the number of basic characters is higher and there are a vast number of exceptions, for example consider the case of: "一人" (hitori) versus "一人前" (ichininmae). Although the phrases look very similar they have completely different pronunciations.
  • Encoding issues
    Japanese characters can be written in many encoding systems including UTF-8, Shift_JIS, EUC-JP and others. While at Google we try to use UTF-8 exclusively, there are still interesting edge cases to deal with. For example, some characters exist in different forms in the same encoding system. Compare for example "カナ" and "ｶﾅ" -- they both say "kana" and mean the exact same thing, the first in full-width and the second in half-width. There are numerous similar cases like this in Japanese that make normalization of the text data more difficult. (A short normalization sketch follows this list.)
  • Every speaker sounds different
    People speak in different styles, slow or fast, with an accent or without, have lower or higher pitched voices, etc. To make it work for all these different conditions we trained our system on data from many different sources to capture as many conditions as possible.
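
To make the encoding bullet above concrete, here is a minimal sketch of the kind of normalization involved, using Python's standard unicodedata module (an illustration only, not the pipeline we actually use): NFKC normalization folds half-width katakana and full-width Latin characters into canonical forms so that visually equivalent strings compare equal.

```python
import unicodedata

halfwidth = "ｶﾅ"   # half-width katakana "kana"
fullwidth = "カナ"   # full-width katakana "kana"

print(halfwidth == fullwidth)                                 # False: different code points
print(unicodedata.normalize("NFKC", halfwidth) == fullwidth)  # True after normalization

# The same normalization also unifies full-width ASCII, e.g. "ＧＯＯＧＬＥ" -> "GOOGLE".
print(unicodedata.normalize("NFKC", "ＧＯＯＧＬＥ"))
```
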
The challenges listed above are just a small portion of what we dealt with while building the Japanese voice search system. Over time, we are committed to improving the system as much and as quickly as possible to make speech, in addition to the keyboard, a user-friendly input modality on mobile devices. We will push a first set of improved models this week.

Thursday, December 10, 2009

Research Areas of Interest - Multimedia



Recently, Google's research groups reviewed over 140 grant proposals across sixteen different research areas. During this process, we identified a number of strategic research topics. These topics represent critical areas of research for Google in collaboration with our university partners.

We'll be examining several of these topics in future posts but we'd like to begin by raising some of the research challenges we face in our multimedia endeavors:
  • Large scale annotation: How can we learn from large, noisy sets of image/video data to automatically get human-level accurate models for label annotation?
    The images and videos that are available on the web provide massive data sets. We have some very noisy labels on that set, in terms of possible content. We have labels based on popularity of an item when considered for a particular search, on anchor text and other context, and on labels given to other content that is often associated with each item. The challenge is to make use of the sheer volume of available data to improve our recognition models and to carry appearance models from one media type to another. Further, we must be able to handle the variability in appearance and in the labels themselves.
  • Image/Audio/Video Representation: How can we improve our understanding of low-level representations of images that go beyond bag-of-words modeling?
    Much of the current work in labeling and retrieval is based on fairly simple local descriptions of the content, putting the emphasis on classifier learning from combinations of simple models. While this classifier approach has been useful, we should also examine the basic features that we are using, to see if we can better characterize the content. Providing better inputs into our learning algorithms should reduce the size of the space over which we need to search. Possible examples include shape modeling in images, better texture/color models, and descriptions of soft segmentations of regions.
  • Localization of image-/video-level labels to spatial/temporal portions of the content: Can we automatically associate image and video labels with specific portions of the content?
    The most obvious examples in this area are labels like "dog" and "explosion". However, can we also localize more complex concepts like "waves" or "suspense"? Alternately, can we automatically distinguish between labels, based on how well we are able to localize them to a particular place or time within the content?
  • Large scale matching / Hashing: Can we identify matching techniques to deal with large datasets?
    We need image, video, and audio matching techniques that can effectively deal with large datasets, embedded in high-dimensional descriptor spaces, in sub-linear time. Of special interest are methods that can efficiently handle a wide range of recall/precision needs without massive increases in the data-structure sizes that are used. (A toy sketch of one such technique follows this list.)
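
As a toy illustration of the kind of technique in play here (a sketch only, with made-up descriptors, not what powers any Google product), the snippet below uses random-hyperplane locality-sensitive hashing: descriptors are bucketed by short binary codes, so a query only needs to be compared against the handful of items that share one of its codes rather than the whole collection.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits, n_tables = 128, 12, 6

# One set of random hyperplanes per hash table.
tables_planes = [rng.normal(size=(n_bits, dim)) for _ in range(n_tables)]

def lsh_code(planes, vec):
    """Binary signature: bit i is the sign of the projection onto hyperplane i."""
    return tuple((planes @ vec > 0).astype(int))

# Hypothetical image descriptors: a "database" plus a query near one of its entries.
database = rng.normal(size=(10000, dim))
query = database[42] + 0.05 * rng.normal(size=dim)

# Index the database once per table.
tables = []
for planes in tables_planes:
    buckets = {}
    for idx, vec in enumerate(database):
        buckets.setdefault(lsh_code(planes, vec), []).append(idx)
    tables.append(buckets)

# At query time, only the union of the query's buckets is checked exactly.
candidates = set()
for planes, buckets in zip(tables_planes, tables):
    candidates.update(buckets.get(lsh_code(planes, query), []))

print(len(candidates), "candidates examined instead of", len(database))
print("true neighbor among candidates:", 42 in candidates)
```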

We expect these questions to keep us busy for some time.

Tuesday, December 8, 2009

Machine Learning with Quantum Algorithms



Many Google services we offer depend on sophisticated artificial intelligence technologies such as machine learning or pattern recognition. If one takes a closer look at such capabilities one realizes that they often require the solution of what mathematicians call hard combinatorial optimization problems. It turns out that solving the hardest of such problems requires server farms so large that they can never be built.

A new type of machine, a so-called quantum computer, can help here. Quantum computers take advantage of the laws of quantum physics to provide new computational capabilities. While quantum mechanics has been foundational to the theories of physics for about a hundred years, the picture of reality it paints remains enigmatic. This is largely because at the scale of our everyday experience quantum effects are vanishingly small and usually cannot be observed directly. Consequently, quantum computers astonish us with their abilities. Let’s take unstructured search as an example. Assume I hide a ball in a cabinet with a million drawers. How many drawers do you have to open to find the ball? Sometimes you may get lucky and find the ball in the first few drawers but at other times you have to inspect almost all of them. So on average it will take you 500,000 peeks to find the ball. Now a quantum computer can perform such a search looking into only about 1,000 drawers. This mind-boggling feat is known as Grover’s algorithm.
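
The numbers in that example follow directly from Grover's square-root speedup; here is a back-of-the-envelope check (plain arithmetic, not a quantum simulation):

```python
import math

drawers = 1_000_000

# Classical unstructured search: on average, about half the drawers must be opened.
classical_expected_peeks = drawers / 2

# Grover's algorithm needs on the order of sqrt(N) queries,
# roughly (pi/4) * sqrt(N) iterations when there is a single marked item.
grover_queries = math.ceil(math.pi / 4 * math.sqrt(drawers))

print(classical_expected_peeks)  # 500000.0
print(grover_queries)            # 786 -- the same order as the ~1,000 peeks quoted above
```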

Over the past three years a team at Google has studied how problems such as recognizing an object in an image or learning to make an optimal decision based on example data can be made amenable to solution by quantum algorithms. The algorithms we employ are the quantum adiabatic algorithms discovered by Edward Farhi and collaborators at MIT. These algorithms promise to find higher quality solutions for optimization problems than obtainable with classical solvers.
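
For readers curious what casting a learning problem in this form looks like, here is a heavily simplified sketch (our own illustration with synthetic data, not the exact formulation used in the work described below): selecting a binary combination of weak classifiers is written as minimizing a quadratic function of binary variables (a QUBO, equivalent to an Ising energy), which is the kind of problem an adiabatic optimizer accepts; a brute-force loop stands in for the quantum hardware.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data: outputs of N weak classifiers on S samples, in {-1, +1},
# plus labels correlated with the first few classifiers. In the real setting these
# would come from image features.
N, S = 8, 200
h = np.sign(rng.normal(size=(N, S)))                        # h[i, s]: vote of classifier i on sample s
y = np.sign(h[:3].sum(axis=0) + 0.5 * rng.normal(size=S))   # labels

lam = 0.05  # regularization favoring small ensembles

# Loss for binary weights w in {0,1}^N:
#   E(w) = sum_s ((1/N) * sum_i w_i h[i,s] - y_s)^2 + lam * sum_i w_i
# Expanding the square gives a quadratic form E(w) = w^T Q w + const (using w_i^2 = w_i):
Q = (h @ h.T) / N**2                                        # pairwise couplings
Q[np.diag_indices(N)] = (h**2).sum(axis=1) / N**2 - 2 * (h @ y) / N + lam

# Brute force over all 2^N bit strings, standing in here for the quantum annealer.
best_w, best_E = None, np.inf
for bits in itertools.product([0, 1], repeat=N):
    w = np.array(bits)
    E = w @ Q @ w
    if E < best_E:
        best_w, best_E = w, E

print("selected weak classifiers:", np.nonzero(best_w)[0])
print("training accuracy of the ensemble:", np.mean(np.sign(best_w @ h) == y))
```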

On the hardware side we are collaborating with D-Wave in Vancouver, Canada. D-Wave develops processors that realize the adiabatic quantum algorithm by magnetically coupling superconducting loops called rf-squid flux qubits. This design realizes what is known as the Ising model which represents the simplest model for an interacting many-body system and it can be manufactured using proven chip fabrication methods. Unfortunately, it is not easy to demonstrate that a multi-qubit system such as the D-Wave chip indeed exhibits the desired quantum behavior and experimental physicists from various institutions are still in the process of characterizing the chip.


Layout of the qubits in the C4 Chimera chip employed to train the car detector. The irregular graph structure results from the fabrication process not yet rendering all qubits functional.

Today, at the Neural Information Processing Systems conference (NIPS 2009), we show the progress we have made. We demonstrate a detector that has learned to spot cars by looking at example pictures. It was trained with adiabatic quantum optimization using a D-Wave C4 Chimera chip. There are still many open questions but in our experiments we observed that this detector performs better than those we had trained using classical solvers running on the computers we have in our data centers today. Besides progress in engineering synthetic intelligence we hope that improved mastery of quantum computing will also increase our appreciation for the structure of reality as described by the laws of quantum physics.

The theory paper on which the demonstration is based can be found on the arXiv and a report describing the details of the implementation is here.

Monday, December 7, 2009

Celebrating Computer Science Education Week



[cross-posted with the Official Google Blog]

Today kicks off the nation’s first Computer Science Education Week. The goal of this week is to encourage students to learn about the discipline that powers the computers, applications and technology they use every day. Computer Science Education Week emphasizes that our society's aspirations will be met by individuals who have an increasingly deep understanding of computer technology.

We've been thinking about ways that Google could help with computer science education for several years. After all, our search engine has been used in education since its inception — how many essays, research papers and theses begin with a Google search? Today, we'd like to summarize some of what we've been doing at Google to advance CS education. Our efforts focus on four strategic areas, with an emphasis on computing in core curriculum.

Use of Google tools to support teaching and learning
Having a web-based shared document, spreadsheet or presentation that students in a group or class can all view and edit online has had an enormous impact on collaboration in education. So we provide a free suite of our communication & collaboration applications designed especially for schools and universities. We also used our tools and infrastructure to build and support a community of teachers who have developed classroom content and activities around these applications.

Increasing the access to and quality of Computer Science curriculum
We have many people at Google who know about all areas of computer science, many with backgrounds and experience in education. With this deep base of computer science knowledge, we developed Google Code University to help faculty update their undergraduate computer science curriculum, and the Summer of Code, which gives students the opportunity to develop programs for various open source software projects.

Integrating computing curriculum across K-12 core subjects
A group of Google engineers and K-12 "teaching fellows" is working on building and testing models of curriculum to encourage innovation. These curriculum models revolve around "computational thinking", a problem-solving technique that draws on the thinking and analysis skills that computer scientists use every day. Our goal is to integrate computational thinking across subject areas in K-12 by connecting these skills, which are already a part of core curriculum, more explicitly to computer science. We're also taking this a step further by integrating simple programming concepts in appropriate areas of core K-12 curriculum, such as algebra. Our hope is that by making computer science more visible and showing its connection to every subject area, students will experience the full power and utility of technology in areas of interest to them. Integrating CS into other subjects will also have the key added benefit of leveling the playing field, so that many more students will have the opportunity to gain a deeper understanding of computing.

Supporting organizations and individuals through community outreach
We've also worked for years with teachers and nonprofits to build early interest in the Science, Technology, Engineering and Math (STEM) fields. Besides providing financial support and sponsorship for many external organizations, we've developed a number of scholarship and intern programs to increase the number of women and underrepresented minorities in STEM and computer science. In addition to these formal programs, every day Googlers all over the world organize visits with students at nearby schools and community centers to teach, present workshops and tech talks, and to share their personal stories on how they became computer scientists and engineers.

We're absolutely delighted to be a co-sponsor of the first Computer Science Education Week. As a company, we've benefited so much from advances in computer science and the creativity of computer scientists. We also know that the next great innovators in computer science are out there, ready to be inspired to create technologies that change our world and benefit our society. We urge our children, parents, teachers and educational institutions to pay more attention to this critical field, and we will continue to do our share.

Join us for the 2010 Google GRAD CS Forum!



[cross-posted with the Google Student Blog]

As part of Google’s ongoing commitment to encouraging students of underrepresented backgrounds in technology to pursue graduate study, we are pleased to host the first annual 2010 Google Graduate Researchers in Academia of Diverse backgrounds (GRAD) CS Forum. This forum will bring together students who are historically underrepresented in the field to connect with one another and with Google.

Up to 75 computer scientists will be invited to an all-expenses-paid forum that will run from Thursday evening through Saturday afternoon, January 21–23, at Google’s headquarters in Mountain View, CA.

The Google GRAD CS Forum will include technical talks from established researchers – both from Google and universities – and a unique occasion to build and strengthen networks with other emerging researchers. Students will also enjoy tours of the Googleplex, have the opportunity to meet with Google engineers in their focus areas, and have fun exploring the San Francisco Bay Area.

Eligibility Requirements

Applicants must:
  • be a computer science (or related technical discipline) graduate student currently enrolled in a Master's or PhD program at a university in North America
  • demonstrate academic excellence and leadership in the computing field
  • maintain a cumulative GPA of at least 3.3 on a 4.0 scale or 4.3 on a 5.0 scale or equivalent in their current program
The forum is open to all qualified graduate students, and is committed to addressing diversity in our company and in the technology industry. Students who are members of groups that are historically underrepresented in the technology industry are encouraged to apply, including female, Native American, African American and Hispanic students, as well as students with disabilities.

Selection Process

Google engineers will select up to 75 attendees based on each applicant’s academic and technical achievements. Academic achievement and leadership experience should be evident from the resume.

How to Apply

Complete the online application and submit all required documents online. First-time users will be required to register and create an account. Please note that recommendation letters are not required.

Application Deadline: December 12, 2009

Apply now at www.google.com/jobs/students/gradforum.


Friday, December 4, 2009

Automatic Captioning in YouTube



On November 19, we launched our new automatic captioning and automatic alignment feature for YouTube. These features significantly reduce the effort it takes to create captions for videos on YouTube.

With YouTube expanding its index at a breakneck speed of about 20 hours of new material uploaded each minute, access to this vast body of video material becomes increasingly challenging. This is particularly true for people with hearing disabilities. A 2005 US census showed that 7.8 million people (or about 3 percent of the US population) have difficulty hearing a normal conversation, with 1 million unable to hear at all. Hence, increased accessibility by adding captions to YouTube videos makes the corpus available to a much larger audience.

In addition to expanded accessibility for those with hearing disabilities, the combination of captions with machine translation expands YouTube accessibility across the globe. If a caption track is available, it can be translated automatically into any of the 51 currently available languages. As a result, video content otherwise not accessible due to a language barrier can now be understood by a significantly larger user population.

Although captions are available in YouTube for hundreds of thousands of videos, that remains only a fraction of the available corpus. Furthermore, only a tiny fraction of the avalanche of new video material getting uploaded is captioned. One reason for this lack of coverage is the effort it takes for a video uploader to generate captions. And this is where our new auto captioning and auto alignment features can benefit our uploaders. Auto-captioning uses automatic speech recognition technology to produce machine generated captions. Auto-alignment requires only a transcript--the uploader no longer has to sync that text with the video stream. For a concise illustration of these features, check out our help center article or this short video:



Modern-day speech recognition systems are big statistical machines trained on large sets of data. They do the best job recognizing speech in domains similar to their training data. Both the auto captioning and the auto alignment features use the speech recognition infrastructure that underlies Google Voice and Voice Search, but trained on different data. As an initial installment, for YouTube we use models trained on publicly available English broadcast news data. As a result, for now, the new features only work well on English material that is similar in style (i.e. an individual speaker who is speaking clearly).
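
Auto-alignment, in particular, amounts to lining up the words of an uploader's transcript with the (possibly imperfect) words and timestamps the recognizer produced. Below is a minimal sketch of that idea on hypothetical data, using a standard edit-distance style alignment; a real aligner works on the audio itself and interpolates times for unmatched words.

```python
import difflib

# Hypothetical recognizer output: (word, start_time_in_seconds) pairs, with recognition errors.
recognized = [("let's", 0.0), ("go", 0.4), ("wreck", 0.7), ("a", 1.1), ("nice", 1.3), ("beach", 1.7)]
# The uploader's transcript (no timing information).
transcript = "let's go recognize speech".split()

rec_words = [w for w, _ in recognized]
matcher = difflib.SequenceMatcher(a=rec_words, b=transcript)

# Transfer timestamps from recognized words to the transcript words they match;
# in a real system, words between these anchors would get interpolated times.
aligned = []
for block in matcher.get_matching_blocks():
    for k in range(block.size):
        aligned.append((transcript[block.b + k], recognized[block.a + k][1]))

for word, start in aligned:
    print(f"{start:5.1f}s  {word}")
```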

The auto alignment feature is available for all new video uploads; however, the scope is limited to English material. The auto captioning feature is initially rolled out to a set of educational partners only. Although this is very limited in scope, the early launch makes the results of the system available to the viewers of this material instantly, and it allows us to gather early feedback that can aid in improving the features. We will release automatic captions more widely as quickly as possible.

Over time, we will work on improving the quality as well as the coverage of these features. Expansion will take place along two axes: additional languages will be made available and within each language we will cover much broader domains (beyond just broadcast news-like material). Since the content available in YouTube is so varied, it is difficult to set a timeline for this expansion. Automatic speech recognition remains challenging, in particular for the varied types of speech and background sounds and noise we see in the YouTube corpus. Therefore, to reach a high level of quality, we need to make advances in core technology. Although this will take time, we are committed to making that happen and to providing the larger community with the benefits of those developments.

Tuesday, December 1, 2009

Four Googlers elected ACM Fellows



I'm excited to share that the Association for Computing Machinery (ACM) has just announced that four Googlers have been elected ACM Fellows in its class of 2009. Jeff Dean, Tom Dean, Urs Hoelzle and Yossi Matias were chosen for their achievements in computer science and information technology and for their significant contributions to the mission of the ACM. Here at Google, we take great pride in having a tremendously talented workforce, and the talent of our team is exemplified by the addition of Jeff, Tom, Urs and Yossi to the six other ACM Fellows already at Google.

All of these Googlers are being recognized for successes both inside and outside Google. Urs' and Jeff's achievements are most directly related to innovations made while at Google, specifically in our large data centers and in harnessing their inherent parallel computation and vast storage. Tom and Yossi, on the other hand, were elected more for work done prior to Google — respectively, on how to use prediction in planning, control, and decision-making where there is both uncertainty and time constraints, and on theoretically and practically interesting techniques for analyzing and managing large data sets and data streams.

We at Google congratulate our colleagues. They serve as an inspiration to us and to our colleagues in computer science globally and remind us to continue to push the limits of computing, which has enormous benefits to our field and to society at large.

You can read more about Jeff, Tom, Urs and Yossi's achievements and the reasons for this recognition by the ACM below. The citations are the official ones from the ACM.

Jeff Dean, Google Fellow

For contributions to the science and engineering of large-scale distributed computer systems

Dr. Jeff Dean has made great contributions to the programming and use of loosely-coupled multiprocessing systems and cloud computing. Jeff is probably best known for his work (with Sanjay Ghemawat) on the parallel/distributed computing infrastructure called MapReduce, a tremendously influential programming model for batch jobs on loosely coupled multiprocessing systems. Working with others, Jeff has also been a leading contributor to many other Google systems: the BigTable record storage system, which reliably stores diverse data records (via partitioning and replication) in vast quantities, at least two production real-time indexing systems, and several versions of Google's web serving system. The breadth of Jeff's work is quite amazing: At Digital, he co-developed a leading Java compiler and the Continuous Profiling Infrastructure (DCPI). Beyond this core systems work, Jeff has had exceedingly diverse additional activities; for example, he co-designed Google's first ads serving system, made significant quality improvements to the search system, and even has been involved in user-visible efforts such as the first production version of Google News and the production implementation of Google's machine translation system. Despite his primary accomplishments as a designer and implementer of innovative systems that solve hard problems in a practical way, Jeff also has over 20 papers in peer-reviewed publications, more than 25 patents, and is one of Google's most sought-after public speakers.
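
For readers unfamiliar with the model, here is a toy, single-machine sketch of the MapReduce programming pattern (an illustration of the idea only, nothing like the real distributed implementation): the programmer supplies a map function and a reduce function, and the framework takes care of partitioning, shuffling and parallel execution.

```python
from collections import defaultdict

def map_fn(document):
    """Map: emit (word, 1) for every word in the document."""
    for word in document.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    """Reduce: sum all the counts emitted for the same word."""
    return word, sum(counts)

def mapreduce(documents, map_fn, reduce_fn):
    # Shuffle phase: group intermediate values by key.
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Reduce phase (in a real system, each key group could go to a different machine).
    return dict(reduce_fn(k, v) for k, v in grouped.items())

docs = ["the quick brown fox", "the lazy dog", "the quick dog"]
print(mapreduce(docs, map_fn, reduce_fn))  # {'the': 3, 'quick': 2, ...}
```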

Thomas L. Dean, Staff Research Scientist
For the development of dynamic Bayes networks and anytime algorithms

Dr. Tom Dean is known in AI for his work on the role of prediction in planning, control and decision-making where uncertainty and the limited time available for deliberation complicate the problem, particularly his work on temporal graphical models and their application in solving robotics and decision-support problems. His temporal Bayesian networks, later called dynamic Bayes networks, made it possible to factor very large state spaces and their corresponding transition probabilities into compact representations, using the tools and theory of graphical models. He was the first to apply factored Markov decision processes to robotics and, in particular, to the problem of simultaneous localization and map building (SLAM). Faced with the need to solve what were essentially intractable problems in real-time, Dean coined the name "anytime algorithm" to describe a class of approximate inference algorithms and the associated (meta) decision problem of deliberation scheduling to address the challenges of bounded-time decision making. These have been applied to large-scale problems at NASA, Honeywell, and elsewhere. At Google, Tom has worked on extracting stable spatiotemporal features from video and developed new, improved features for video understanding, categorization and ranking. During his twenty-year career as a professor at Brown University, he published four books and over 100 technical articles, while serving terms as department chair, acting vice president for computing and information services, and deputy provost.
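
A dynamic Bayes network in its simplest form is a hidden Markov model unrolled over time; the sketch below (a textbook-style example with made-up numbers, not Tom's models) shows the forward-filtering step such models use to maintain a compact belief over a hidden state as observations arrive.

```python
import numpy as np

# Two hidden states (say, "robot in room A" / "robot in room B") and two possible observations.
transition = np.array([[0.9, 0.1],    # transition[i, j] = P(next state j | current state i)
                       [0.2, 0.8]])
emission = np.array([[0.8, 0.2],      # emission[i, o] = P(observation o | state i)
                     [0.3, 0.7]])
belief = np.array([0.5, 0.5])         # initial belief over the hidden state

observations = [0, 0, 1, 1, 1]        # indices into the emission columns

for obs in observations:
    # Predict: push the belief through the transition model.
    belief = transition.T @ belief
    # Update: weight by the likelihood of the new observation, then renormalize.
    belief = belief * emission[:, obs]
    belief = belief / belief.sum()
    print(np.round(belief, 3))
```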

Urs Hoelzle, Senior Vice President of Engineering
For the design, engineering and operation of large scale cloud computing systems

Dr. Urs Hoelzle has made significant contributions to the literature, theory, and practice in many areas of computer science. His publications are found in areas such as compilers, software and hardware architecture, dynamic dispatch in processing systems, software engineering and garbage collection. Much of this work took place during his time at Stanford and later at UC Santa Barbara as a member of the faculty. Urs' most significant contribution to computer science and its application is found in his work and leadership at Google. Since 1999 he has had responsibility for leading engineering and operations of one of the largest systems of data centers and networks on the planet. That it has been able to scale up to meet the demands of more than a billion users during the past 10 years is an indication of his leadership ability and remarkable design talent. Urs works best in collaborative environments, as evidenced by his publications and in his work at Google. While it would be incorrect to credit Urs alone for the success of the Google computing and communications infrastructure, his ability to lead a large number of contributors to a coherent and scalable result is strong evidence of his qualification for advancement to ACM Fellow. The philosophy behind Google's system of clustered, distributed computing systems reflects a powerful pragmatism: assume things will break; use replication, not gold-plating, for resilience; reduce power requirements wherever possible; create general platforms that can be harnessed in myriad ways; eschew specialization except where vitally necessary (e.g., no commercial products fit the requirement). Much of this perspective can be attributed to Urs Hoelzle.

Yossi Matias, Director of R&D Center in Israel
For contributions to the analysis of large data sets and data streams

Dr. Yossi Matias has made significant contributions to the analysis of large data sets and data streams. He pioneered (with Phillip Gibbons) a new research direction into the study of small-space (probabilistic) “synopses” of large data sets, motivating their study and making key contributions in this area. Yossi’s 1996 paper (with Noga Alon and Mario Szegedy) won the 2005 Gödel Prize, the top ACM prize in Theoretical Computer Science, awarded annually. The award citation describes the paper as having “laid the foundations of the analysis of data streams using limited memory." Further, “It demonstrated the design of small randomized linear projections, subsequently referred to as ‘sketches,’ that summarize large amounts of data and allow quantities of interest to be approximated to user-specified precision.” Additionally, Yossi has made several key contributions to lossless data compression of large data sets, including a “flexible parsing” technique that improves upon the Lempel-Ziv dictionary-based compression algorithm, and novel compression schemes for images and for network packets. Large scale data analysis requires effective use of multi-core processors. For example, his JACM paper (with Guy Blelloch and Phillip Gibbons) provided the first provably memory- and cache-efficient thread scheduler for fine-grained parallelism. In addition to his academic and scientific impact, Yossi has been heavily involved in the high tech industry and in technology and product development, pushing the commercial frontiers for analyzing large data sets and data streams. He is also the inventor on 23 U.S. patents. Yossi joined Google in 2006 to establish the Tel-Aviv R&D Center, and to be responsible for its strategy and operation. Yossi has overall responsibility for Google R&D and technology innovation in Israel.
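
To give a flavor of the "sketch" idea behind that prize-winning line of work, here is a minimal version of the second-frequency-moment estimator (a simplified classroom rendering, not code from the paper): a handful of random ±1 counters summarize a stream well enough to estimate F2 = Σ f_i² without storing per-item counts.

```python
import random
from collections import Counter

random.seed(7)

stream = [random.choice("abcdefghij") for _ in range(10000)]

def ams_f2_estimate(stream, n_counters=64):
    """Estimate F2 = sum_i f_i^2 with tiny random +/-1 sketches (AMS-style)."""
    signs = [dict() for _ in range(n_counters)]   # random +/-1 per item; a real sketch uses hash functions
    z = [0] * n_counters                          # one running counter per sketch
    for item in stream:
        for k in range(n_counters):
            s = signs[k].setdefault(item, random.choice((-1, 1)))
            z[k] += s
    estimates = [zk * zk for zk in z]             # each z_k^2 has expectation F2
    return sum(estimates) / n_counters            # averaging reduces the variance

freqs = Counter(stream)
exact_f2 = sum(f * f for f in freqs.values())
print("exact F2:       ", exact_f2)
print("sketch estimate:", round(ams_f2_estimate(stream)))
```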

Monday, November 23, 2009

Explore Images with Google Image Swirl



Earlier this week, we announced the Labs launch of Google Image Swirl, an experimental search tool that organizes image-search results. We'd like to take this opportunity to explain some of the research underlying this feature, and why it is an important area of focus for computer vision research at Google.

As the Web becomes more "visual," it is important for Google to go beyond traditional text and hyperlink analysis to unlock the information stored in the image pixels. If our search algorithms can understand the content of images and organize search results accordingly, we can provide users with a more engaging and useful image-search experience.

Google Image Swirl represents a concrete step towards reaching that goal. It looks at the pixel values of the top search results and organizes and presents them in visually distinctive groups. For example, for ambiguous queries such as "jaguar," Image Swirl separates the top search results into categories such as jaguar the animal and jaguar the brand of car. The top-level groups are further divided into collections of subgroups, allowing users to explore a broad set of visual concepts associated with the query, such as the front view of a Jaguar car or the Eiffel Tower at night or from a distance. This is a distinct departure from the way images are ranked by Google Similar Images, which excels at finding images very visually similar to the query image.



No matter how much work goes into engineering image and text features to represent the content of images, there will always be errors and inconsistencies. Sometimes two images share many visual or text features, but have little real-world connection. In other cases, objects that look similar to the human eye may appear drastically different to computer vision algorithms. Most difficult of all, the system has to work at Web scale -- it must cover a large fraction of query traffic, and handle ambiguities and inconsistencies in the quality of information extracted from Web images.

In Google Image Swirl, we address this set of challenges by organizing all available information about an image set into a pairwise similarity graph, and applying novel graph-analysis algorithms to discover higher-order similarity and category information from this graph. Given the high dimensionality of image features and the noise in the data, it can be difficult to train a monolithic categorization engine that can generalize across all queries. In contrast, image similarities need only be defined for similar enough objects and trained with limited sets of data. Also, invariance to certain transformations or typical intra-class variation can be built into the perceptual similarity function. Different features or similarity functions may be selected, or learned, for different types of queries or image contents. Given a robust set of similarity functions, one can generate a graph (nodes are images and edges are similarity values) and apply graph analysis algorithms to infer similarities and categorical relationships that are not immediately obvious. In this work, we combined multiple sources of similarity such as those used in Google Similar Images, landmark recognition, Picasa's face recognition, anchor text similarity, and category-instance relationships between keywords similar to that in WordNet. It is a continuation of our prior effort [paper] to rank images based on visual similarity.
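
As a stripped-down illustration of the pairwise-similarity-graph step (our own toy example with made-up descriptors, not the Image Swirl algorithm), the snippet below builds a graph by thresholding a similarity matrix and reads off candidate groups as connected components.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical image descriptors drawn around two visual "concepts".
center_a, center_b = rng.normal(size=16), rng.normal(size=16)
group_a = center_a + 0.1 * rng.normal(size=(5, 16))
group_b = center_b + 0.1 * rng.normal(size=(5, 16))
features = np.vstack([group_a, group_b])

# Pairwise cosine similarity, then a graph with an edge wherever similarity is high.
unit = features / np.linalg.norm(features, axis=1, keepdims=True)
adjacency = (unit @ unit.T) > 0.8

def connected_components(adj):
    """Label each node with its connected component in the similarity graph."""
    n = len(adj)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if labels[node] != -1:
                continue
            labels[node] = current
            stack.extend(j for j in range(n) if adj[node][j] and labels[j] == -1)
        current += 1
    return labels

# Candidate image groups = connected components, e.g. [0, 0, 0, 0, 0, 1, 1, 1, 1, 1].
print(connected_components(adjacency))
```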

As with any practical application of computer vision techniques, there are a number of ad hoc details which are critical to the success of the system but are scientifically less interesting. One important direction of our future work will be to generalize some of the heuristics present in the system to make them more robust, while at the same time making the algorithm easier to analyze and evaluate against existing state-of-the-art methods. We hope that this work will lead to further research in the area of content-based image organization and look forward to your feedback.

UPDATE: Due to the shutdown of Google Labs, this service is no longer active.

Friday, November 13, 2009

The 50th Symposium on Foundations of Computer Science (FOCS)



The 50th Annual Symposium on Foundations of Computer Science (FOCS) was held a couple of weeks ago in Atlanta. This conference (along with STOC and SODA) is one of the major venues for recent advances in algorithm design and computational complexity. Computation is now a major ingredient of almost any field of science, without which many recent achievements would not have happened (e.g., decoding the human genome). Marking the 50th anniversary of FOCS, this event was a landmark in the history of the foundations of computer science. Below, we give a quick report of some highlights from this event and our research contribution:
  • In a special one-day workshop before the conference, four pioneer researchers of theoretical computer science talked about historical, contemporary, and future research directions. Richard Karp gave an interesting survey on "Great Algorithms," where he discussed algorithms such as the simplex method for linear programming and fast matrix multiplication; he gave examples of algorithms with high impact on our daily lives, as well as algorithms that changed our way of thinking about computation. As an example of an algorithm with great impact on our lives, he gave the PageRank algorithm designed by Larry and Sergey at Google. Mihalis Yannakakis discussed the recent impact of studying game theory and equilibria from a computational perspective and discussed the relationships between the complexity classes PLS, FIXP, and PPAD. In particular he discussed completeness of computing pure and mixed Nash equilibria for PLS, and for FIXP and PPAD respectively. Noga Alon gave a technical talk about efficient routing on expander graphs, and presented a clever combinatorial algorithm to route demand between multiple pairs of nodes in an online fashion. Finally, Manuel Blum gave an entertaining and mind-stimulating talk about the potential contribution of computer science to the study of human consciousness, educating the community on the notion of "Global Workspace Theory."
  • The conference program included papers in areas related to algorithm and data structure design, approximation and optimization, computational complexity, learning theory, cryptography, quantum computing, and computational economics. The best student paper awards went to Alexander Sherstov and Jonah Sherman for their papers "The intersection of two halfspaces has high threshold degree" and "Breaking the multicommodity flow barrier for O(sqrt(log n))-approximations to sparsest cut." The program included many interesting results like the polynomial-time smoothed analysis of the k-means clustering algorithm (by David Arthur, Bodo Manthey and Heiko Roeglin), and a stronger version of Azuma's concentration inequality used to show optimal bin-packing bounds (by Ravi Kannan). The former paper studies a variant of the well-known k-means algorithm that works well in practice, but whose worst-case running time can be exponential. By analyzing this algorithm in the smoothed analysis framework, the paper gives a new explanation for the success of the k-means algorithm in practice.
  • We presented our recent result about online stochastic matching in which we improve the approximation factor of computing the maximum cardinality matching in an online stochastic setting. The original motivation for this work is online ad allocation which was discussed in a previous blog post. In this algorithm, using our prior on the input (or our historical stochastic information), we compute two disjoint solutions to an instance that we expect to happen; then online, we try one solution first, and if it fails, we try the other solution. The algorithm is inspired by the idea of "power of two choices," which has proved useful in online load balancing and congestion control. Using this method, we improve the worst-case guarantee of the online algorithm past the notorious barrier of 1-1/e. We hope that employing this idea and our technique for online stochastic optimization will find other applications in related stochastic resource allocation problems. (A toy sketch of the two-solution idea follows this list.)
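
A toy rendering of that two-solution idea (our own illustration of its flavor, not the algorithm or guarantees from the paper): on a small ad-allocation-style instance we precompute two edge-disjoint matchings on the expected arrivals, then serve the actual arrivals by consulting the first matching and falling back to the second.

```python
import random

random.seed(11)

# Bipartite instance: each expected impression type can be shown one of several ads.
compatible = {
    "sports":  ["ad1", "ad2"],
    "travel":  ["ad2", "ad3"],
    "finance": ["ad3", "ad4"],
    "news":    ["ad1", "ad4"],
}

def greedy_matching(types, forbidden_edges=frozenset()):
    """Greedy maximal matching on the expected instance; good enough for a toy example."""
    matched_ads, matching = set(), {}
    for t in types:
        for ad in compatible[t]:
            if ad not in matched_ads and (t, ad) not in forbidden_edges:
                matching[t] = ad
                matched_ads.add(ad)
                break
    return matching

# Offline: two edge-disjoint solutions computed on the expected impression types.
expected_types = list(compatible)
plan1 = greedy_matching(expected_types)
plan2 = greedy_matching(expected_types, forbidden_edges=set(plan1.items()))

# Online: impressions arrive randomly; try plan1's ad first, then plan2's, each ad usable once.
used_ads, served = set(), 0
arrivals = [random.choice(expected_types) for _ in range(6)]
for t in arrivals:
    for plan in (plan1, plan2):
        ad = plan.get(t)
        if ad and ad not in used_ads:
            used_ads.add(ad)
            served += 1
            break

print("plan 1:", plan1)
print("plan 2:", plan2)
print("arrivals:", arrivals, "-> served", served, "impressions")
```
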
The FOCS conference (along with STOC and SODA) has been the birthplace for many popular data structures and efficient algorithms, with far-reaching applications. Many researchers and engineers at Google are trained in these research communities, and apply these techniques whenever possible. Google researchers will continue to contribute and learn from these conferences.

Thursday, November 12, 2009

A 2x Faster Web



Cross-posted with the Chromium Blog.

Today we'd like to share with the web community information about SPDY, pronounced "SPeeDY", an early-stage research project that is part of our effort to make the web faster. SPDY is at its core an application-layer protocol for transporting content over the web. It is designed specifically for minimizing latency through features such as multiplexed streams, request prioritization and HTTP header compression.
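
The header-compression piece is easy to appreciate with a few lines of code: HTTP request headers are highly repetitive text, so compressing them saves a meaningful amount per request. The sketch below uses made-up headers and plain zlib, not the actual SPDY framing, just to show the effect.

```python
import zlib

# A made-up but typical set of request headers; real browsers send similar text on every request.
headers = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/532.5 Chrome/4.0\r\n"
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
    "Accept-Encoding: gzip,deflate\r\n"
    "Accept-Language: en-US,en;q=0.8\r\n"
    "Cookie: session=abc123; prefs=compact\r\n"
    "\r\n"
).encode()

compressed = zlib.compress(headers)
print(len(headers), "bytes raw ->", len(compressed), "bytes compressed")
# Compressing many requests through a shared stream (as SPDY does) helps even more,
# because header names and values repeated across requests shrink to back-references.
```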

We started working on SPDY while exploring ways to optimize the way browsers and servers communicate. Today, web clients and servers speak HTTP. HTTP is an elegantly simple protocol that emerged as a web standard in 1996 after a series of experiments. HTTP has served the web incredibly well. We want to continue building on the web's tradition of experimentation and optimization, to further support the evolution of websites and browsers. So over the last few months, a few of us here at Google have been experimenting with new ways for web browsers and servers to speak to each other, resulting in a prototype web server and Google Chrome client with SPDY support.

So far we have only tested SPDY in lab conditions. The initial results are very encouraging: when we download the top 25 websites over simulated home network connections, we see a significant improvement in performance - pages loaded up to 55% faster. There is still a lot of work we need to do to evaluate the performance of SPDY in real-world conditions. However, we believe that we have reached the stage where our small team could benefit from the active participation, feedback and assistance of the web community.

For those of you who would like to learn more and hopefully contribute to our experiment, we invite you to review our early stage documentation, look at our current code and provide feedback through the Chromium Google Group.

Monday, November 2, 2009

Google Search by Voice Learns Mandarin Chinese



Google Search by Voice was released more than one year ago as a feature of Google Mobile App, our downloadable application for smartphones. Its performance has been improving consistently and it now understands not only US English, but also UK, Australian, and Indian-English accents. However, this is far from Google's goal to find information and make it easily accessible in any language.

So, almost one year ago a team of researchers and engineers at Google's offices in Bangalore, Beijing, Mountain View, and New York decided we had to fix this problem. Our next question was, which should be our first language to address beyond English? We could have chosen many languages. The decision wasn't easy, but once we looked carefully at demographics and internet populations the choice was clear--we decided to work on Mandarin.

Mandarin is a fascinating language. Over this year we have learned about the differences between traditional and simplified Chinese, tonal characteristics in Chinese, pinyin representations of Chinese characters, sandhi rules, the different accents and languages in China, Unicode representations of Chinese character sets...the list goes on and on. It has been a fascinating journey. The conclusion of all this work is today's launch of Mandarin Voice Search, as a part of Google Mobile App for Nokia S60 phones. Google Mobile App places a Google search widget on your Nokia phone's home screen, allowing you to quickly search by voice or by typing.



This is a first version of Mandarin search by voice and it is rough around the edges. It might not work very well if you have a strong southern Chinese accent for example, but we will continue working to improve it. The more you use it, the more it will improve, so please use it and send us your comments. And stay tuned for more languages. We know a lot of people speak neither English nor Mandarin!

To try Mandarin search by voice, download the new version of Google Mobile App on your Nokia S60 phone by visiting m.google.com from your phone's browser.

Monday, August 31, 2009

51 Languages in Google Translate



Are you using Google Translate to access the world's information? It can help you find and translate local restaurant and hotel reviews into your language when planning a vacation abroad, allow you to read the Spanish or French Editions of Google News, communicate with people who speak different languages using Google Translate chat bots, and more. We're constantly working to improve translation quality, so if you haven't tried it recently, you may be pleasantly surprised with what it can do now.

We're especially excited to announce that we've added 9 new languages to Google Translate: Afrikaans, Belarusian, Icelandic, Irish, Macedonian, Malay, Swahili, Welsh, and Yiddish, bringing the number of languages we support from 42 to 51. Since we can translate between any two of these languages, we offer translation for 2550 language pairs!

How do we decide which languages to add to Google Translate? Our goal is to provide automatic translation for as many languages as possible. So internally we've been collecting data and building systems for more than 100 languages. Whenever a set of languages meets our quality bar we consider it for our next language launch. We've found that one of the most important factors in adding new languages to our system is the ability to find large amounts of translated documents from which our system automatically learns how to translate. As a result, the set of languages that we've been able to develop is more closely tied to the size of the web presence of a language and less to the number of speakers of the language.

We're very happy that our technology allows us to produce machine translation systems for languages that often don't get the attention they deserve. For many of the newly supported languages ours is the only mature and freely available translation system. While translation quality in these languages will be noticeably rougher than for languages we've supported for a longer time like French or Spanish, it is most often good enough to give a basic understanding of the text, and you can be sure that the quality will get better over time.

Remember, you can also use Google Translate from inside other Google products. For example you can translate e-mails within GMail, translate web pages using Google Toolbar, translate RSS news feeds from around the world in Google Reader, and translate documents in Google Docs. (The new languages aren't available in these products yet but will be soon!) And, if you're translating content into other languages, you can use our technology within Google Translator Toolkit to help you translate faster and better. In the future, expect to find our translation technology in more places, making it increasingly simple to get access to information no matter what language it is written in.

Monday, August 17, 2009

On the predictability of Search Trends



Since launching Google Trends and Google Insights for Search, we've been providing daily insight into what the world is searching for. An understanding of search trends can be useful for advertisers, marketers, economists, scholars, and anyone else interested in knowing more about their world and what's currently top-of-mind.

As many have observed, the trends of some search queries are quite seasonal and have repeated patterns. See, for instance, how the search trends for the query "ski" hit their peak during the winter seasons in the US and Australia. The search trends for basketball correlate with annual league events, and are consistent year-over-year. When looking at trends of the aggregated volume of search queries related to particular categories, one can also observe regular patterns in some categories like Food & Drink or Automotive. Such trends sequences appear quite predictable, and one would naturally expect the patterns of previous years to repeat looking forward.

On the other hand, for many other search queries and categories, the trends are quite irregular and hard to predict. Examples include the search trends for obama, twitter, android, or global warming, and the trend of aggregate searches in the News & Current Events category.

Having predictable trends for a search query or for a group of queries could have interesting ramifications. One could forecast the trends into the future and use the forecast as a "best guess" for various business decisions such as budget planning, marketing campaigns and resource allocation. One could also identify deviations from such forecasts and detect new factors that are influencing the search volume, as demonstrated in Flu Trends.

We were therefore interested in the following questions:
  • How many search queries have trends that are predictable?
  • Are some categories more predictable than others? How is the distribution of predictable trends between the various categories?
  • How predictable are the trends of aggregated search queries for different categories? Which categories are more predictable and which are less so?
To learn about the predictability of search trends, and to overcome our basic limitation of not knowing what the future will entail, we characterize the predictability of a Trends series based on its historical performance. In other words, we estimate the a posteriori predictability of a sequence as determined by the error between the forecasted trends and the actual performance.

Specifically, we have used a simple forecasting model that learns basic seasonality and general trend. For each trends sequence of interest, we take a point in time, t, which is about a year back, compute a one year forecasting for t based on historical data available at time t, and compare it to the actual trends sequence that occurs since time t. The error between the forecasting trends and the actual trends characterizes the predictability level of a sequence, and when the error is smaller than a pre-defined threshold, we denote the trends query as predictable.
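
A stripped-down version of that procedure, on synthetic data (the model in the paper is more sophisticated): learn nothing fancier than "last year's seasonal shape plus a linear trend," forecast the held-out year, and call the series predictable if the mean absolute prediction error stays under a threshold.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic weekly "search volume": yearly seasonality + slow growth + noise (4 years, 52 weeks each).
weeks = np.arange(4 * 52)
series = 100 + 0.1 * weeks + 20 * np.sin(2 * np.pi * weeks / 52) + 3 * rng.normal(size=len(weeks))

# Hold out the final year; learn seasonality and trend only from the history before time t.
history, actual = series[:-52], series[-52:]
seasonal = history[-52:]                                   # last observed year as the seasonal shape
trend_per_week = (history[-52:].mean() - history[:52].mean()) / (len(history) - 52)
forecast = seasonal + 52 * trend_per_week                  # shift last year's shape by one year of trend

# Mean absolute prediction error, relative to the actual values.
mape = np.mean(np.abs(forecast - actual) / np.abs(actual))
predictable = mape < 0.10                                  # pre-defined threshold
print(f"mean absolute prediction error: {mape:.1%}, predictable: {predictable}")
```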

Our work to date is summarized in a paper called On the Predictability of Search Trends which includes the following observations:
  • Over half of the most popular Google search queries are predictable in a 12 month ahead forecast, with a mean absolute prediction error of about 12%.
  • Nearly half of the most popular queries are not predictable (with respect to the model we have used).
  • Some categories have a particularly high fraction of predictable queries; for instance, Health (74%), Food & Drink (67%) and Travel (65%).
  • Some categories have a particularly low fraction of predictable queries; for instance, Entertainment (35%) and Social Networks & Online Communities (27%).
  • The trends of aggregated queries per category are much more predictable: 88% of the aggregated category search trends of over 600 categories in Insights for Search are predictable, with a mean absolute prediction error of less than 6%.
  • There is a clear association between the existence of seasonality patterns and higher predictability, as well as an association between high levels of outliers and lower predictability. For the Entertainment category, which typically has less seasonal search behavior as well as a relatively higher number of singular spikes of interest, we have seen a predictability of 35%, whereas the Travel category, with very seasonal behavior and a lower tendency for short spikes of interest, had a predictability of 65%.
  • One should expect the actual search trends to deviate from forecast for many predictable queries, due to possible events and dynamic circumstances.
  • We show the forecasting vs actual for trends of a few categories, including some that were used recently for predicting the present of various economic indicators. This demonstrates how forecasting can serve as a good baseline for identifying interesting deviations in actual search traffic.
As we see that many of the search trends are predictable, we are introducing today a new forecasting feature in Insights for Search, along with a new version of the product. The forecasting feature is applied to queries which are identified as predictable (see, for instance, basketball or the trends in the Automotive category) and then shown as an extrapolation of the historical trends and search patterns.

There are many more questions that can be looked at regarding search trends in general, and their predictability in particular, including design and testing more advanced forecasting models, getting other insights into the distributions of sequences, and demonstrating interesting deviations of actual-vs-forecast for predictable trends series. We'd love to hear from you - share with us your findings, published results or insights - email us at insightsforsearch@google.com.

terça-feira, 11 de agosto de 2009

Under the Hood of App Inventor for Android



We recently announced our App Inventor for Android project on the Google Research Blog. That blog entry was long on vision but short on technological details--details which we think would be of interest to our readers.

Of particular interest is our use of Scheme. Part of our development environment is a visual programming language similar to Scratch. The visual language provides a drag-and-drop interface for assembling procedures and event handlers that manipulate high-level components of Android-based phones. The components are similar to the ones in the recently announced Simple; in fact, the code bases share an ancestor.

We parse the visual programming language into an S-expression intermediate language, which is a domain-specific language expressed as a set of Scheme macros, along with a Scheme runtime library. We did this for a few reasons:
  • S-expressions are easy to generate and read for both humans and machines.
  • Scheme macros are a convenient (albeit sometimes arcane) way to express S-expression based syntax.
  • Scheme is a small, powerful and elegant language well suited to describe and evaluate a large set of programming semantics. Additionally, it provides the flexibility that we require as our language and its semantics grow and develop.
  • Scheme expertise was readily available among our team.
  • A pre-existing tool, Kawa by Per Bothner, was available to create Android-compatible output from Scheme code.
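
To make the first two points concrete, here is a small, purely hypothetical Python sketch of the kind of translation involved: a block structure assembled in a visual editor (represented here as nested tuples) is serialized into an S-expression intermediate form that a Scheme runtime could then macro-expand. The block names and the shape of the output are our own illustrative inventions, not App Inventor's actual intermediate language.

    # Hypothetical sketch only: the block names and S-expression shapes below are
    # invented for illustration and are not App Inventor's real intermediate form.

    def to_sexp(block):
        """Recursively render a (operator, operand, ...) tuple as an S-expression."""
        if isinstance(block, tuple):
            return "(" + " ".join(to_sexp(b) for b in block) + ")"
        return block if isinstance(block, str) else repr(block)

    # An event handler assembled in the visual editor: when Button1 is clicked,
    # set Label1's text to "Hello".
    blocks = ("define-event", "Button1", "Click",
              ("set-property", "Label1", "Text", '"Hello"'))

    print(to_sexp(blocks))
    # (define-event Button1 Click (set-property Label1 Text "Hello"))
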
For now the project is just an experiment we're performing with a dozen colleges and universities, but we hope to eventually open up the development environment to wider use and to open-source parts of the code.

segunda-feira, 3 de agosto de 2009

Two Views from the 2009 Google Faculty Summit



[cross-posted with the Official Google Blog]

We held our fifth Computer Science Faculty Summit at our Mountain View campus last week. About 100 faculty members from schools in the Western Hemisphere attended the summit, which focused on a collection of technologies that serve to connect and empower people. Included in the agenda were presentations on technologies for automated translation of human language, voice recognition, responding to crises, power monitoring and collaborative data management. We also talked about technologies to make personal systems more secure, and how to teach programming — even using Android phones. You can see a more complete list of the topics in the Faculty Summit Agenda or check out my introductory presentation for more information.

I asked two of the faculty to share their perspectives on the summit, thinking their views might be more valuable than our own: Professor Deborah Estrin, a Professor of Computer Science at UCLA and an expert in large-scale sensing of environmental and other information, and Professor John Ousterhout, an expert in distributed operating systems and scripting languages.

Professor Estrin's perspective:

We all know that Google has produced a spectacular array of technologies and services that has changed the way we create, access, manage, share and curate information. A very broad range of people samples and experiences Google’s enhancements and new services on a daily basis. I, of course, am one of those minions, but last week I had the special opportunity to get a glimpse inside the hive while attending the 2009 Google Faculty Summit. I still haven't processed all of the impressions, facts, figures and URLs that I jotted down over the packed day-and-a-half-long gathering, but here are a few of the things that impressed me most:

  • The way Google launches production services while simultaneously making great advances in really hard technical areas such as machine translation and voice search, and how these two threads are fully intertwined and feed off of one another.
  • Their embrace of open source activities, particularly in the Android operating system and programming environment for mobiles. They also seed and sponsor all sorts of creative works, from K-12 computer science learning opportunities to the open data kit that supports data-gathering projects worldwide.
  • The company’s commitment to thinking big and supporting their employees in acting on their concerns and cares in the larger geopolitical sphere. From the creation of Flu Trends to the support of a new "Crisis Response Hackathon" (an event that Google, Microsoft and Yahoo are planning to jointly sponsor to help programmers find opportunities to use their technical skills to solve societal problems), Googlers are not just encouraged to donate dollars to important causes — they are encouraged to use their technical skills to create new solutions and tools to address the world's all-too-many challenges.

This was my second Google Faculty Summit — I previously attended in 2007. I was impressed by the 2007 Summit, but not as deeply as I was this year. Among other things, this year I felt that Googlers talked to us like colleagues instead of just visitors. The conversations flowed: Not once did I run up against the "Sorry, can't talk about that... you know our policy on early announcements". I left quite excited about Google's expanded role in the CS research ecosystem. Thanks for changing that API!

Professor Ousterhout's perspective:

I spent Thursday and Friday this week at Google for their annual Faculty Summit. After listening to descriptions of several Google projects and talking with Googlers and the other faculty attendees, I left with two overall takeaways. First, it's becoming clear that information at scale is changing science and engineering. If you have access to enormous datasets, it opens up whole new avenues for scientific discovery and for solving problems. For example, Google's machine translation tools take advantage of "parallel texts": documents that have been translated by humans from one language to another, with both forms available. By comparing the sentences from enormous numbers of parallel texts, machine translation systems can produce effective translations using simple probabilistic approaches. The results are better than any previous attempts at computerized translation, but only if there are billions of words available in parallel texts. Another example of using large-scale information is Flu Trends, which tracks the spread of flu by counting the frequency of certain search terms in Google's search engine; the data is surprisingly accurate and available more quickly than that from traditional approaches.

My second takeaway is that it's crucial to keep as much information as possible publicly available. It used to be that much of science and engineering was driven by technology: whoever had the biggest particle accelerator or the fastest computer had an advantage. From now on, information will be just as important as technology: whoever has access to the most information will make the most discoveries and create the most exciting new products. If we want to maintain the leadership position of the U.S., we must find ways to make as much information as possible freely available. There will always be vested commercial interests that want to restrict access to information, but we must fight these interests. The overall benefit to society of publishing information outweighs the benefit to individual companies from restricting it.

sexta-feira, 31 de julho de 2009

App Inventor for Android



At Google Research, we are making it easy to build mobile applications, and we're collaborating with faculty from a dozen colleges and universities to explore whether this could change the nature of introductory computing. With the support of Google University Relations, the faculty group will work together this fall to pilot courses where beginning students, including non-computer science majors, create Android applications that incorporate social networking, location awareness, and Web-based data collections.

Mobile applications are triggering a fundamental shift in the way people experience computing and use mobile phones. Ten years ago, people "went to the computer" to perform tasks and access the Internet, and they used a cell phone only to make calls. Today, smartphones let us carry computing with us, have become central to serving our communication and information needs, and have made the web part of all that we do. Ten years ago, people's use of computing was largely dissociated from real life. With the ubiquity of social networking, online and offline life are becoming fused. This fall's exploration is motivated by the vision that open mobile platforms like Android can bring some of that same change to introductory Computer Science, making it more about people and their interactions with others and with the world around them. It's a vision where young people—and everyone—can engage with the world of mobile services and applications as creators, not just consumers. Through this work, we hope to do the following:

  • Make mobile application development accessible to anyone.
  • Enhance introductory learning experiences in computing through the vehicle of Android’s open platform.
  • Encourage a community of faculty and students to share material and ideas for teaching and exploring.

The collaborative experiment kicked off with a three-day workshop at Google's Mountain View campus in June, where invited faculty shared their plans for the courses they will offer this fall. The group also got an advance look at App Inventor for Android, the prototype development platform that Google is working on and that the faculty and their students will use in their courses. App Inventor for Android lets people assemble Android applications by arranging "components" using a graphical drag-and-drop interface. One of the goals of the fall experiment is to further shape the system in response to the experience and feedback of students and faculty.

The schools participating in this fall's collaboration are Ball State University, University of Colorado Boulder, Georgia Tech, Harvard, Indiana University, Mills College, MIT, Olin College, University of California at Berkeley, University of Michigan, University of Queensland, University of San Francisco, and Wellesley College.

Questions or comments? Please send us feedback. We look forward to hearing from you!

quarta-feira, 22 de julho de 2009

Predicting Initial Claims for Unemployment Benefits



One of the strongest leading indicators of economic activity is the number of people who file for unemployment benefits. Macroeconomists Robert Gordon and James Hamilton have recently examined the historical evidence. According to Hamilton's summary: "...in each of the last six recessions, the recovery began within 8 weeks of the peak in new unemployment claims."

In an earlier blog post, we suggested that Google Trends/Insights for Search data could be useful in short-term predictions of economic variables. Given the importance of initial claims as a macroeconomic predictor, we thought it would be useful to try to forecast this economic metric. The initial claims data is available from the Department of Labor, while the Google Trends data for relevant categories is available here.

We applied the methodology outlined in our earlier paper: we built a model to forecast initial claims using past values of the time series, and then added the Google Trends variables to see how much they improved the forecast. We found a 15.74% reduction in mean absolute error for one-week-ahead, out-of-sample forecasts. Most economists would consider this to be a significant boost. Details of our analysis may be found in this paper.
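
For readers who want to experiment with their own data, the sketch below shows the general shape of such a comparison in Python (simplified, with placeholder data; it is not the exact specification from the paper): a baseline autoregressive model for weekly initial claims is compared against the same model augmented with a Google Trends category index, using rolling one-step-ahead out-of-sample forecasts.

    # Simplified sketch with synthetic placeholder data; not the paper's model.
    import numpy as np

    def one_step_mae(y, X, n_test):
        """Rolling one-step-ahead OLS forecasts; returns the mean absolute error."""
        errors = []
        for t in range(len(y) - n_test, len(y)):
            beta, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
            errors.append(abs(y[t] - X[t] @ beta))
        return float(np.mean(errors))

    # Placeholder series standing in for DOL initial claims and a related
    # Google Trends category index (replace with the real weekly data).
    rng = np.random.default_rng(0)
    weeks = 200
    claims = 300_000 + 5_000 * np.cumsum(rng.standard_normal(weeks))
    trends = 50 + 0.0001 * (claims - claims.mean()) + rng.standard_normal(weeks)

    y = claims[1:]                                       # value to predict (week t)
    lag = claims[:-1]                                    # last week's claims
    ones = np.ones_like(y)
    X_base = np.column_stack([ones, lag])                # AR(1) baseline
    X_trends = np.column_stack([ones, lag, trends[1:]])  # baseline + Trends index

    n_test = 52
    gain = 1 - one_step_mae(y, X_trends, n_test) / one_step_mae(y, X_base, n_test)
    print(f"MAE reduction from adding the Trends variable: {gain:.1%}")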

The bottom line is that initial claims have been generally declining from their peak and that, so far at least, the Google query data is forecasting further short term declines. It would be good news indeed if this particular Google trend continues.

terça-feira, 21 de julho de 2009

ACM EC Conference and Workshop on Ad Auctions



This month, the 10th ACM Conference on Electronic Commerce (EC 2009) and the 5th Workshop on Ad Auctions took place at Stanford University. This is one of the major forums for economists and computer scientists to share their ideas about mechanism design and algorithmic game theory. In addition to co-authoring several papers at the conference and workshops, Google contributed significantly by presenting tutorials.

Among the four tutorials given at the ACM EC conference, we participated in presenting two of them:
  • In a joint tutorial with Google researcher Muthu Muthukrishnan, we explored research problems in sponsored search inspired by taking the advertiser's perspective. Emphasizing a cross-disciplinary approach, we presented sample research directions in keyword selection, traffic prediction and bidding strategy, encouraging the research community to build upon known auction models in order to tackle these even more challenging domains. We explored in more detail specific examples of research in bidding strategies.
  • Moreover, in a joint tutorial with H. Roeglin, we presented results on the convergence of game dynamics, both to equilibria and to nearly-optimal solutions. This was a more algorithm-oriented variant of the tutorial at ICML (described in a previous blog post).

The Ad Auctions Workshop brought together many industry and academic research leaders to discuss ongoing challenges in online advertising. The topics presented at the workshop included the role of externalities in ad auctions, new truthful ad auction mechanisms with budget constraints, the efficiency loss of generalized second-price ad auctions, and complex combinatorial ad auctions. Google researchers co-organized the workshop, participated in the discussions, and contributed the following presentations:
  • Google's chief economist, Hal Varian, gave an enlightening invited talk about using Google Trends data for "predicting the present." In this work, Google Trends data is used to help improve forecasts of various economic time series. Examples illustrating this technique were drawn from the auto industry, real estate, and unemployment. He emphasized that Google Trends data is publicly available and encouraged people to use this data for their research.
  • We presented two papers in the workshop, one about optimal pricing mechanisms over social networks and one about using offline optimization in stochastic online ad allocation problems. In the latter talk, we presented an algorithm that computes an optimal offline solution for an "expected instance" and uses that solution as a signal in online decision making. Using the idea of the power of two choices from the CS literature, we gave a novel theoretical analysis of our method, improving the best previously known result, and we also offered some practical insight about using these methods in online ad allocation (a toy sketch of this general flavor follows below). The theoretical results will appear in the upcoming FOCS 2009 conference.
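
As a toy illustration of the flavor of this approach (with made-up inputs; this is a rough sketch, not the algorithm or the analysis from the paper), the code below computes an offline plan on an "expected instance" of query traffic that gives each query type a primary and a backup advertiser; online, each arriving query is served by the first of its two candidates that can still afford it.

    # Toy sketch only: invented inputs and a simplified planning rule, meant to
    # convey the "offline plan + two choices" flavor, not the paper's algorithm.
    import random

    bids = {                                   # bid of each advertiser per query type
        "shoes":  {"A": 1.0, "B": 0.8},
        "travel": {"A": 0.5, "B": 0.9},
    }
    budgets = {"A": 60.0, "B": 60.0}
    remaining = dict(budgets)

    # Offline "plan" computed on the expected instance: for each query type, rank
    # the advertisers by bid and keep the top two as primary and backup choices.
    # (A stand-in for the optimal offline solution used in the actual work.)
    plan = {q: sorted(b, key=b.get, reverse=True)[:2] for q, b in bids.items()}

    def serve(query):
        """Serve the query with the first of its two planned candidates that can pay."""
        for advertiser in plan[query]:
            price = bids[query][advertiser]
            if remaining[advertiser] >= price:
                remaining[advertiser] -= price
                return advertiser
        return None   # neither candidate can afford this query; drop it

    random.seed(0)
    for q in (random.choice(["shoes", "travel"]) for _ in range(200)):
        serve(q)
    print({a: round(budgets[a] - remaining[a], 1) for a in budgets})  # spend per advertiser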

Motivated by our various ad systems, there is a large research effort at Google around areas at the intersection of Economics, Computer Science and Machine Learning. The Ad Auctions Workshop and ACM Conference on Electronic Commerce are among the best forums for stimulating ideas and collaboration in these interdisciplinary areas.

terça-feira, 14 de julho de 2009

Google's Research Awards Program Update



When we think about innovation, it is easy to forget that it took about 55 years to spread automobile usage to 1/4 of the US population, ... 35 years for the telephone, ... 20 years for the radio, ... 15 years for the PC, ... 10 years for the cell phone, ... 7 years for the Internet (Council on Competitiveness, Innovate America, 2004).

Recognizing that innovation holds the key to many of the unique technical challenges we face, we remain committed to maintaining strong relations with the academic research community. Our Research Awards Program has experienced phenomenal growth. Of the total number of applications that we received since the program's inception in 2005, more than half were submitted in the past year. To cope with the increased level of interest worldwide, we reorganized the program to accept submissions three times per year: April 15th, August 15th, and December 15th. Proposals are evaluated by teams of engineers and researchers, who make recommendations for funding. We try to move fast. Investigators receive a response about three months after their submission.

Here are some highlights from a recent round of applications:

"Recognition and Modeling of Objects from Street View Scans of Cities"
Thomas Funkhouser, Princeton University
Professor Funkhouser aims to develop methods for automatic construction of semantically-labeled, detailed, and photorealistic 3D models of cities from Street View data. The main efforts will be towards the segmentation and recognition of small objects (e.g. mailboxes, fire hydrants, parking meters, etc.) in Lidar data based on shape classification and contextual reasoning. A second objective will be to construct seamless, photorealistic 3D models of complete cities by extracting and fitting parts from repositories of polygonal models.

"An Ad Auctions Trading Agent Competition"
Michael Wellman, University of Michigan
The University of Michigan will introduce and operate a new game in the Trading Agent Competition (TAC) series of research competitions, in the domain of sponsored search. The TAC Ad Auctions (TAC/AA) game challenges participants to develop bidding strategies for advertisers in a simulated retail home entertainment market. The aim is to spur research and generate insights about advertiser bidding strategy, in a scenario more complex than those considered in the research literature to date. The TAC/AA environment features multiple interrelated keywords, a structured search user model, rich data availability, and a dynamic market context. Since 2000, the annual TAC series has catalyzed research on trading agent design and analysis, produced by a diverse group of researchers from academia and industry.

"A Suite of Automated Tools for Efficient Management and Search in Web-based Arabic Documents"
Adnan Yahya, Birzeit University, Palestine.
This research aims to design text mining and processing tools that are able to efficiently index, process, search, and categorize large quantities of Arabic data. It addresses the challenges Arabic poses for NLP and information retrieval, automatic Arabic document categorization, root extraction, language detection, and Arabic query correction, suggestion and expansion. The PIs employ a statistical, corpus-based approach using contemporary data initially obtained from a local newspaper.

"When Children Search: Understanding what they do and what they could do with Google Search"
Allison Druin, University of Maryland
Children ages 5-13 are among the most frequent users of the Internet; yet searching and browsing the web can present many challenges. Spelling, typing, query formulation, and deciphering results are all barriers for children attempting to find the information they need. Professor Druin is trying to understand these issues across a more diverse range of children's ages by focusing on current and ubiquitous search tools, namely keyword-based web search engines.

"An Educational Camera for Kids"
Shree Nayar, Columbia University
Professor Nayar is designing a novel digital camera that could be used as an innovative educational medium. His target audience is students between the ages of 10 and 13 living in poor communities across the globe. The camera, named “Bigshot,” will be presented to students as a kit to expose them to diverse science and engineering concepts. Once assembled, the camera will let students share their photos with students in other cultures using Picasa and Google Groups.

"Discovering semantic concepts and their relations in large image collections"
Bernt Schiele, Technische Universität Darmstadt, Germany
Professor Schiele will investigate to what extent meaningful structures can be discovered from large sets of images, both in a fully unsupervised fashion and with minimal human supervision. To this end, Professor Schiele's work will first try to discover structure and learn multi-feature distance metrics in large collections of images, and then enrich that structure with weak annotations in order to link the discovered structure and derive semantic concepts.

"Dataspace Metrics: Measuring Progress for Pay-as-you-go Information Integration"
Michael Franklin, University of California
The goal of this project is to develop a measurement framework for gauging progress in terms of the quality and accuracy of information integration. The starting point is the development of a set of metrics for judging the “goodness” of information integration across a number of information types and use cases. These metrics will then be analyzed and, where possible, unified, so that a more general measurement framework can be developed. Such a framework will serve as a key component for future Dataspace management systems, and could provide a grounding for other collaborative information integration solutions.

For more information about this program, including submission guidelines, please visit the Research Awards Program page.

quinta-feira, 2 de julho de 2009

International Conference on Machine Learning (ICML 2009) in Montreal



The 26th International Conference on Machine Learning (ICML 2009) was recently held in Montreal in conjunction with the 22nd Conference On Learning Theory (COLT 2009) and the 25th Conference on Uncertainty in Artificial Intelligence (UAI 2009). This is one of the major forums for researchers from both industry and academia to share the recent developments in the area of machine learning and artificial intelligence. Machine learning is a central area for Google as it has many applications in extracting useful information from a vast amount of data available on the web. In addition to sponsoring this scientific event, Google contributed intellectually to several scientific forums. Here's a short report of those activities:
  • There were ten papers co-authored by Googlers in these conferences, covering several areas of machine learning including domain adaptation, online learning, bandits, boosting, sparsity and kernel learning.
  • Corinna Cortes, the head of Google Research NY, gave one of the three invited talks at ICML. She surveyed the last decade of research in learning kernels and highlighted both the successes and the failures, with a focus on applications of convex optimization for this purpose. Corinna concluded with a call for applying new ideas and novel techniques to overcome the current obstacles.
  • We presented a tutorial on Convergence of Natural Game Dynamics. This topic has received a lot of attention recently, as it stands at the confluence of many fields such as economics, machine learning and theoretical computer science. In the tutorial, we surveyed the convergence properties of game dynamics, ranging from the most natural ones, such as Nash dynamics and best-response dynamics, to the popular no-regret learning-based dynamics. The tutorial highlighted similarities and differences between the approaches in the time of convergence, the point of convergence, and the quality of the outcome (a toy example of best-response dynamics is sketched below). We believe that the influence of the learning algorithms on the behavior of the users is an exciting and intriguing topic of research for many, and in particular for the analysis of ad auctions.
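
As a toy example of the most basic of these dynamics (our own illustration, not material from the tutorial), the sketch below runs best-response dynamics in a tiny two-player congestion game; the dynamics converge quickly to a Nash equilibrium in which the players use different links.

    # Toy example: best-response dynamics in a two-player, two-link congestion
    # game. Each player pays 1 plus the number of other players on its link, and
    # players repeatedly switch to their best response until no one can improve.

    def cost(link, other_link):
        return 1 + (1 if link == other_link else 0)   # congestion cost

    choices = [0, 0]          # both players start on link 0
    improved = True
    while improved:
        improved = False
        for i in (0, 1):
            other = choices[1 - i]
            best = min((0, 1), key=lambda link: cost(link, other))
            if cost(best, other) < cost(choices[i], other):
                choices[i] = best
                improved = True

    print(choices)   # [1, 0]: the players split across the links, a Nash equilibrium
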
Google's main mission is "to organize the world's information and make it universally accessible and useful," and machine learning plays a fundamental role in both of these aspects. As a result, Google has invested significant resources in this area of research, and we look forward to continued participation and collaboration at these conferences for many more years.

terça-feira, 23 de junho de 2009

Speed Matters



At Google, we've gathered hard data to reinforce our intuition that "speed matters" on the Internet. Google runs experiments on the search results page to understand and improve the search experience. Recently, we conducted some experiments to determine how users react when web search takes longer. We've always viewed speed as a competitive advantage, so this research is important to understand the trade-off between speed and other features we might introduce. We wanted to share this information with the public because we hope it will give others greater insight into how important speed can be.

Speed as perceived by the end user is driven by multiple factors, including how fast results are returned and how long it takes a browser to display the content. Our experiments injected server-side delay to model one of these factors: extending the processing time before and during the time that the results are transmitted to the browser. In other words, we purposefully slowed the delivery of search results to our users to see how they might respond.

All other things being equal, more usage, as measured by number of searches, reflects more satisfied users. Our experiments demonstrate that slowing down the search results page by 100 to 400 milliseconds has a measurable impact on the number of searches per user of -0.2% to -0.6% (averaged over four or six weeks depending on the experiment). That's 0.2% to 0.6% fewer searches for changes under half a second!

Furthermore, users do fewer and fewer searches the longer they are exposed to the experiment. Users exposed to a 200 ms delay since the beginning of the experiment did 0.22% fewer searches during the first three weeks, but 0.36% fewer searches during the second three weeks. Similarly, users exposed to a 400 ms delay since the beginning of the experiment did 0.44% fewer searches during the first three weeks, but 0.76% fewer searches during the second three weeks. Even if the page returns to the faster state, users who saw the longer delay take time to return to their previous usage level. Users exposed to the 400 ms delay for six weeks did 0.21% fewer searches on average during the five week period after we stopped injecting the delay.
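
As a small illustration of how such a comparison is computed (with hypothetical, randomly generated per-user counts rather than our experiment logs), the sketch below reports the relative change in mean searches per user between a control group and a group shown delayed results.

    # Illustrative sketch with synthetic placeholder data, not our actual logs.
    import numpy as np

    def relative_change(control_counts, treatment_counts):
        """Percent change in mean searches per user, treatment vs. control."""
        c, t = np.mean(control_counts), np.mean(treatment_counts)
        return 100.0 * (t - c) / c

    # Placeholder per-user search counts over a three-week window; the treatment
    # mean is set slightly lower to mimic the ~0.44% drop quoted above.
    rng = np.random.default_rng(1)
    control = rng.poisson(30.0, size=100_000)
    delayed_400ms = rng.poisson(30.0 * (1 - 0.0044), size=100_000)

    print(f"{relative_change(control, delayed_400ms):+.2f}% searches per user")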

While these numbers may seem small, a daily impact of 0.5% is of real consequence at the scale of Google web search, or indeed at the scale of most Internet sites. Because the cost of slower performance increases over time and persists, we encourage site designers to think twice about adding a feature that hurts performance if the benefit of the feature is unproven. To learn more about how to improve the performance of your website, visit code.google.com/speed. For more details on our experiments, download this PDF.