The document outlines an approach to scalable network services in Java using event-driven and non-blocking I/O. It discusses using the reactor pattern to handle I/O events asynchronously by dispatching tasks to handlers. This allows for high performance by reducing blocking and leveraging available resources like CPUs. It provides examples of how this can be implemented using Java's NIO APIs including channels, buffers, selectors and selection keys.
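The original material uses Java NIO (channels, buffers, selectors, selection keys), but the reactor structure it describes is easy to sketch with Python's stdlib `selectors` module: register a handler against a channel, wait for readiness, dispatch. A minimal single-turn sketch (a socketpair stands in for a real server socket; all names are invented):

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def on_readable(conn):
    # Handler: invoked by the reactor when its channel is ready to read.
    data = conn.recv(1024)
    if data:
        conn.sendall(data.upper())  # echo back, transformed

# A socketpair stands in for a listening socket plus a connected client.
server_end, client_end = socket.socketpair()
server_end.setblocking(False)
sel.register(server_end, selectors.EVENT_READ, data=on_readable)

client_end.sendall(b"ping")

# One turn of the event loop: wait for readiness, dispatch to the handler.
for key, _events in sel.select(timeout=1):
    key.data(key.fileobj)

reply = client_end.recv(1024)
```

The point of the pattern is visible even at this size: the loop owns the waiting, and handlers stay non-blocking.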
This document summarizes Hadoop MapReduce, including its goals of distribution and reliability. It describes the roles of mappers, reducers, and other system components like the JobTracker and TaskTracker. Mappers take input key-value pairs and produce intermediate output. Reducers receive intermediate keys and values to produce the final output. The JobTracker manages jobs and scheduling while the TaskTracker manages tasks on each node.
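The mapper/shuffle/reducer flow can be illustrated with an in-memory word count. This is not Hadoop itself, and the JobTracker/TaskTracker machinery is omitted entirely; it only shows the data flow between the roles the summary names:

```python
from collections import defaultdict

def mapper(_, line):
    # Emit (word, 1) for every word, mirroring the classic word-count mapper.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # Receive one intermediate key and all its values; emit the final pair.
    yield word, sum(counts)

def run_job(records):
    # Shuffle phase: group intermediate values by key, as the framework would.
    groups = defaultdict(list)
    for key, line in records:
        for k, v in mapper(key, line):
            groups[k].append(v)
    return dict(kv for k in sorted(groups) for kv in reducer(k, groups[k]))

result = run_job(enumerate(["the quick fox", "the lazy dog"]))
```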
This document provides an overview and introduction to Distributed Ruby (DRb), which enables remote method invocation for Ruby. DRb allows peer-to-peer communication between client and server processes, works across multiple platforms and protocols, and can be used for parallelism, distributing work, and interprocess communication. Examples are provided demonstrating basic DRb usage including a server hosting a shared object and a client accessing it. Details are also given on connection types, security features, and related technologies like Rinda that build on DRb.
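DRb itself is Ruby, but the shared-object pattern in its examples (a server hosts a front object, a client calls it through a proxy) has a rough stdlib analogue in Python's `multiprocessing.managers`. A sketch, with the class, port choice, and authkey all invented:

```python
import threading
from multiprocessing.managers import BaseManager

class Stack:
    """The shared object the server exposes, like DRb's front object."""
    def __init__(self):
        self.items = []
    def push(self, x):
        self.items.append(x)
    def pop(self):
        return self.items.pop()

shared = Stack()

class StackManager(BaseManager):
    pass

StackManager.register("stack", callable=lambda: shared)

# Server side: bind to an OS-assigned port and serve in the background.
manager = StackManager(address=("127.0.0.1", 0), authkey=b"demo")
server = manager.get_server()
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: connect and call methods on the remote object via a proxy.
client = StackManager(address=server.address, authkey=b"demo")
client.connect()
proxy = client.stack()
proxy.push("hello")
value = proxy.pop()
```

As with DRb, method calls on the proxy are forwarded over the wire to the single shared instance on the server.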
This document summarizes multi-core computer architectures. It discusses how single-core CPUs are being replaced by multi-core chips that contain multiple processor cores on a single die. Each core can run threads in parallel for improved performance. The cores share the same memory and socket. Operating systems see each core as a separate processor. Issues around cache coherence and programming for multi-core architectures are also covered at a high level.
The document provides background information on author Aldous Huxley and his most famous work, Brave New World. It summarizes that Huxley wrote Brave New World in 1932 as a dystopian novel and political satire set in a future society that strives for stability and happiness through genetic engineering and conditioning. Some of the key characters described include Bernard Marx, John the Savage, Lenina Crowne, and Mustapha Mond.
1. The document discusses the problem of social networks not being portable and the need to leverage the existing "social graph" of connections between users across different sites.
2. The author proposes a system that would allow a user's friends to follow them across different social networks and track new connections over time without having to repeatedly invite friends.
3. The system would map the relationships between a user's accounts across multiple sites to find equivalent nodes and all of a user's aggregate friends in order to suggest new connections.
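The aggregation step in point 3 can be sketched: map each per-site account to a canonical person, then union friend edges across sites. All handles and site names below are invented for illustration:

```python
# Hypothetical account mapping: the same person is known by different
# handles on different networks.
same_person = {
    ("siteA", "alice_a"): "alice",
    ("siteB", "alice88"): "alice",
    ("siteA", "bob_a"): "bob",
    ("siteB", "bob_b"): "bob",
    ("siteB", "carol_b"): "carol",
}

friend_edges = [
    (("siteA", "alice_a"), ("siteA", "bob_a")),
    (("siteB", "alice88"), ("siteB", "carol_b")),
]

def aggregate_friends(person):
    # Collapse per-site accounts to canonical people, then union the
    # friend lists found on every network.
    friends = set()
    for a, b in friend_edges:
        pa, pb = same_person[a], same_person[b]
        if pa == person:
            friends.add(pb)
        elif pb == person:
            friends.add(pa)
    return friends

agg = aggregate_friends("alice")
```

Once the aggregate friend set exists, any network where a pair is not yet connected becomes a candidate suggestion.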
The document discusses the evolution of web applications from vanilla pages to AJAX, and strategies for building a model of user behavior in AJAX applications. It recommends building a naïve initial model, then validating and refining it using metrics like responsiveness and efficiency. The key is to pre-fetch data only if the value of reduced latency outweighs the cost of downloading data that may not be needed.
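That trade-off can be phrased as a toy expected-value rule. All parameters here, including the kilobyte-to-millisecond conversion factor, are invented for illustration:

```python
def should_prefetch(p_use, latency_saved_ms, size_kb, cost_per_kb_ms=0.5):
    """Prefetch only when the expected latency saved outweighs the
    expected cost of bytes downloaded in vain.

    cost_per_kb_ms is an invented tuning knob that converts wasted
    kilobytes into 'pain' comparable to milliseconds of latency.
    """
    expected_gain = p_use * latency_saved_ms
    expected_waste = (1 - p_use) * size_kb * cost_per_kb_ms
    return expected_gain > expected_waste

likely = should_prefetch(0.9, 300, 50)    # probably needed: worth it
unlikely = should_prefetch(0.05, 300, 50) # rarely needed: skip it
```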
Closures for Java and other thoughts on language evolution
The document discusses goals for language changes including simplifying programs, reducing bugs, and adapting to changing requirements like multicore processors and concurrency. It proposes adding closures to Java to help meet these goals by allowing for more concise code, avoiding repetition, and making programming with concurrency easier. Specific examples are given of how closures could help implement common patterns like try-with-resources and iteration in a more readable way while also enabling flexibility for programmers to extend the language.
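Java later gained lambdas, but the control-abstraction argument is easiest to see in a language that already has closures. A Python sketch of the "execute around a resource" pattern the proposal motivates, where the resource logic is written once and the caller supplies the body as a closure (all names invented):

```python
def with_resource(open_fn, close_fn, body):
    # The open/close bookkeeping lives here, once; callers only
    # supply the interesting part as a closure.
    resource = open_fn()
    try:
        return body(resource)
    finally:
        close_fn(resource)

log = []
result = with_resource(
    open_fn=lambda: log.append("open") or "handle",
    close_fn=lambda r: log.append("close"),
    body=lambda r: r.upper(),
)
```

Without closures, every caller repeats the try/finally scaffolding; with them, the library can guarantee cleanup even when the body raises.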
A Content-Driven Reputation System for the Wikipedia
This document discusses a content-driven reputation and text trust system for Wikipedia. It aims to encourage lasting contributions by having authors gain reputation for edits that survive over time and lose reputation for reverted edits. Text trust is computed based on the reputation of its authors, with new text initially having lower trust that increases as it survives edits from higher-reputation authors. The system was tested on entire Wikipedia language editions and showed predictive power, with low-reputation edits more likely to be reverted.
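A toy version of the update rule described above. The step sizes and the trust scaling are invented; the actual system uses much more careful, edit-distance-based measures of how much of an edit survives:

```python
def update_reputation(rep, author, survived):
    # Gain reputation for an edit that survives later revisions,
    # lose more for one that gets reverted (asymmetry is deliberate).
    rep[author] = rep.get(author, 0) + (1 if survived else -2)

rep = {}
for author, survived in [("a", True), ("a", True), ("b", False)]:
    update_reputation(rep, author, survived)

def trust(rep_value, max_rep=10):
    # New text inherits trust from its author's (clamped) reputation,
    # scaled into [0, 1]; trust then grows as the text survives edits.
    return max(0, min(rep_value, max_rep)) / max_rep

t_a, t_b = trust(rep["a"]), trust(rep["b"])
```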
1) Matrix-vector multiplication (Ax = b) can be viewed as a change of axes that transforms the input vector x into the output vector b.
2) Singular value decomposition (SVD) finds the orthogonal input and output basis vectors (columns of U and V matrices) and the scaling factors between them (diagonal elements of S matrix).
3) SVD(A) = USV^T provides a way to understand how a matrix stretches and transforms space by decomposing it into orthogonal basis vectors and scale factors linking the input and output spaces.
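Written out, the three points amount to:

```latex
A = U S V^{\mathsf{T}}, \qquad
A x = U \, S \, (V^{\mathsf{T}} x)
```

Reading the right-hand side from the inside out: V^T expresses x in the orthonormal input basis (the columns of V), S scales each coordinate by a non-negative singular value, and U maps the result onto the orthonormal output basis (the columns of U).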
This document discusses Mongrel, an HTTP server library written in Ruby that provides a fast and flexible way to run web applications. It summarizes Mongrel's key features and compares its performance to Rails. It also discusses how Merb was created as a cleaner implementation of Rails that takes advantage of Mongrel's speed by running as a Mongrel handler rather than a standalone framework.
The document compares different state-of-the-art collaborative filtering systems. It finds that item-based collaborative filtering performs best, with a mean absolute error of 0.6382 using probabilistic similarity and 400 neighbors. User-based approaches work best with 1500 neighbors and predicting using deviation from the mean. Cluster-based approaches perform worst, with the highest error (0.6736) using K-means clustering into 4 clusters and predicting with Bayes. Item-based approaches require fewer neighbors and scale better to large datasets.
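The "deviation from the mean" prediction used by the best user-based configuration can be sketched as follows. The ratings and similarity values below are toy data, invented for illustration:

```python
def predict(user, item, ratings, sims):
    # Predict as the user's own mean plus a similarity-weighted average
    # of how far each neighbor's rating deviates from that neighbor's mean.
    mu_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for v, sim in sims[user].items():
        if item in ratings[v]:
            mu_v = sum(ratings[v].values()) / len(ratings[v])
            num += sim * (ratings[v][item] - mu_v)
            den += abs(sim)
    return mu_u + (num / den if den else 0.0)

ratings = {
    "u1": {"i1": 4, "i2": 2},
    "u2": {"i1": 5, "i2": 3, "i3": 5},
    "u3": {"i1": 2, "i2": 2, "i3": 1},
}
sims = {"u1": {"u2": 0.9, "u3": 0.4}}
p = predict("u1", "i3", ratings, sims)
```

Using deviations rather than raw ratings corrects for neighbors who rate systematically high or low.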
Item-Based Collaborative Filtering Recommendation Algorithms
The document summarizes research on item-based collaborative filtering recommendation algorithms. It analyzes techniques for computing item-item similarities and generating recommendations from the similarities. Experimental results show that item-based collaborative filtering provides better quality recommendations than user-based approaches, especially for sparse datasets. The regression-based prediction computation technique outperforms the weighted sum approach.
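A minimal sketch of the two stages, with plain cosine similarity standing in for the similarity variants the paper evaluates, and the weighted-sum prediction (toy ratings, invented for illustration):

```python
from math import sqrt

def cosine(a, b, ratings):
    # Item-item similarity over the users who rated both items.
    users = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    if not users:
        return 0.0
    dot = sum(ratings[u][a] * ratings[u][b] for u in users)
    na = sqrt(sum(ratings[u][a] ** 2 for u in users))
    nb = sqrt(sum(ratings[u][b] ** 2 for u in users))
    return dot / (na * nb)

def predict(user, item, ratings):
    # Weighted sum over the items the user has already rated.
    num = den = 0.0
    for j, r in ratings[user].items():
        s = cosine(item, j, ratings)
        num += s * r
        den += abs(s)
    return num / den if den else 0.0

ratings = {
    "u1": {"i1": 5, "i2": 4},
    "u2": {"i1": 4, "i2": 5, "i3": 4},
    "u3": {"i1": 1, "i2": 1, "i3": 2},
}
p = predict("u1", "i3", ratings)
```

Because item-item similarities change slowly, they can be precomputed offline, which is where the scalability advantage over user-based approaches comes from.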
The survey received 781 responses from IT professionals regarding their experience with agile techniques. Key findings include:
- 85% of organizations had adopted at least one agile technique
- When asked about future adoption, most said within a year or were considering it
- Pair programming and test-driven development were among the most commonly adopted techniques
- Over 70% of co-located agile projects were considered successful
The document provides an overview of the Cool programming language and the compiler project. It discusses the main components of a compiler, including the frontend, intermediate representation, and backend. It describes Cool's features, such as classes, methods, inheritance, and memory management through garbage collection. The project involves implementing, in C++ across multiple assignments, a complete compiler that translates Cool programs to MIPS assembly.
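This is not Cool or MIPS, but the frontend → IR → backend shape is visible even in a toy Python expression compiler targeting a two-instruction stack machine (everything here is invented for illustration):

```python
import re

def tokenize(src):
    # Frontend, step 1: lexing.
    return re.findall(r"\d+|[+*()]", src)

def parse(tokens):
    # Frontend, step 2: recursive-descent parse into an AST (the IR).
    # Precedence: * binds tighter than +.
    def expr(i):
        node, i = term(i)
        while i < len(tokens) and tokens[i] == "+":
            rhs, i = term(i + 1)
            node = ("+", node, rhs)
        return node, i
    def term(i):
        node, i = atom(i)
        while i < len(tokens) and tokens[i] == "*":
            rhs, i = atom(i + 1)
            node = ("*", node, rhs)
        return node, i
    def atom(i):
        if tokens[i] == "(":
            node, i = expr(i + 1)
            return node, i + 1  # skip ")"
        return ("num", int(tokens[i])), i + 1
    return expr(0)[0]

def codegen(node, out):
    # Backend: walk the IR and emit stack-machine instructions.
    if node[0] == "num":
        out.append(("push", node[1]))
    else:
        codegen(node[1], out)
        codegen(node[2], out)
        out.append(("add" if node[0] == "+" else "mul", None))
    return out

def run(code):
    # Tiny VM standing in for the MIPS target.
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
    return stack[0]

result = run(codegen(parse(tokenize("2+3*4")), []))
```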
Improving Quality of Search Results Clustering with Approximate Matrix Factorizations
This document discusses improving search results by clustering them into semantic groups. It describes some limitations of conventional ranked search results, such as users having to sift through many irrelevant documents. The document proposes clustering search results to group related documents together under meaningful labels to give users a better overview. It describes how search results clustering works using only short document snippets as input. Matrix factorizations are discussed as a method to identify good cluster labels from the snippets by decomposing a term-document matrix.
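A deliberately tiny sketch of the idea: build a term-snippet matrix and use power iteration, a crude rank-1 stand-in for the approximate factorizations the paper studies, to surface label candidates for the dominant snippet group. The snippets are invented:

```python
snippets = [
    "apache hadoop mapreduce tutorial",
    "hadoop mapreduce job example",
    "java swing gui tutorial",
]
vocab = sorted({w for s in snippets for w in s.split()})

# Term-snippet frequency matrix A: rows are terms, columns are snippets.
A = [[s.split().count(t) for s in snippets] for t in vocab]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Power iteration on A A^T approximates the leading left singular vector;
# its heaviest entries are terms that could label the dominant group.
At = transpose(A)
u = [1.0] * len(vocab)
for _ in range(50):
    u = matvec(A, matvec(At, u))
    norm = sum(x * x for x in u) ** 0.5
    u = [x / norm for x in u]

top_terms = [term for _, term in sorted(zip(u, vocab), reverse=True)[:2]]
```

Even on four-word snippets, the heaviest terms pick out the theme shared by the first two results rather than a word like "tutorial" that straddles both groups.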
This document summarizes support vector machines (SVMs), a machine learning technique for classification and regression. SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples in the training data. This is achieved by solving a convex optimization problem that minimizes a quadratic function under linear constraints. SVMs can perform non-linear classification by implicitly mapping inputs into a higher-dimensional feature space using kernel functions. They have applications in areas like text categorization due to their ability to handle high-dimensional sparse data.
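The formulation described above solves the dual QP; as a rough runnable sketch, the same maximum-margin objective can be attacked in the primal with sub-gradient descent on the hinge loss. This is a different solver for the same linear objective, with toy data and invented hyper-parameters:

```python
# Linearly separable toy data: +1 in the upper right, -1 in the lower left.
data = [((2, 2), 1), ((3, 3), 1), ((2, 3), 1),
        ((-2, -2), -1), ((-3, -3), -1), ((-3, -2), -1)]

w, b = [0.0, 0.0], 0.0
lam, lr = 0.01, 0.1          # L2 regularization strength, learning rate
for epoch in range(200):
    for (x1, x2), y in data:
        margin = y * (w[0] * x1 + w[1] * x2 + b)
        if margin < 1:
            # Inside the margin: sub-gradient of hinge loss + regularizer.
            w[0] += lr * (y * x1 - lam * w[0])
            w[1] += lr * (y * x2 - lam * w[1])
            b += lr * y
        else:
            # Outside the margin: only the regularizer shrinks w.
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for (x1, x2), _ in data]
accuracy = sum(p == y for p, (_, y) in zip(preds, data)) / len(data)
```

The kernel trick does not appear here; it enters through the dual, where inputs occur only inside inner products that a kernel function can replace.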
The document summarizes Google's Bigtable storage system, which provides a structured storage layer for large distributed data sets. Bigtable stores data as a sparse, distributed, multidimensional sorted map. It is built using the Google File System for storage, Chubby for locking, and provides a simple "get/put/delete" interface for accessing rows and columns. Bigtable tables are sharded into tablets, distributed across servers, and data is stored in immutable Sorted String Tables (SSTables).
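The data model itself, a sparse sorted map keyed by (row, column, timestamp) behind a get/put/delete surface, fits in a few lines. This sketch ignores tablets, SSTables, GFS, and Chubby entirely; all names are invented:

```python
class TinyTable:
    """Toy sketch of Bigtable's data model: a sparse, multidimensional
    map from (row, column, timestamp) to an uninterpreted value."""

    def __init__(self):
        self._cells = {}  # (row, column) -> {timestamp: value}

    def put(self, row, column, value, ts):
        self._cells.setdefault((row, column), {})[ts] = value

    def get(self, row, column):
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return versions[max(versions)]  # most recent timestamp wins

    def delete(self, row, column):
        self._cells.pop((row, column), None)

    def scan(self, row_prefix):
        # Rows are kept in sorted order, which is what makes prefix
        # scans cheap in the real system.
        return sorted(k for k in self._cells if k[0].startswith(row_prefix))

t = TinyTable()
t.put("com.example/index", "contents:html", "<html>v1</html>", ts=1)
t.put("com.example/index", "contents:html", "<html>v2</html>", ts=2)
latest = t.get("com.example/index", "contents:html")
```

Keeping multiple timestamped versions per cell is what lets Bigtable serve "the page as of time T" queries and garbage-collect old versions by policy.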