Griffith's picture

Amazon SimpleDB is another service from Amazon that uses its Dynamo technology. With SimpleDB, Amazon has at last added a database to the company's web services. SimpleDB is currently available to the public as a beta service, with several technical limitations, including:
  • A single query will time out after 5 seconds.
  • Strings are the only available data type.
  • Data being queried, written, or retrieved is cast to and from strings.
  • A string value cannot exceed 1,024 characters.
  • An item can have a maximum of 256 attributes.
  • During the open beta, a SimpleDB domain is capped at 10 GB of capacity.

SimpleDB is not an RDBMS (relational database management system); it operates in a far simpler fashion. Amazon SimpleDB stores data as Domain → PKeys, PKeys → Attributes, and, within each attribute, Key → Value. For instance:

Category: Company
Name: Cellopoint
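This hierarchy can be sketched with nested Python dictionaries. This is a simplified in-memory model, not the actual SimpleDB API; the domain name "Directory" and item key "item001" are invented for illustration:

```python
# A simplified in-memory model of SimpleDB's hierarchy:
# domain -> item key -> attribute name -> string value.
domains = {}

def put_attributes(domain, item_key, attributes):
    """Store attributes for an item; SimpleDB keeps everything as strings."""
    items = domains.setdefault(domain, {})
    item = items.setdefault(item_key, {})
    for key, value in attributes.items():
        item[key] = str(value)  # only strings are supported

def get_attributes(domain, item_key):
    """Return the attribute dictionary for one item."""
    return domains.get(domain, {}).get(item_key, {})

put_attributes("Directory", "item001",
               {"Category": "Company", "Name": "Cellopoint"})
print(get_attributes("Directory", "item001"))
# {'Category': 'Company', 'Name': 'Cellopoint'}
```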

Being a database, SimpleDB comes with its own querying API.

Amazing Graphical Scripting Language

Shawn Lin's picture

Sikuli is a visual technology to automate and test graphical user interfaces (GUIs) using images (screenshots). Sikuli includes Sikuli Script, a visual scripting API for Python, and Sikuli IDE, an integrated development environment for writing visual scripts with screenshots easily. Sikuli Script can automate anything you see on the screen, without support from internal APIs. You can programmatically control a web page, a desktop application running on Windows/Linux/Mac OS X, or even an iPhone application running in an emulator.

Sikuli, which reads much like a Japanese name, is in fact an innovative programming technology created by an MIT student from Taiwan, who, together with his colleagues, spent more than three years researching and building it.

Its new concept is to use image recognition to automate many complex sequences of instructions.

As Vgod said: “The most important revolution of Sikuli is its code readability and ease of use. With screenshots placed directly inside the code, people can directly ‘see’ what they want to control, which no one ever thought of before. Previously, only programmers were able to write programs, using mysterious alien languages.”

From the point of view of automation tools, Sikuli is not so unique, but the method it uses is. Programming languages are fairly mature tools, and programmers have grown used to the idea that languages are difficult to use; they even hypnotize themselves with remarks like “.NET has become so much more useful!” or “Wow! Delphi has added a super useful component wrapping the Windows API.”

Before Sikuli, no one had overturned these old concepts by coming up with a new programming language in such a creative way. Sikuli really achieved innovation: screenshots replace objects, so you do not have to know the Windows API libraries to control window components. Although it has not yet developed to the point where one can write stand-alone applications, it can be used as a desktop automation tool, and it points out a new direction to programmers around the world: a new way to design programs.

First, of course, you must download Sikuli and install the Java Runtime Environment (JRE) on your computer. You can then follow the instructions to easily automate your own operations!

Enter Ext JS: The Best of JavaScript Libraries

Paul Chien's picture

A long time ago in a galaxy far, far away (more precisely, early 2006, the planet Earth), a gentleman by the name of Jack Slocum developed a set of extension utilities to the YUI library. These utilities rapidly gained popularity within the YUI community and were quickly organized into an independent library called YUI-Ext. In fall 2006, Jack released the .33 version of this new library under the terms of the Berkeley Software Distribution (BSD) license.

After a while, before 2006 was over in fact, the name of the library was changed to Ext, because it was starting to develop independently of YUI at that point. In fact, support for other libraries was beginning to be developed within Ext.

In 2007, Jack formed a company to further advance the development of Ext, which at some point thereafter began to be known as Ext JS. On April 1, 2007, Ext JS 1.0 was released.

In a short period of time, Ext JS evolved from a handy set of extensions to a popular library into what many people, including yours truly, feel is the most mature JavaScript UI development library available today.

Ext JS is focused on allowing you to create great user interfaces in a web app. In fact, it is best known for its top-notch collection of UI widgets. It allows you to create web apps that mimic the look and feel of desktop native applications, and it also allows you to mimic most of the functionality those applications provide. In short, Ext JS enables you to build true RIAs.

It’s not all about widgets, though, as we’ll see. There are a lot of useful utility-type functions in Ext JS as well. Need some Ajax functionality? Check. Need some string manipulation functions? It’s got those too. Need a data abstraction layer? Ext JS has you covered.

[1] Frank W. Zammetti (2009). Practical Ext JS Projects with Gears

A Search Engine for Your Personal Cloud

June Huang's picture

Accessing the myriad of information on the Web has been made possible by web search engines. Nowadays, cloud technology is changing the way people interact with the Web, for example through social networking and data storage. As our personal cloud grows, keeping track of what goes on and where things happen becomes a relevant issue. Consider all the information that you and your social networks create in one day. E-mails, calendar events and conversations are all examples of your social streams, and remembering everything that happens is impossible. How can we effectively find things in our own cloud? The answer is: a search engine for your personal cloud.

Greplin and Introspectr are two such services that allow users to filter through their personal data. They offer indexing of your social-networking services like Facebook and Twitter, mailboxes like Gmail (including attachments and links), and even file-sharing services such as Dropbox and Google Docs. To use them, simply type in your query and they will return all the occurrences of your search terms, regardless of which stream they appeared in.

The main difference between Greplin and Introspectr is that Greplin offers near-real-time indexing, updating approximately every 20 minutes; with Introspectr, you have to update the index manually. There is also a known issue where Greplin does not index the contents of external URLs in tweets, whereas Introspectr does [2]. Both Greplin and Introspectr allow you to index a variety of services; however, log-in information is required for each service you want indexed, so safety and privacy obviously become a concern. Greplin states that it uses OAuth to retrieve only the data, and that it does not have access to your log-in information [3].

Greplin and Introspectr offer a convenient and centralized way for users to filter through the contents of their cloud. Their services can be accessed on almost any device with an Internet connection, and searching through social feeds becomes just as easy as an e-mail or hard-drive search.

[1] Arrington, M (Aug 31, 2010). The Other Half Of Search: Greplin Is A Personal Search Engine For Your Online Life. Retrieved on October 26, 2010, from
[2] Schonfeld, E. (Oct 12, 2010). Introspectr Searches Your Social Streams. Retrieved on October 26, 2010, from
[3] Greplin:
[4] Introspectr:

Google Omaha

Griffith's picture

Google client products are capable of updating themselves to a newer version without end-user intervention. This is known as “auto-updating.”

Most Google client products possess the auto-update feature, but with different implementations. Some have their own auto-update solutions, while others utilize common auto-updater code and a common auto-updater server.

Google sought to minimize code duplication across client products and to avoid maintaining multiple servers that perform essentially identical functions. As a result, it considered unifying all client products under a single auto-update solution. This decision was also influenced by the evolution of Microsoft Windows, which made this unification almost mandatory. Windows Vista has a strict security model that restricts the ability of most applications to perform system-changing activities, including modifying the Windows registry, writing to the Program Files directory, and, in some cases, writing any persistent change to the system at all. Updating an installed program requires all of these modifications, which would cause traditional auto-updating to fail on Windows Vista.

The lack of shared code brought another issue: each auto-update implementation had its own subset of the desired features. Many of them lacked multiple update tracks or a consistent versioning mechanism. Code unification allowed Google to deploy a rich set of auto-update capabilities to all its client applications.

As the number of Google applications increased, improving the overall install experience became increasingly desirable. Traditionally, the browser would prompt the end user with a series of technical and confusing dialogs that encouraged the user to abandon the installation. The user was then led through a wizard filled with choices that they did not need, or did not know how to decide among. The result was a bad user experience during product installation.

In order to meet all the requirements and challenges mentioned above, Google developed a shared client infrastructure that handles all installation and auto-updating tasks for Google Windows client products. This client communicates with a single Google auto-update server. The server and client together are named Google Update, and the project is known as Omaha.

Server and Desktop Virtualization

Ricky Wu's picture

In computing, virtualization is a broad term that refers to the abstraction of computer resources. It hides the physical characteristics of computing resources from their users, be they applications or end users. This includes making a single physical resource appear to function as multiple virtual resources; it can also include making multiple physical resources appear as a single virtual resource.
In recent discussions, virtualization has been applied to a number of concepts, including:

  • Server Virtualization
  • Desktop Virtualization
  • Network Virtualization
  • Storage Virtualization
  • Application Infrastructure Virtualization

In this article, we’ll focus on server and desktop virtualization.



RabbitMQ is a message queue server that implements the Advanced Message Queuing Protocol (AMQP). It is written in Erlang, and it provides C, Python, PHP, and Java client APIs. Its features include:
• Persistent messages
• Transactions
• Virtual hosts
• Clustering

Message Producer:
It creates a message with a routing key and sends it to an exchange. The routing key is used to determine which message queue the message should be routed to.

Exchange:
It accepts messages and routes them to message queues. A binding defines the relation between an exchange and a message queue.
There are three types of exchanges: direct, fanout, and topic.
Message Queue:
It holds messages and delivers them to message consumers.
Message Consumer:
It does the actual work according to the incoming message.
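The producer → exchange → binding → queue flow can be sketched in Python. This is an in-memory model of a direct exchange for illustration only; the class, queue names, and messages are invented and this is not an AMQP client API:

```python
# In-memory model of a direct exchange: bindings map a routing key
# to one or more queues, and publishing routes by exact key match.
class DirectExchange:
    def __init__(self):
        self.bindings = {}  # routing key -> list of bound queues

    def bind(self, queue, routing_key):
        self.bindings.setdefault(routing_key, []).append(queue)

    def publish(self, routing_key, message):
        # deliver to every queue bound with this routing key
        for queue in self.bindings.get(routing_key, []):
            queue.append(message)

mail_queue, log_queue = [], []
exchange = DirectExchange()
exchange.bind(mail_queue, "mail")
exchange.bind(log_queue, "log")

exchange.publish("mail", "welcome message")
exchange.publish("log", "user logged in")
# mail_queue == ['welcome message'], log_queue == ['user logged in']
```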


Intro to MongoDB

Ricky Wu's picture

NoSQL databases differ from common relational databases in that they do not provide a SQL interface to manipulate data. NoSQL databases can be separated into three categories: column-oriented, key-value, and document-oriented databases. The database we are going to introduce belongs to the document-oriented category.
Unlike most relational databases, document-oriented databases do not store data in tables with fixed-size fields for each record. Instead, each record is stored as a document: the number of fields can vary, fields of any length can be added to a document, and fields can also be split into pieces and nested to form hierarchical structures.

$person = array(
    "name" => "Cesar Rodas",
    "country" => "Paraguay",
    "languages" => array("Spanish", "English", "Guarani"),
);
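To make the flexible-schema idea concrete, here is a sketch in Python using plain dictionaries. This is an illustrative model only; the second document and its field names are invented, and this is not the pymongo API:

```python
# Documents in one collection need not share the same fields,
# and a field may hold a nested structure.
people = []  # stands in for a MongoDB collection

people.append({
    "name": "Cesar Rodas",
    "country": "Paraguay",
    "languages": ["Spanish", "English", "Guarani"],
})
people.append({
    "name": "Ana",
    "contact": {"email": "ana@example.com"},  # nested document, different fields
})

# Each document keeps only the fields it actually has.
print("languages" in people[0], "languages" in people[1])
# True False
```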

MongoDB is a scalable, high-performance, document-oriented database written in C++. It is open-source software and free to use. The goal of MongoDB is to combine the advantages of key-value stores and traditional RDBMS systems: it is fast and highly scalable like a key-value store, while offering the rich queries and deep functionality of an RDBMS.
MongoDB has the following interesting implementation characteristics:

  • It uses JSON, instead of XML
  • It is fast, as it is written in C++
  • Supports index definitions
  • Provides an easy to use query interface, very similar to some database abstraction layers
  • Supports operations with sub-documents
  • Provides a native PHP extension
  • Supports auto-sharding
  • Supports map-reduce for data transformation

Installation and configuration: MongoDB support is available in many mainstream programming languages, such as PHP, Python, Ruby, Java, and C++, so it is easy and convenient to use. MongoDB provides pre-compiled distributions for most platforms, packages are available in various package managers, and the source code is also available for advanced users to download.
Installing MongoDB is simple. First, go to the MongoDB official website and download the appropriate package for your platform. Then:

mkdir -p /data/db
tar -xvzf PACKAGE
./mongodb-xxxxxxx/bin/mongod

After the installation completes, launch the MongoDB daemon, which listens on port 27017. The default database path is /data/db (c:\data\db on Windows). Now we can manipulate the database through mongo, the command-line shell client for MongoDB.
Developing scalable PHP applications using MongoDB - PHP Classes blog

Smartphone Security - Risks and Preventions

June Huang's picture

Mobile phones are rapidly evolving and becoming capable of performing tasks that were once predominantly accomplished on computers. These days, smartphones are powerful enough to accomplish your on-the-go needs and could easily replace your netbooks, music players and handheld game systems. However, their increased functionality and extensive features are also the reasons why smartphones are just as prone to attacks as any personal computer.

The most common risk is losing your phone. What happens to all your personal information when you lose your phone? The person who picks it up has access to all your contacts, text messages and e-mails, and can even tell which banks you use by looking at which banking applications are installed on your phone. Never store sensitive information, such as passwords, on your phone, and when you set a password, use a strong one that cannot be guessed.

Be cautious when clicking links while browsing the web, and when opening text messages and e-mails (including their attachments) from unknown senders. We know not to trust such links and attachments on the computer, and the same rules apply to smartphones. Untrustworthy links may take you to malicious websites that can retrieve information from your phone, and the attachments may contain viruses that can spread to the people in your contact list.

Applications, even though they add to the functionality and usability of smartphones, also present a security threat. As applications are developed by third parties, containment and control of deceitful applications remain an issue. Although Google and Apple make an attempt to remove malware from their application stores, detection is difficult and action is usually taken after the harm has been done. Thus, users should pay attention to who the developers are and to which services the applications request access. In one instance, a number of Android users fell victim to an SMS trojan disguised as a media player. These users were unaware that a media player had no reason to request SMS features, installed the trojan, and racked up expensive telephone bills. Jailbreaking phones, as seen with jailbroken iPhones, can also create security holes for malware to attack.

An interesting experiment by a group of researchers from the University of Pennsylvania showed that smartphone passcodes can be identified from the smudges on the touchscreen. In the experiment, 92% of the smudge patterns were partially identifiable, and as many as 68% were fully identifiable. So, given oily smudges, you may want to wipe your screen after use.

One thing is for sure: just as you install antivirus and firewall programs when you purchase a new computer or reinstall your operating system, you should do the same for your smartphone. The application stores offer applications, both free and paid, that provide these services.

[1] Mills, E. (January 5, 2010). Using your smartphone safely (FAQ). Retrieved on August 30, 2010, from
[2] Constantin, L (August 10, 2010). Premium SMS Trojan Targets Android Users. Retrieved on August 30, 2010, from
[3] Bradley, T (August 11, 2010). Smartphone Security Thwarted by Fingerprint Smudges. Retrieved on August 30, 2010, from


Griffith's picture

jQTouch is a jQuery plugin with native animations, automatic navigation, and themes for mobile WebKit browsers on devices such as the iPhone, Nexus One, and Palm Pre. jQuery is a fast and concise JavaScript library that simplifies HTML document traversing, event handling, animating, and Ajax interactions for rapid web development. The combination of jQuery and jQTouch allows any web application developer with a little experience in jQuery to build mobile applications.

As an example, consider a simple Home panel. This is the HTML for the Home panel:

<link type="text/css" rel="stylesheet" media="screen" href="jqtouch/jqtouch.css">
<link type="text/css" rel="stylesheet" media="screen" href="themes/jqt/theme.css">
<script type="text/javascript" src="jqtouch/jquery.js"></script>
<script type="text/javascript" src="jqtouch/jqtouch.js"></script>
<script type="text/javascript">
var jQT = $.jQTouch({
    icon: 'todo.png',
    statusBar: 'black'
});
</script>
<div id="home">
    <div class="toolbar"></div>
    <ul class="edgetoedge">
        <li class="arrow"><a href="#about">About To-Do</a></li>
    </ul>
</div>
<div id="about">
    <div class="toolbar">
        <h1>About To-Do</h1>
        <a class="back" href="#">Back</a>
    </div>
</div>

The HTML is simple: its body is composed of two divs. The integration with jQuery and jQTouch lives in the HTML head. An analysis of the document head:

• jqtouch.css is a required file, defining structural design rules specific to mobile devices, including animation and orientation handling.
• theme.css is a CSS theme included with jQTouch.
• jquery.js is the core file of the jQuery framework. jQTouch requires jQuery and ships with its own copy of it.
• jqtouch.js is the core file of jQTouch.
• the variable jQT holds the initialized jQTouch object. Two properties are defined for the Home panel: icon and statusBar.

jQTouch has several properties that allow users to customize the behavior and look of their apps. In this case, icon indicates the custom Web Clip icon, and statusBar controls the color of the strip at the top of the app in full-screen mode.

All in all, jQTouch is a framework that helps developers add native-looking animations to a web app.

Gearman Distributed Computing Framework

Today many web services need to do complex, computationally expensive work. As a result, we need some kind of job dispatcher to distribute jobs for us. Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load-balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates.

Here are some of Gearman's features:

  • load balancing
  • client APIs: C, PHP, Python, Perl
  • no broadcast feature
  • automatic task dispatch
  • persistent queues (SQLite, MySQL)
  • client
    • blocking tasks
    • non-blocking tasks (concurrent multi-task), with a callback function on completion or failure
  • worker
    • blocking task handling

In the following picture, you can see how Gearman works. Your application uses the Gearman client API to send a task to a Gearman job server. The job server finds which worker has a low workload and sends the task to that worker through the Gearman worker API. All communication goes through the Gearman API, so you do not need to manage any TCP connections yourself. The client node, job server, and worker nodes can all be on different machines.
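The dispatch step can be sketched in Python. This is a toy in-process model of a job server choosing the least-loaded worker; the worker names and tasks are invented, and this is not the real Gearman protocol or client API:

```python
# Toy job server: route each task to the worker with the lowest
# current workload, as the Gearman job server does conceptually.
class JobServer:
    def __init__(self, workers):
        # workers: worker name -> list of assigned tasks
        self.workers = {name: [] for name in workers}

    def submit(self, task):
        # pick the worker with the fewest queued tasks
        name = min(self.workers, key=lambda w: len(self.workers[w]))
        self.workers[name].append(task)
        return name

server = JobServer(["worker-a", "worker-b"])
server.submit("resize image")  # both empty, so the first worker gets it
server.submit("send e-mail")   # worker-b now has the lower load
print(server.workers)
# {'worker-a': ['resize image'], 'worker-b': ['send e-mail']}
```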

Job server failover:
Workers and clients connect to all job servers (it is common to set up two or three). If one job server fails, another one remains and the client can still work.

There is no monitoring tool yet; the cluster monitoring tool is still under development.

Tokyo Tyrant - part 1. (Backup/Restore)

Tokyo Tyrant is a high concurrency network interface to the underlying database (Tokyo Cabinet).

It has been widely adopted in many web-related applications as backend data store.

This post will introduce Tokyo Tyrant's backup and restore capabilities.


figure 1.

The following example shows how to hot backup a ttserver, and recover a ttserver using the backup database as illustrated in figure 1.

 - first, start a ttserver.

ttserver /tmp/ex1.tch

 - on another terminal, write some data into it.

tcrmgr put localhost "foo" "bar"

 - check if we have a record (key:"foo", value:"bar").

tcrmgr get localhost "foo"

 - backup the database.

tcrmgr copy localhost /tmp/backup.tch

 - let's simulate a ttserver crash by pressing Ctrl-C in the first terminal.

 - and suppose the database file is corrupted.

rm /tmp/ex1.tch

 - to recover the ttserver from previous backup, simply copy the database file and restart ttserver.

cp /tmp/backup.tch /tmp/ex1.tch

ttserver /tmp/ex1.tch

 - check if the data has been recovered

tcrmgr get localhost "foo"

Data loss can be prevented by scheduling periodic backups (once a day or every few minutes, depending on how frequent the write operations are). But we would still lose all the modifications made between the last backup and the crash. Tokyo Tyrant also provides an approach to restore those modifications.


figure 2.

The following example shows how to restore the modifications from update log as illustrated in figure 2.

 - create a directory to hold update log.

mkdir /tmp/ulog

 - start a ttserver with update log enabled.

ttserver -ulog /tmp/ulog /tmp/ex2.tch

 - on another terminal, put some data into it.

tcrmgr put localhost "foo2" "bar2"

 - check if we have a record (key:"foo2", value:"bar2")

tcrmgr get localhost "foo2"

 - let's simulate a ttserver crash by pressing Ctrl-C in the first terminal.

 - and suppose the database file is corrupted.

rm /tmp/ex2.tch

 - instead of recovering the ttserver from backup, we can restore the database by replaying the update log.

 - backup the update log if needed.

mv /tmp/ulog /tmp/ulog-back

 - restart ttserver.

ttserver /tmp/ex2.tch

 - restore from update log.

tcrmgr restore localhost /tmp/ulog-back

 - check if the data has been recovered.

tcrmgr get localhost "foo2"

Cloud-Based Gaming with OnLive

June Huang's picture

With OnLive's on-demand gaming service, PC gaming has gotten cheaper and easier. There is no need for a high-end computer with a fancy graphics card and fast CPU to play the latest video games. A low-end computer running Windows XP or Windows Vista, or an Intel-based Mac running OS X, together with a decent Internet connection, is enough to get started. Users can also access OnLive on their televisions with the OnLive MicroConsole. OnLive enables users to play or rent games, try out game demos and play multi-player games with other users of the service. There are also community features such as spectating live games, recording and sharing gameplay videos, and accessing gamer profiles.

Essentially, the user just needs to know how to use a browser. Game data and interactions are sent from the browser to the OnLive servers for processing. Once all the data has been computed, a compressed video stream is sent back to the user's browser and the user continues to play the game. To the user, the gameplay is real-time and will feel no different than playing with a local copy of that game. It is convenient for the user, because OnLive eliminates installing and updating the games and the need for local storage space.

The OnLive service achieves this instant access with a combination of remote servers, dedicated or shared, that produce continuous gameplay for users. Game data is stored and processed on these servers, and their hardware is upgraded every six months to provide users with optimal processing power. Each server has a particular task, like handling the user interface, running the games or streaming video. There are also several classes of servers, depending on the requirements of the computations and the number of connections. Thus, during a session, a user is passed among several servers depending on their state of play and processing requirements.

With all the data transmission happening in the background, it is obvious that the OnLive service is dependent on, and limited by, the user's broadband connection and region. OnLive claims that high-definition quality is achievable, with video at up to 1280x720 resolution and a frame rate of up to 60 frames per second, on a connection of at least 5 Mbps. However, the slower the connection, the lower the resolution and frame rate. With a 1.5 Mbps connection, standard-definition quality is obtainable, but it may be insufficient for a real-time, action-packed game because the video feedback may not be as smooth as playing on a local machine. Also, with the compression of the video, some of the artistic detail in the scenes is lost.


Griffith's picture

Node.js is an evented I/O framework built on top of Google’s V8 JavaScript engine; its design is influenced by systems like Ruby’s Event Machine or Python’s Twisted. Node’s goal is to provide an easy way to build high performance, real-time and scalable web applications.

JavaScript has traditionally run only in the web browser. In recent years, projects such as CommonJS, Jaxer and Narwhal reflect the considerable interest in bringing JavaScript to the server side as well. In contrast to concurrency models that employ OS threads, Node is event-based rather than thread-based. Thread-based models often have the disadvantage of not scaling well to the many long-lived connections necessary in real-time applications, becoming relatively inefficient and complex.
Node takes an alternative approach: it tells the OS that it should be notified when a new connection is made, and then it goes to sleep. When a new connection arrives, the callback is executed; each connection costs only a small heap allocation. This results in much better memory efficiency under high load than systems that allocate a 2 MB thread stack for each connection. Furthermore, Node is free of locks: almost no function in Node directly performs I/O, so the process never blocks, and programmers do not need to worry about deadlocking the process.

Node’s advantage comes from the fact that most thread-based models spend the majority of their time waiting for I/O operations, which are much slower than memory operations. Node’s I/O operations are asynchronous, which means it can continue to process incoming requests while an I/O operation is taking place.

The following is an example of a web server written in Node which responds with “Hello World” for every request.

var sys = require('sys'), http = require('http');

http.createServer(function(request, response) {
    response.writeHead(200, {'Content-Type': 'text/plain'});
    response.end('Hello World\n');
}).listen(8000);

sys.puts('Server running at');

This simple script imports the sys and http modules and creates an HTTP server. The anonymous function passed to http.createServer will be called every time a request is made to the server.

Node is a very exciting technology built on top of another powerful technology, V8. It has gathered a lot of attention within the technology community, and with its great module system, there are many third party modules available for just about everything.

Opera unite: Your Browser is Now a Web Server

Ricky Wu's picture

The new Opera Unite technology, released with the new version of the Opera browser, blurs the boundary between client and server. You can have your own dedicated web server after a few simple setup steps. With Opera Unite you no longer need to find web hosting: you can share your files, documents, videos and pictures with anyone who is permitted to access your web host. Of course, you have to keep Opera Unite running and your computer awake to keep these services available.

Opera Unite lets your PC act as both a client and a server. Unlike a traditional web server installation, it simplifies the setup steps and makes configuring your own server more convenient and easy, for example by removing the port-forwarding configuration needed in traditional network setups. It is also cross-platform, and its structure, based on an open network architecture, reduces the complexity of developing web services.

Opera Unite comes with six basic services: File Sharing, Fridge, Media Player, Photo Sharing, The Lounge, and Web Server. The File Sharing service allows you to share any type of file with friends, no matter how big it is, and you can define your own access rules to make file sharing more private. You can also browse your music and play it directly through the Unite Media Player anywhere you want; the only requirement is an Internet connection. The Lounge creates a chat room that you can host on your own computer, providing a private and secure platform for passing instant messages without installing any instant-messaging applications, and Fridge lets friends and family leave virtual sticky notes.

Looking ahead, Opera Unite provides an open-architecture platform that breaks the old client-server rules. Its ease of use may offer an alternative to the peer-to-peer model for online users, and it may even take the place of today's centralized communities as the network develops. In this architecture, users manage their private information on their own hosts, so personal information is shared with others in a safer and more reliable way. For developers, this new technique lowers the system requirements for building a development environment and accelerates development cycles for network services. Besides, the open network architecture is flexible enough to support a diversity of services. For example, if online games such as 'Happy Farm' were rebuilt without a centralized model, they could become smaller and grow in various ways.


Read White Web: Your Browser is Now a Web Server: Opera Includes Opera Unite in Opera 10.10

Cassandra Introduction -- data model


As more and more data is inserted into and queried from a database, we may face the situation where we need to scale out the architecture by adding new machines to handle the amount of data. However, with a traditional MySQL database, adding a new machine takes a lot of work (i.e. sharding, where we partition the data across different machines). And sometimes only key-value queries are needed, rather than JOIN operations. We can't help but wonder whether there is an alternative solution for database scalability. Searching the internet, we find that many distributed key-value databases have been developed for this situation. Among these database systems, Cassandra is a Java-based distributed key-value database created by Facebook. Unlike MySQL, which offers the JOIN operation, Cassandra is good at dealing with distributed data. You may view the whole cluster as a big hash table, with all fault tolerance and data partitioning handled for you. It provides "incremental scalability" (which means you can increase throughput by adding new nodes). Cassandra also supports a "column" feature, which makes it more convenient than plain key-value database systems.

Let me show you the key elements of Cassandra:

Basic key-value database:

Table['key1'] = value1

With Column feature:

Table['Key1']['column family1']['Column1'] = Value1

Data Model:

In Cassandra, the data model can be thought of as a four- or five-dimensional hash table. From top to bottom, the hierarchy is: Keyspace → Column Family → Row Key → (Super Column →) Column.

So the query will look like this:

get <ksp>.<cf>['<key>']['<col>']                             Get a column value.
get <ksp>.<cf>['<key>']['<super>']['<col>']              Get a sub column value.
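The queries above can be sketched as plain Python nested dictionaries. This is an illustrative model only, not real Cassandra code; all keyspace, column family, and key names below are made up:

```python
# Standard column family: four levels deep
# keyspace -> column family -> row key -> column name -> value
keyspace = {
    "Blog": {                       # keyspace
        "Posts": {                  # column family
            "post-1": {             # row key
                "title": "Hello",   # column name -> value
                "body": "First post",
            }
        }
    }
}

def get(ksp, cf, key, col):
    """Mimics: get <ksp>.<cf>['<key>']['<col>']"""
    return keyspace[ksp][cf][key][col]

# Super column family adds a fifth level between row key and column:
super_keyspace = {
    "Blog": {
        "Comments": {
            "post-1": {
                "comment-1": {          # super column
                    "author": "alice",  # sub column -> value
                    "text": "Nice!",
                }
            }
        }
    }
}

def get_sub(ksp, cf, key, sup, col):
    """Mimics: get <ksp>.<cf>['<key>']['<super>']['<col>']"""
    return super_keyspace[ksp][cf][key][sup][col]

print(get("Blog", "Posts", "post-1", "title"))                       # -> Hello
print(get_sub("Blog", "Comments", "post-1", "comment-1", "author"))  # -> alice
```

The point of the sketch is that a query is just a chain of dictionary lookups, one per level of the hierarchy.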

Key Space:

    In Cassandra, you can define many Key Spaces. You can think of a Key Space as roughly analogous to a database in MySQL. It contains a {Row, [ColumnFamily]} list. Normally there is one Key Space per application.


Row:

    Given a row key, you can retrieve data from the related Column Families. The data in each Column Family is sorted by row key order, and a row key does not have to contain data in every column family.

Column Family:

    A Column Family contains a list of Columns or a list of Super Columns. Column Families must be defined in the configuration before Cassandra starts, and each Column Family is stored in a separate file. The number of columns in a column family is unlimited.


Column:

    A Column is the smallest element of data; it contains only a name, a value, and a timestamp. You can add or delete columns at any time.

Super Column:

    A Super Column is a container that holds Columns.


Key distribution:

Cassandra uses consistent hashing for key distribution and partitioning. Each node in a Cassandra cluster takes a token (0 < token < 2^32) on the ring, whose size is 2^32. When a key arrives, Cassandra computes the MD5 hash of the key and finds the smallest token that is larger than the hash value. The key is then mapped to the node that owns that token, so the data is stored on the corresponding node.

In the example below, the key is inserted into node 2.
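The token lookup can be sketched in Python. This is a simplified model with made-up node names and token values, not Cassandra's actual implementation:

```python
import hashlib

RING_SIZE = 2 ** 32  # ring size used in the article's description

# Hypothetical three-node cluster: node name -> token on the ring
tokens = {
    "node1": 1_000_000_000,
    "node2": 2_500_000_000,
    "node3": 3_800_000_000,
}

def position(key):
    """Map a key to a position on the ring via its MD5 hash."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def owner(key):
    """Find the node whose token is the smallest one larger than the
    key's ring position, wrapping around to the first node if needed."""
    pos = position(key)
    nodes_by_token = sorted(tokens.items(), key=lambda kv: kv[1])
    for node, token in nodes_by_token:
        if token >= pos:
            return node
    return nodes_by_token[0][0]  # wrap around the ring

print(owner("user:42"))  # the same key always maps to the same node
```

Because the mapping depends only on the key's hash and the fixed tokens, every node in the cluster can compute the same answer independently.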

Replication method:

If you want to store two replicas of each piece of data in a Cassandra cluster, the data will also be stored on the next two nodes along the ring.
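Choosing "the next nodes along the ring" can be sketched like this (a hypothetical three-node ring with made-up tokens; this is an illustration of the idea, not Cassandra's actual replication strategy code):

```python
# Nodes sorted by their token position on the ring (hypothetical values)
ring = [("node1", 1_000_000_000), ("node2", 2_500_000_000), ("node3", 3_800_000_000)]

def replicas(primary_index, count):
    """Return the primary node plus the next `count` nodes clockwise,
    wrapping around the end of the ring."""
    return [ring[(primary_index + i) % len(ring)][0] for i in range(count + 1)]

print(replicas(1, 2))  # -> ['node2', 'node3', 'node1']
```

With the primary at node 2 and two extra replicas, the data lands on node 2, node 3, and (wrapping around) node 1.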

Adding a new node:

With consistent hashing, adding a new node affects only its neighboring nodes, so we do not need to rehash all the data. In this case, some data previously stored on node 1 will now be stored on the new node 4. The new node chooses a token at random and finds its corresponding location on the ring according to the MD5 hash.


Diaspora - the privacy aware, personally controlled, do-it-all distributed open source social network

This newly announced project was featured in the New York Times on May 12 in "Four Nerds and a Cry to Arms Against Facebook". The first line of the article asks: "How angry is the world at Facebook for devouring every morsel of personal information we are willing to feed it?"

Almost all of today's social network services, such as Facebook, Twitter and Orkut, are centralized. We fill out personal information to register as users and hand over messages via their servers to communicate with our friends. In the meantime, what we are giving up is our privacy. That increases the risk of data leakage, and we have to be more cautious about what we post on these social networks.

A few months back, four geeky college students at NYU (Dan Grippi, Max Salzberg, Raphael Sofaer and Ilya Zhitomirskiy) decided to build a social network that wouldn't force people to surrender their privacy to a big business in exchange for convenient access. They have called their project Diaspora and intend to distribute the software free, and to make the code openly available so that other programmers can build on it.

The Diaspora group was inspired to begin their project after hearing a talk on internet privacy by Eben Moglen, a law professor at Columbia University. As more and more of our lives and identities become digitized, Moglen explains, the convenience of putting all of our information in the hands of companies in “the cloud” is training us to casually sacrifice our privacy and fragment our online identities. Why is there no good alternative to centralized services that, as Moglen pointed out, come with “spying for free”?

“When you give up that data, you’re giving it up forever.”

“In our real lives, we talk to each other. We don’t need to hand our messages to a hub.”

“Our real social lives do not have central managers, and our virtual lives do not need them,” says the Diaspora group.

The project is described as a "network that allows everyone to install their own “seed” — i.e. a personal web server with a user’s photos, videos and everything else — within the larger network. That seed would be fully owned and controlled by the user, so the user could share anything and still maintain ownership over it".

They estimated it would take three or four months to write the code, and that they would need a few thousand dollars each to live on. They gave themselves 39 days to raise $10,000 on Kickstarter, an online site that helps creative people find support. They announced the project on April 24 and reached their $10,000 goal in 12 days, and the money continues to come in: as of today (May 24), they had raised over $180,000.

Not bad for a financial start to turning an envisioned network into reality. It is far too soon to tell whether Diaspora will replace Facebook and become the next top social networking site, but given the ripe timing and tremendous amount of support, it might just have a shot.

Cross-platform C++ libraries for system and network programming

Angus Liu's picture

For years, C++ users have complained about the lack of libraries for building system and networking applications. Compared to other object-oriented languages such as Java and C#, which enjoy abundant built-in classes and functions, C++ is somewhat awkward: programmers need to write code from scratch using native system APIs, or look for existing solutions from software vendors. Building everything from the ground up can be painful without a firm grasp of OS APIs, and the resulting code is not portable either. With that in mind, some intelligent programmers have written and shared libraries that address the issue.

Adaptive Communication Environment (ACE)

ACE has been around for quite a long time. It was first developed by Douglas C. Schmidt during his graduate work at the University of California, Irvine. In my experience, the source code is rather old-style C++ with lots of macros inside. With plenty of classes and modern design patterns incorporated, it is considered complex, with a long learning curve to master. But thanks to its long history, ACE supports most operating systems, even those you have never heard of.

Poco C++ Libraries

You may think of Poco as a revised version of ACE with a much cleaner codebase. To a certain extent, it covers what ACE covers, plus a number of de facto standard C/C++ libraries such as PCRE and zlib. Poco is well documented and its source code is quite self-explanatory. I would recommend it to beginners who want to try out such a library.

Boost C++ Libraries

Boost is another great set of libraries you should never miss. It has more useful utility libraries than you would expect, and you will be amazed by the power of the C++ templates used in them. Recent versions also include a network library called Asio, which is worth exploring. However, when it comes to debugging, template-based code may not be a pleasure to trace. But if you are a C++ geek, you will love it!

Cellopoint Cloud Series 1: the Past, Present, and the Future of Cloud Computing

What exactly is Cloud Computing, currently the hottest topic in IT? How does it differ from Grid Computing? This article takes you through the origins, concepts, and related applications of Cloud Computing. You may have heard another term, Grid Computing, before Cloud Computing was stirred up. Many people consider Grid Computing and Cloud Computing very much alike. In fact, there is no strict division between the two concepts; both are derived from Distributed Computing.

Grid Computing vs. Cloud Computing
Grid Computing:

It builds a virtual computing cluster from the unused resources (CPU cycles and disk storage) of a large number of heterogeneous computers (usually desktops), providing a structure for solving massive computing problems. Grid Computing focuses on cross-domain computing support. With parallel computing applied, it emphasizes making full use of resources within and across companies to jointly solve tough computing tasks.

Cloud Computing:
It is a kind of dynamically scalable computing. The basic concept is to divide a computing task into several processes. After these processes are handled and analyzed by server clusters (cloud hosts) distributed over the Internet, the results are returned to the end users. Although Cloud Computing originates from Parallel Computing, it is not far removed from the concepts of Grid Computing; it simply focuses more on data processing.

Mainstream Cloud Technologies:

MapReduce:
It is the key technology that Google applies to Cloud Computing, and it allows developers to write programs that process massive amounts of data. First, the Map program divides the data into unrelated segments that a large number of computers process in parallel. The results are then gathered and integrated by the Reduce program, which outputs the outcome the developer requires.
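The Map, gather, and Reduce steps above can be illustrated with a single-process word-count sketch in Python. This is illustrative only; a real MapReduce framework distributes these phases across many machines:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: split one document into independent (word, 1) pairs."""
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate pairs by key, as the framework does
    between the Map and Reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: gather and integrate the per-word counts."""
    return {word: sum(counts) for word, counts in groups.items()}

documents = ["the cloud", "the grid and the cloud"]
pairs = chain.from_iterable(map_phase(d) for d in documents)
result = reduce_phase(shuffle(pairs))
print(result["the"])  # -> 3
```

Each document can be mapped on a different machine because the (word, 1) pairs are independent; only the shuffle and reduce steps bring related results back together.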

Hadoop is an open-source project inspired by Google's cloud architecture. It implements the concepts proposed in the Google BigTable and Google File System papers. Written in Java, it provides a distributed computing environment for massive data, although the distributed file system it uses differs from Google's. Yahoo is the main contributor to and user of the project.

Service Patterns of Cloud Computing
Cloud Computing applications usually provide clients with information technology (computation, storage, and bandwidth) over the Internet in the virtual form of "services". Through Cloud Computing, users simply treat the services as black boxes: they submit the required input without needing to know what happens inside the box, and wait for the results to be returned.
Three patterns based on service categories:

1. Software as a Service (SaaS)
SaaS is a pattern of acquiring software deployment through the Internet. It provides companies with software on demand, from front-end office applications such as email and word processing to back-end data analysis, customer relationship management, business process management, and human resource management. Representative providers include Google, Salesforce, and Microsoft.

2. Platform as a Service (PaaS)
PaaS combines a server hosting platform with a virtualization solution. Users do not have to set up hardware hosts and operating systems themselves: PaaS providers deliver a virtual hosting platform over the Internet, saving hardware and software maintenance as well as labor and time. With PaaS, software providers can focus on software development and accelerate deploying new features online. Well-known providers include Amazon Web Services and Google App Engine.

3. Infrastructure as a Service (IaaS)
IaaS turns IT infrastructure into a service: a company outsources the infrastructure it requires to IaaS contractors. Compared with the cost of ordering hardware, software, storage, power, and bandwidth to build a traditional server room, the company can acquire IT resources more efficiently by paying per use. The concepts of the Private Cloud and the Hybrid Cloud are extensions of IaaS. A Private Cloud brings exterior resources inside the company through a VPN; a Hybrid Cloud, which integrates cloud services from different providers more flexibly, combines the Public Cloud/SaaS with the Private Cloud. Sensitive data is served by the Private Cloud, while non-confidential data is served by the lower-cost Public Cloud.

More and more suppliers are investing in cloud services, which means the cloud service market has become the trend for the future. The rise of this market means a company can lower the construction costs of its information services and focus on its core operations to improve efficiency and competitiveness. However, cloud services also bring many problems, such as security concerns, whether the service level is sufficient for a company's daily operational requests, compatibility with existing systems, etc. Given these cloud security problems, the next article will take you through the new technologies and applications developed by information security providers.

CelloCloud™ protects you from H1N1 spam

Hackers usually exploit the most popular topics people are talking about to send spam. With H1N1 spreading globally, H1N1-related spam is everywhere. The CelloCloud™ Threat Sensor System has already found many cases of spam using H1N1 as a topic to attack personal computers. The attackers use attractive subjects such as "Madonna caught swine flu!" or "Swine flu in USA" to lure recipients into clicking a website or downloading a Trojan that steals personal information, or they combine the lure with Flash exploits to attack unguarded computers.

Cellopoint wants to remind everyone:

  1. Be wary of suspicious email. Do not open it or click the links inside, and never give away personal information such as bank account numbers or passwords in email. No legitimate company would request this kind of information from its users.
  2. Do not reply to spam, as it lets spammers know your email address is valid, and they will send you more. The unsubscribe links contained in much spam produce the same result. The best way to deal with spam is to delete it without reading it.
  3. Watch out for social-engineering traps. Hackers have become more sophisticated and often trick individuals into enabling malicious code attacks (spear phishing).
  4. Do not forward chain letters. This special kind of email may be created by hackers to collect email addresses for producing spam.

The CelloCloud™ Threat Sensor System released an anti-spam database update as soon as it discovered the swine-flu spam threat, to protect its global clients. CelloCloud™ provides "Global threat protection" and "Online update protection" functions, covering anti-spam, anti-virus, anti-spyware, anti-phishing, anti-relay, DoS attacks, hacker threats, etc. The CelloCloud™ Threat Sensor System acts like a safety cloud that shields our customers from threats and achieves our goal of "Cloud Security for Email".

Free email accounts are an accessory for hackers attacking job hunters

Due to the economic recession, more and more unemployed people are looking for jobs online, which gives hackers a perfect chance to defraud job hunters. Cellopoint Global Anti-spam Center (CGAC) has found and intercepted a huge number of spear-phishing messages containing text like: "Thank you for applying for the xx position. After reviewing your resume, we find you are not qualified for this position. We have decided to send your resume back to you…" Such an email seems normal and includes a link to the company website, but if you open the attached file, it is not the resume you are looking for; it is a Trojan horse. It is impossible for job hunters to remember every company and position they applied for, which is why they fall victim to these fake emails.

Cellopoint Global Anti-spam Center (CGAC) believes these spear-phishing attacks reflect a new change in attacker behavior. Besides botnet computers already compromised by Trojans, hackers also use free webmail servers as stepping stones to attack regular users. For the service provider, this not only slows down the mail servers but also gets them blacklisted, which affects the basic function of sending and receiving mail. If a provider wants to offer a better, paid email service, blacklisting becomes the biggest obstacle to its business plan. The Executive Yuan of the Republic of China has already passed a new law on "the management of sending commercial email", which requires email service providers to prevent commercial spam; providers that cannot stop the spam will be fined until they do something about it.

Cellopoint's email security and management solution is also called Email UTM. It won first place for "Anti-spam" in the 2008 Best Choice awards from the Institute for Information Industry. Email UTM includes CGAC online guarding services covering anti-spam, virus email, anti-spyware, anti-phishing, anti-relay, anti-DoS, and anti-hacking to secure email transfer, plus digital signatures to solve the problem of counterfeit email. Cellopoint Policy Center can classify email into different categories, such as business and regular email, and provides IP pool management, which keeps regular email IPs off spam blacklists. It can forward, delete, quarantine, notify the inspector, make a secure copy, etc. This dramatically increases system efficiency for all clients, including service providers, businesses, and organizations.

Cellopoint warns of Valentine’s Threat

With Valentine's Day just around the corner, email threats hidden in Valentine's cards are also awakening. Cellopoint Global Anti-spam Center (CGAC) has a warning for internet users: the surge of Valentine's Day attacks comes from the notorious Waledac botnet, disguised in e-card format. This kind of spam carries links that get users to visit malicious sites. Instead of real greeting cards, malware is downloaded and compromises their computers. The infected computers then become part of the botnet and, without their owners' awareness, send out spam and viruses as well.
This kind of spam is a short and sweet one-liner with content like "Me and You", "In Your Arms", "With all my love" and "I give my heart to you", followed by a URL. If you receive an email with a title like these, be careful; do not open it without double-checking. Fake tax refund and online booking confirmation messages are also increasing in volume.

Cellopoint provides a few tips to stay away from spam:

  1. Use an email security solution. This solution should protect against inbound email threats and viruses while ensuring transmission of legitimate email messages without delay. It should maintain a very low false-positive rate.
  2. Educate users on secure email practices. Be careful with suspicious email. Never fill out forms in email messages that ask for personal or financial information or passwords. Remember that legitimate companies will never ask for this type of information via email. Avoid opening suspicious emails and clicking on suspicious links.
  3. Do not reply to spam, as it lets spammers know your email address is valid, and they will send you more. The unsubscribe links contained in much spam produce the same result. The best way to deal with spam is to delete it without reading it.
  4. Watch out for social-engineering traps. Hackers have become more sophisticated and often trick individuals into enabling malicious code attacks (spear phishing).
  5. Do not forward chain letters. This special kind of email may be created by hackers to collect email addresses for producing spam.

About Cellopoint
Cellopoint is a leading provider of email UTM (Unified Threat Management) solutions for organizations ranging from small businesses to large enterprises and ISPs. We defend against email threats such as spam and viruses, prevent leaks of confidential data through content filtering and secure mail delivery, archive email to protect your digital assets, and support regulatory inquiries and corporate investigations, all in a single web-based platform. We provide highly reliable, scalable and flexible solutions that are easy to deploy and manage. For more information, please visit:

Botnets come back after McColo shutdown

The notorious botnet hosting company McColo was taken down by a group of Internet providers on Nov 10, and total spam production dropped by as much as 50 percent. The action followed investigations by security researchers, who found that McColo had become the preferred home for many botnets' command and control servers, including Rustock and Asprox. Cellopoint Lab has now found that spam volumes are rising again, four weeks after the rogue hosting company was shut down: volumes dropped for nine days and are now on the rise. The likely reason is that some botnets have been reawakened or regenerated, as spammers try many ways to keep sending spam. The Mega-D botnet, well known for producing billions of spam messages, most of which promote sexual-performance drugs such as Viagra, has worked effectively over the last three weeks to set up new command and control servers and re-establish connections with its networks of compromised bots. Other famous botnets, Srizbi and Rustock, have also come back. The botnets' return comes as no surprise to the information security industry. Spam should still be monitored despite its dropped volume, and organizations need to maintain the same level of security as usual. To help protect against email and internet threats, Cellopoint recommends spam filtering and email anti-virus. Rather than relying on any single anti-spam or anti-virus product or technology, deploy multiple layers of security throughout the organization with Email UTM.


Personal email accounts and data loss prevention

A few days ago, American Republican vice presidential candidate Sarah Palin's Yahoo email account was compromised by hackers, and part of its contents was made available for download. Apart from a number of personal photos, nothing particularly embarrassing to Palin was found. But news reports pointed out that Palin had conducted public business via this personal email account, possibly in an attempt to circumvent the law. At present, the 20-year-old son of a Democratic Tennessee state representative is suspected of involvement, and the FBI may question him soon.

Not only political figures but also company executives have suffered from targeted attacks. For corporate governance, it is necessary to prioritize a policy for the use of personal email accounts. If such use cannot be controlled, it is better to limit it, to prevent employees from forwarding email containing product development or business plans to personal email recipients, whether intentionally or not. Cellopoint proposes that businesses and organizations implement policies and control email messages with auditing tools: scan the contents of incoming and outgoing messages and detect improper behavior. If an employee may be leaking sensitive information to external email addresses, the auditing tool should instantly quarantine the email and notify the auditors or managers. The result is effective email leakage prevention.

A New Twist on Phishing: Fraudulent FedEx Email Attacks

In the wake of a flood of phishing email attacks masquerading as news bulletins, hackers have recently launched attacks disguised as FedEx express delivery tracking emails. These hackers use botnet computers to send emails with FedEx package tracking numbers telling recipients that the delivery of their parcel has run into some problem: the address contains an error, the recipient's name does not exist, customer reconfirmation is required, or pick-up is required. A compressed zip file is attached to the email, and the customer is asked to decompress the file, print it out, and send it back. The zip file is actually a malicious program, however, and if opened by an unsuspecting recipient, will automatically install a backdoor program that can steal sensitive data on the computer. This type of email attack relies on social engineering. For instance, a package tracking number may be used to obtain the recipient's trust, or the email may provide notification of a package ready for pick-up. And since there is a compressed zip file, a backdoor program can be installed on the user's computer without the user visiting a malicious web site. CGAC immediately issued an anti-spam database update after detecting this type of email attack on the 22nd; the update will protect users by effectively controlling the spread of the attack and fraudulent email volume.