What is memcache?

By HH Tu

Today I will introduce a useful technique for speeding up database access: Memcache. It is a distributed memory caching system, and with it we can build a highly efficient cloud system. The basic concept is to use a key-based structure to store data in memory and fetch it back. The original idea comes from Brad Fitzpatrick, who used this method to speed up LiveJournal.com (2003). Lots of websites use this method: LiveJournal, Wikipedia, Flickr, Twitter, YouTube, Digg, WordPress.com, etc. For data that is already cached, it can remove almost all of the database loading time, and the database only has to do work when a Memcache miss happens, so its resources are used more efficiently. In short, it is a key-based, distributed memory object caching system, but authentication and access control are left to the user.

It is good to store frequently used information so that it does not have to be retrieved again. The simplest example is browsing the Internet: most of a website's content is downloaded into a local folder, which makes the same website load faster the next time you visit it. Memcache takes the same view. It sets aside part of a computer's memory as a cache that can be deployed and accessed from anywhere over a network. You can create as many caches as you want (of course, you need enough memory), and, even better, it treats all the caches as one single node, which means you can combine the memory of several computers and use it together! What a wonderful mechanism. All operations should run in O(1) time.

Here is a simple example to illustrate how useful combining memory is. We fetch data from a server every day and want to speed things up, so we add one more server. How do we use it efficiently?

In pic (1), we have two servers, each with its own memory. To get the same result no matter which server you store to or retrieve from, you need to copy every piece of data into the other server's memory. That wastes time and memory, so it is not a good setup. In pic (2), we use Memcache: every web server stores to and retrieves from the same location, so no inconsistency can occur. More memory, more cache!!
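To make the "same location" idea concrete, here is a minimal Python sketch of how a client could decide which cache server holds a given key. The server addresses and the `pick_server` helper are made up for illustration, and the hash-modulo scheme is only one simple option, not necessarily what any particular Memcache client does.

```python
import hashlib

# Hypothetical cache server addresses, used only for illustration.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211"]


def pick_server(key, servers=SERVERS):
    """Map a key to exactly one server with a stable hash modulo the pool size."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # Every web server that runs the same function reaches the same answer,
    # so a value is stored once instead of being copied to every machine.
    for key in ("user:1", "user:2", "page:/home"):
        print(key, "->", pick_server(key))
```

Because the mapping depends only on the key and the server list, no copying between servers is needed. One known drawback of plain modulo hashing is that adding a server changes where most keys land; many real clients use consistent hashing to reduce that.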

When should we use Memcache? When you run lots of “SELECT * from XXX” queries against the database, and the results are very likely to be used again and again, you can use it happily. Here is an easy analysis. When you start to use Memcache, consider the following: 1. Search frequency (how often is the data requested?) 2. Hit rate (how often is the requested data found in the cache?) 3. Validity (how long does cached data stay valid?). Of course, handling all this takes a little extra work, and that cost should be counted too.
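A rough way to weigh these points is to estimate the average read time for a given hit rate. The sketch below assumes every read checks the cache first and falls back to the database on a miss; the function name and the timings are invented for illustration, not measurements.

```python
def average_read_time(hit_rate, cache_ms, db_ms):
    """Expected time per read when every read checks the cache first."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # A hit costs one cache lookup; a miss costs the lookup plus the database query.
    return hit_rate * cache_ms + (1.0 - hit_rate) * (cache_ms + db_ms)


if __name__ == "__main__":
    # Made-up timings: 1 ms per cache lookup, 20 ms per database query.
    for rate in (0.0, 0.5, 0.9):
        print(rate, average_read_time(rate, cache_ms=1.0, db_ms=20.0))
```

With a hit rate of zero the cache only adds overhead, which is why it pays off mainly for data that is read again and again.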

Here is a brief procedure for combining Memcache with your database. Assume you have lots of servers that need to share memory so it is used better. A simple example flow: your clients ask the servers for data, and the servers ask Memcache first. If Memcache does not have the data, the server fetches it from the database. Once you have the new data, remember to store it in Memcache to increase the hit rate next time.
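The flow above can be sketched in a few lines of Python. `FakeCache` and `FakeDatabase` are in-memory stand-ins written for this example, not a real Memcache client or database driver, and the 60-second lifetime is an arbitrary assumption.

```python
import time


class FakeCache:
    """In-memory stand-in for a Memcache client (not a real client API)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


class FakeDatabase:
    """Stand-in for the slow database; counts how often it is queried."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.queries = 0

    def select(self, key):
        self.queries += 1
        return self._rows.get(key)


def fetch(cache, database, key, ttl_seconds=60):
    """Ask the cache first; on a miss, read the database and store the result."""
    value = cache.get(key)
    if value is not None:
        return value  # cache hit
    value = database.select(key)  # cache miss: go to the database
    if value is not None:
        cache.set(key, value, ttl_seconds)  # remember it for next time
    return value


if __name__ == "__main__":
    cache = FakeCache()
    database = FakeDatabase({"user:1": "Alice"})
    fetch(cache, database, "user:1")
    fetch(cache, database, "user:1")
    print("database queries:", database.queries)
```

The second call is served from the cache, so the database is queried only once. The lifetime passed to `set` is where the "how long is the data valid" question gets answered.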

As the example above shows, Memcache is implemented as a network daemon. Most people use PHP or C/C++ to communicate with Memcache. I use it on Linux. If you want to use it with C/C++, you need to install some basic packages on your Linux system: 1. libevent 2. Memcache 3. libmemcache.
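Because the cache is a network daemon, a client library mostly just sends short commands over a socket. The Python sketch below builds and parses messages in the style of the memcached text protocol, as I understand it, without connecting to any server. The helper names are my own, and it only covers a single-key `get` and a basic `set`.

```python
def build_set(key, value, exptime=0, flags=0):
    """Build a 'set' command: a header line, then the raw data block."""
    header = "set {} {} {} {}\r\n".format(key, flags, exptime, len(value))
    return header.encode("ascii") + value + b"\r\n"


def build_get(key):
    """Build a 'get' command for a single key."""
    return "get {}\r\n".format(key).encode("ascii")


def parse_get_response(response):
    """Return the value from a single-key get reply, or None on a miss."""
    if response == b"END\r\n":
        return None  # the server had nothing stored under this key
    header, _, rest = response.partition(b"\r\n")
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply: %r" % header)
    size = int(parts[3])  # the header announces the length of the data block
    return rest[:size]


if __name__ == "__main__":
    print(build_set("greeting", b"hello", exptime=60))
    print(build_get("greeting"))
    print(parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"))
```

In practice you would let an existing client library handle this; the point is only that each operation is a small keyed request to the daemon.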

The other details are all on the official website: http://memcached.org/