Nov 29, 2012

How to set up Apache Traffic Server on Ubuntu 10.04


Note: This blog post is for general reference only; I am not offering any sort of expert advice here. You can reach me through the comments below with any queries, and I will try to answer them.

I was talking with a friend about a problem he had hosting multiple WordPress sites (on different domains) on the same server behind a single static IP address. At the time of our discussion he had configured two name-based virtual hosts on an Apache httpd server installed on a physical box running Ubuntu 10.04 LTS.

As he described the problems he was facing as traffic to both of his websites increased, I realized he needed to upgrade his system to handle high traffic.

To solve his problem to an extent, we decided to put a web accelerator such as Varnish or Squid in front of Apache httpd. Both of these caching proxies have been around for a while and have a proven track record. While researching web accelerators I came across Apache Traffic Server, which became a top-level Apache project within a span of two years. I went through the benchmark results for all three and found Apache Traffic Server leading the chart. (See the reference section for details.)

I was very excited to set up Apache Traffic Server as a caching proxy for two name-based virtual hosts hosted on a single IP. Read on to learn about the installation, basic configuration, logging and cache optimization. I think it's really interesting stuff.

Intro - Apache Traffic Server 

Apache Traffic Server is a reverse proxy, cache proxy and forward proxy. Web proxy caching enables you to store copies of frequently-accessed web objects (such as documents, images, and articles) and then serve this information to users on demand. It improves performance and frees up Internet bandwidth for other tasks. 

Official website for Apache Traffic Server is

Similar products: Varnish (


Installing Apache Traffic Server on Ubuntu is a bit tricky because there is no standard apt-get package for it. It's not that difficult, though. I am writing down the step-by-step process to make it easy to follow.
  1. Download the latest tar.bz2 from the Apache website and put it in any directory on the Ubuntu server.
  2. cd to the directory where you put the tar.bz2 file.
  3. $ tar xf trafficserver-x.y.z.tar.bz2
  4. $ cd trafficserver-x.y.z
  5. $ sudo apt-get install g++ libssl-dev tcl-dev libexpat1-dev libpcre3-dev libcap-dev libcap2  (These dependency packages need to be installed; they are used to build the package on the current system.)
  6. $ ./configure
  7. $ make    (This will build with a destination prefix of /usr/local.)
  8. $ sudo make install   (This finishes the installation.)
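
The steps above can be collected into a small script. This is only a sketch: the names run, build_ats, ATS_VERSION and DRY_RUN are my own, the version stays a placeholder, and the -y flag on apt-get is added so the install does not stop to ask for confirmation. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: the build steps above as one function.
# ATS_VERSION is a placeholder (hypothetical default "x.y.z"); set it to the
# release you actually downloaded.
ATS_VERSION="${ATS_VERSION:-x.y.z}"
# DRY_RUN=1 (the default in this sketch) only prints each command.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_ats: install dependencies, unpack, configure, build and install.
# Each step runs only if the previous one succeeded.
build_ats() {
    run sudo apt-get install -y g++ libssl-dev tcl-dev libexpat1-dev \
        libpcre3-dev libcap-dev libcap2 &&
    run tar xf "trafficserver-${ATS_VERSION}.tar.bz2" &&
    run cd "trafficserver-${ATS_VERSION}" &&
    run ./configure &&
    run make &&
    run sudo make install
}

build_ats
```

Run it from the directory holding the tarball, with DRY_RUN=0 and ATS_VERSION set to the real version, once the printed commands look right.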

Server Level Configuration

  1. After successful installation, open ($ sudo nano /etc/ and add following line .include /usr/local/libexec/trafficserver
  2. Run the following command:  sudo ldconfig

Start and stop

To start Traffic Server manually, issue the following command, passing in the attribute start. This command starts all the processes that work together to process Traffic Server requests as well as manage, control, and monitor the health of the Traffic Server system.
$ sudo trafficserver start   (use stop in place of start to stop the server)
All the configuration parameters related to the cache are located in the /usr/local/etc/trafficserver/records.config file. After making any configuration change, run traffic_line -x to apply it.

Installation Directories

When we installed Traffic Server, we did not specify installation directories for the config files and executables, so they were placed in the default locations.

All configuration files:
  1. /usr/local/etc/trafficserver/
All executables :
  1.  /usr/local/libexec/trafficserver

Important Configuration Files & Usage

$ sudo nano /usr/local/etc/trafficserver/records.config

Use this file to change the default port, connection timeouts and related settings.

Don't forget to execute the following command after a minor rule change:
$ sudo traffic_line -x

After a major change, such as the port, make sure you restart the server:
$ sudo trafficserver restart

For configuring the mapping:
$ sudo nano /usr/local/etc/trafficserver/remap.config

In the above file we can provide as many map, reverse map, redirect and reverse redirect rules as we need.

Apache Virtual Host Specific Configuration

Step 1: $ sudo nano  /usr/local/etc/trafficserver/records.config
In the file, make the following changes so that the config reads as below:
CONFIG proxy.config.reverse_proxy.enabled INT 1
CONFIG proxy.config.url_remap.pristine_host_hdr INT 1 (This makes sure the host name in the HTTP header doesn't change)
CONFIG proxy.config.url_remap.remap_required INT 1

Step 2: $ sudo traffic_line -x
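
A typo in one of these keys is easy to miss, so here is a small sketch for double-checking the file. The function name check_records is my own invention and not part of Traffic Server; it simply greps for each of the three settings with the value 1.

```shell
#!/bin/sh
# check_records (hypothetical helper): report whether each of the three
# reverse-proxy settings is present in the given file with INT 1.
# The dots in the key names are treated as regex wildcards, so this is a
# loose check rather than a strict parser.
check_records() {
    file="$1"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    for key in proxy.config.reverse_proxy.enabled \
               proxy.config.url_remap.pristine_host_hdr \
               proxy.config.url_remap.remap_required; do
        if grep -Eq "^CONFIG[[:space:]]+${key}[[:space:]]+INT[[:space:]]+1[[:space:]]*$" "$file"; then
            echo "ok      $key"
        else
            echo "MISSING $key"
        fi
    done
}
```

Call it as check_records /usr/local/etc/trafficserver/records.config; any line reported as MISSING is either absent, misspelled or not set to 1.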

Step 3: $ sudo nano  /usr/local/etc/trafficserver/remap.config





Note: Apache Httpd is running on port 8090
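
The mapping rules themselves are not reproduced in this post, so the following is only a sketch of what they could look like. The two domain names are hypothetical, and I am assuming Apache listens on 127.0.0.1 (same box) on the port 8090 mentioned in the note. The helper gen_remap is my own; it prints one map rule per domain, which you can paste into remap.config.

```shell
#!/bin/sh
# ORIGIN is the Apache httpd backend. Port 8090 comes from the note above;
# the 127.0.0.1 address is an assumption (Apache on the same machine).
ORIGIN="http://127.0.0.1:8090/"

# gen_remap (hypothetical helper): print one remap.config "map" rule for
# each domain passed as an argument.
gen_remap() {
    for domain in "$@"; do
        printf 'map http://%s/ %s\n' "$domain" "$ORIGIN"
    done
}

# Hypothetical domains; replace them with the real virtual host names.
gen_remap www.example-one.com www.example-two.com
```

Both domains point at the same origin on purpose: with pristine_host_hdr set to 1 the original Host header is passed through, so Apache can still pick the right name-based virtual host.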

Step 4: $ sudo traffic_line -x

The above configuration is sufficient to set up the initial reverse proxy, listening on the default port and serving requests by domain name from the origin (Apache).

In the coming sections we will configure and discuss caching and logging for better performance and management.

Cache Configuration

CONFIG proxy.config.cache.ram_cache.size INT -1 (the default of -1 sizes the RAM cache automatically at about 1 MB per 1 GB of disk space; alternatively you can set it explicitly, for example 20971520 for 20 MB)
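
The value is given in bytes, so it helps to let the shell do the arithmetic. A minimal sketch (the helper name mb_to_bytes is my own):

```shell
#!/bin/sh
# mb_to_bytes (hypothetical helper): convert megabytes to the byte value
# used for proxy.config.cache.ram_cache.size.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

size=$(mb_to_bytes 20)
# Prints: CONFIG proxy.config.cache.ram_cache.size INT 20971520
echo "CONFIG proxy.config.cache.ram_cache.size INT $size"
```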

Monitoring Traffic & Log File

Log file location - var/log/trafficserver under the install prefix (/usr/local/var/log/trafficserver with the default build above)


Aug 5, 2010

Important Resources For Scala Programming Language

Scala is growing into a popular programming language on the Java platform. This post collects resource pointers for Scala.

  1. Official Web Site:
  2. On Line Book:
    • Programming Scala: Programming Scala introduces an exciting new language that offers all the benefits of a modern object model, functional programming, and an advanced type system. Packed with code examples, this comprehensive book teaches you how to be productive with Scala quickly, and explains what makes this language ideal for today's highly scalable, component-based applications that support concurrency and distribution. You'll also learn the advantages that Scala offers as a language for the Java Virtual Machine.
  3. List of Books

Jun 24, 2010

Google App Engine : Pro and Cons (Advantage & Disadvantages)

When Google App Engine (GAE) was launched in 2008, everybody got excited about this cloud platform. Initially GAE supported the Python programming language for building scalable web applications (;-)). Python enthusiasts took this move in high spirits. But after going through the documentation, their excitement turned into frustration. Everybody knows the reasons:

  1. We can't upload files to the web application; we have to save uploaded content in a blob, and the maximum blob size was 1 MB. (The quantitative data may not always be exact, so beware.)
  2. Lack of RDBMS features in Google's patented data store, known as Bigtable.
  3. etc.
  4. etc.
Apart from these frustrations there were some benefits too, like:
  1. No Worry About Infra..
  2. Good Deployment Process...
  3. Loads Of Free CPU hours, Band Width, Data Space etc..
  4. etc.
Because of this surprise package, Python developers were shocked, and I found no blog or article on the net describing a good experience with GAE.

Meanwhile, a large community of developers known as the Java community (lost in buffer) asked the Google App Engine team (or this was a reflection..) to introduce WAR deployment on the mighty GAE. So sometime in 2009 they introduced Google App Engine for Java (GAE-J).

I always wondered why the GAE team did not try to improve the core GAE right away; instead they kept commenting on their community forum that "they have a hell of a lot of important things to do".

As a Java programmer I too was spellbound by the initial buzz around the mighty GAE-J. I was very excited to know that we could now test our web applications on the internet very easily. GAE-J's publishing and deployment process is very quick. But there are several fundamental bottlenecks that keep it from becoming a useful cloud platform for Java.
  1. We can't create threads, so any framework or Java component that uses threads can't be reused on GAE-J.
  2. We can't use Struts 2 without a workaround.
  3. We can't use Struts 2 file upload without tweaking.
  4. To keep our app alive we need a request to hit the server every 5-30 sec; otherwise it will be reinitialized, and reinitialization is time consuming. No memory is reserved for the application at all times.
  5. We can't use a select statement on more than one data type (table or class).
  6. We can't run a query on a table directly; we need a proper index for each query.
  7. Without joins on tables, complex queries are ruled out.
  8. When querying a table, it is not possible to get the records at offsets 20-40 without also fetching records 1-19.
  9. Later they introduced the concept of cursors, but that is also useless; we need loads of workarounds to use it.
  10. Since we can't fetch a chunk of records without fetching the others, it is tough to implement pagination effectively. And for any good application it is not practical to rely on tricks and tweaks based on criteria and other workarounds.
  11. Even with the restriction of querying only one table, if we wanted to live happily, that is still not possible because:
    1. We can't use inequality operators (> < >= <=) on more than one column (property). I know they are querying on indexes and this is a complicated situation, so they barred users from doing it (easier than finding a workaround at the data store level). In this situation we can't filter the data properly, and we have to do extra work at the application layer, or even on the client side.
    2. We can't do full-text search. We boast about scalable Bigtable, but because text and other larger data types are not indexed, we can't use our favorite LIKE operator. I tried some hacks such as adding tags, but every solution collapsed because of other restrictions.
    3. We will never know how many records there are in a table unless we keep track of it in another table. We can't run aggregate functions on a table.
  12. There is a time limit (30 sec, I guess) on any request; if the operation does not finish in time, it is terminated.
  13. We have no scope for using Apache Lucene or Compass, and there is no Google Datastore based indexing component available. I am really surprised by this from the G(Google Search Giant)AE team.
I have listed the various features GAE lacks, which bar it from hosting even a small web application. I have seen people use GAE only for very tiny sites with minimal datastore usage.

I would advise the GAE team to work on the missing features above; otherwise developers like me will prefer to invest time in configuring their own ORDBMS server and other infra to host their applications.

Nice Time... :-)