2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A New York City subway train holds 1,200 people. This blog was viewed about 6,100 times in 2014. If it were a NYC subway train, it would take about 5 trips to carry that many people.



Cassandra DataStax – Developer Guide with Spring Data Cassandra

We will discuss the Cassandra implementation:

Important Points:

Download and Installation:

  1. Tarball Installation

  1. DataStax DB
    • You need to register yourself with DataStax for download.
    • DataStax Enterprise – http://www.datastax.com/download#dl-enterprise.
    • Create these folders and change their permissions:
    • sudo mkdir -p /var/log/cassandra
      sudo chmod 777 /var/log/cassandra
      sudo mkdir -p /var/lib/cassandra/data
      sudo chmod 777 /var/lib/cassandra/data
      sudo mkdir -p /var/lib/cassandra/commitlog
      sudo chmod 777 /var/lib/cassandra/commitlog
      sudo mkdir -p /var/lib/cassandra/saved_caches
      sudo chmod 777 /var/lib/cassandra/saved_caches
    • How to run Cassandra: Go to the DataStax Cassandra installation folder (Mac/Linux/Unix env):
      cd /Users/<userName>/dse-4.5.2/bin
      sudo ./dse cassandra -f
      // The command above starts the Cassandra DB on your local system. The -f flag keeps the server in the foreground, so open a new terminal tab in the same folder to start the CQL query console:
      sudo ./cqlsh
  • Create Schema:
    CREATE SCHEMA event_owner WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1 };
    1. Schema(Keyspace) name: event_owner
    2. Table name: event_audit
    Table creation syntax: The revised data model and its details are given below. Note: I have used expanded names for easier understanding; they can be shortened later on.
    CREATE TABLE event_owner.event_audit (
    category    text,
    month       timestamp,
    cat_id      text,
    version     timeuuid,
    userid      text,
    action      text,
    content     text,
    PRIMARY KEY ((category, month), cat_id, version)
    ) WITH CLUSTERING ORDER BY (cat_id ASC, version DESC);
    Sample Data:
    category | month                    | cat_id | version                              | action | content      | userid
    CC       | 2014-01-01 05:30:00+0530 |   8000 | b3fc48e0-5608-11e4-aacd-e700f669bcfc |  DRAFT | json content | 155045940
    CC       | 2014-01-01 05:30:00+0530 |   9000 | a4747460-5608-11e4-aacd-e700f669bcfc |  DRAFT | json content | 155045940

    • category
      • Commitcode / Part Association/ EventTag [CC / PA / ET]
    • month
      • 12 AM timestamp of first day of the month the change is made
    • cat_id 
      • Unique id for a particular category [say, in the case of a commit code it is the cc_id, e.g. 9000]
    • version 
      • It is the unique id that indicates the version number.
      • You can populate it using the now() function.
      • It has an embedded timestamp; use the dateOf() function to get the timestamp value.
    • userid
      • id of the user who made the change
    • action 
    • content 
      • actual json content after the change [full json]
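
As a rough illustration of the month and version columns described above, the sketch below computes the first-of-month timestamp for a given instant and reads the timestamp embedded in a timeuuid on the client side. The class and method names are hypothetical and not part of the original project; only the JDK is used. UTC is assumed for the month bucket, which is consistent with the sample rows (2014-01-01 05:30:00+0530 is midnight UTC).

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.UUID;

// Hypothetical helper, not part of the original project.
public class AuditKeyHelpers {

    // Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns units.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // 12 AM on the first day of the month that contains the given instant.
    public static Instant firstOfMonth(Instant changeTime, ZoneId zone) {
        ZonedDateTime local = changeTime.atZone(zone);
        return local.withDayOfMonth(1).truncatedTo(ChronoUnit.DAYS).toInstant();
    }

    // Timestamp embedded in a version-1 (time-based) UUID, i.e. a CQL timeuuid.
    public static Instant timeUuidToInstant(UUID timeUuid) {
        if (timeUuid.version() != 1) {
            throw new IllegalArgumentException("Not a time-based UUID: " + timeUuid);
        }
        long hundredNanos = timeUuid.timestamp() - UUID_EPOCH_OFFSET;
        return Instant.ofEpochMilli(hundredNanos / 10_000);
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneOffset.UTC; // assumption: month buckets are kept in UTC
        UUID version = UUID.fromString("b3fc48e0-5608-11e4-aacd-e700f669bcfc");
        Instant written = timeUuidToInstant(version);
        System.out.println("version written at: " + written);
        System.out.println("month bucket:       " + firstOfMonth(written, zone));
    }
}
```

The month bucket computed this way is what would be bound to the month part of the partition key when reading or writing a row.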

  • Sample Query to access the data
    cqlsh:event_owner> select * from event_audit where category='CC' and month='2014-01-01 05:30:00+0530' and cat_id='9000';
    category | month                    | cat_id | version                              | action | content      | userid
    CC | 2014-01-01 05:30:00+0530 |   9000 | a4747460-5608-11e4-aacd-e700f669bcfc |  DRAFT | json content | 155045940
  • SQL-like commands: CQL uses syntax similar to standard SQL DDL/DML commands for Cassandra queries, for example:
    • Drop table 
    • Update
    • Delete
    • Truncate
    • Select query

    2. DataStax OpsCenter- 


    3. DataStax DevCenter

        Installation: DataStax DevCenter is a visual CQL query tool for Cassandra and DataStax Enterprise.

        How to start OpsCenter GUI: http://www.datastax.com/documentation/getting_started/doc/getting_started/gettingStartedInstallOpscTar_t.html

2. Package Installation:

DataStax All-in-One Installer

How to run DataStax?

  • Cassandra:
    • Go to your /Users//dse via Terminal and execute the following command:
      • sudo ./bin/dse cassandra -f (This will start the cassandra)
  • Opscenter and DataStax-Agent:
    • Go to /Users//dse in new Tab and execute command : 
      • sudo ./opscenter/bin/opscenter -f (This will start the opscenter)
    • Go to /Users//dse in new Tab and execute command :
      • sudo ./datastax-agent/bin/datastax-agent -f (This will start the datastax-agent)

Using DataStax:

  • Now you can open OpsCenter in the browser at the following address:
    • localhost:8888/
  • Here you can see your Cassandra cluster visually (a kind of monitoring tool for Cassandra).


Integrate Cassandra with Spring Data Cassandra :

  1. /**
      * Created by: Rajiv Srivastava
      */
     @Configuration
     @ComponentScan(basePackages = {"com.cassandraproject.dao", "com.cassandraproject.utils"})
     @EnableCassandraRepositories(basePackages = {"com.cassandraproject.repository"})
     public class AuditCoreContextConfig extends AbstractCassandraConfiguration {
         @Override
         protected String getKeyspaceName() {
             return "event_owner"; // Schema or keyspace name
         }
         @Override
         protected String getContactPoints() {
             // IP address of the server/local machine. Hosts of a cluster can be separated with a comma (,) like host1,host2.
             // Ideally list at least two hosts, so that a second Cassandra server can be contacted if the first is down.
             return "localhost";
         }
         @Override
         protected int getPort() {
             return 9042; // Cassandra DB port
         }
     }

2. Data Modelling

 a. Primary/Clustered/Partitioned key

/**
 * Created by: Rajiv Srivastava
 *
 * Keyspace/Schema: event_owner
 * CREATE TABLE event_owner.event_audit (
 *   ctg text,
 *   month timestamp,
 *   ctgid text,
 *   ver timeuuid,
 *   userid text,
 *   action text,
 *   content text,
 *   PRIMARY KEY ((ctg, month), ctgid, ver)
 * );
 */
@PrimaryKeyClass
public class EventAuditKey implements Serializable {

    private static final long serialVersionUID = 1L;

    // Partition key, part 1
    @PrimaryKeyColumn(name = "ctg", ordinal = 0, type = PrimaryKeyType.PARTITIONED)
    private String category;

    // Partition key, part 2
    @PrimaryKeyColumn(name = "month", ordinal = 1, type = PrimaryKeyType.PARTITIONED)
    private Date month;

    // Clustering column, ascending
    @PrimaryKeyColumn(name = "ctgid", ordinal = 2, type = PrimaryKeyType.CLUSTERED, ordering = Ordering.ASCENDING)
    private String categoryId;

    // Clustering column, descending, so the newest version comes first
    @PrimaryKeyColumn(name = "ver", ordinal = 3, type = PrimaryKeyType.CLUSTERED, ordering = Ordering.DESCENDING)
    private UUID version;

    // Getters, setters, equals() and hashCode() go here
}

 3. Repository

/**
 * Created by: Rajiv Srivastava
 */
public interface AuditRepository extends CrudRepository<EventAudit, EventAuditKey> {

	@Query("select * from event_owner.event_audit")
	public List<EventAudit> eventAudit();
}

4. CRUD operations using Spring Data CrudRepository – DAO layer

/**
 * Created by: Rajiv Srivastava
 */
public class EventDaoImpl implements EventAuditDao {

    @Autowired
    public AuditRepository auditRepository;

    public void save(EventAudit entity) {
        auditRepository.save(entity);
    }

    public void save(Collection<EventAudit> entities) {
        auditRepository.save(entities);
    }

    public EventAudit find(EventAuditKey eventAuditKey) {
        // Guard against an unusable key before querying
        if (eventAuditKey == null) {
            throw new IllegalArgumentException("It doesn't have all required instance variables set");
        }
        return auditRepository.findOne(eventAuditKey);
    }

    public List<EventAudit> getAll() {
        Iterable<EventAudit> iterable = auditRepository.findAll();
        if (null != iterable) {
            return Lists.newArrayList(iterable); // Lists is the Guava helper class
        }
        return new ArrayList<>();
    }

    public List<EventAudit> getListEventAuditMonthCategoryWise(Date date, String category) {
        // TODO Auto-generated method stub
        return null;
    }
}

Apache Cassandra: 

  * Getting started: http://wiki.apache.org/cassandra/GettingStarted

  * Join us in #cassandra on irc.freenode.net and ask questions

  * Subscribe to the Users mailing list by sending a mail to


  * Planet Cassandra aggregates Cassandra articles and news:


For more on what commands are supported by CQL, see https://github.com/apache/cassandra/blob/trunk/doc/cql3/CQL.textile. A reasonable way to think of it is as “SQL minus joins and subqueries.”

MongoDB + Java + Maven Sample Application

Prerequisite: Download and install MongoDB (http://docs.mongodb.org/manual/installation/) on your local machine and run the server and the client in two separate terminals/command prompts. This sample app was developed on Mac OS/Linux.

1. Start MongoDB server using : ./mongod

2. Start MongoDB client : ./mongo

Add MongoDB Jar dependency in your project:



Sample MONGODB CRUD code:

package com.db;

import java.util.List;
import java.util.Set;

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class MongoConnectionManager {
	static DBCursor cursor;

	public static void main(String[] args) {
		try {
			System.out.println("Connecting with MongoDB......");
			// DB connection after ver 2.10.0
			MongoClient mongoconn = new MongoClient("localhost", 27017);
			// Get database
			DB db = mongoconn.getDB("test");
			System.out.println("Database Name: " + db.getName());

			// Display all databases
			List<String> dbs = mongoconn.getDatabaseNames();
			for (String mydb : dbs) {
				System.out.println("Database: " + mydb);
			}

			// If MongoDB is installed in secure mode
			// boolean auth = db.authenticate("username", "password".toCharArray());

			// Get collection (table) names
			Set<String> colls = db.getCollectionNames();
			for (String s : colls) {
				System.out.println("Collection: " + s);
			}

			// Get collection
			DBCollection coll = db.getCollection("mynames");
			coll.createIndex(new BasicDBObject("email", 1)); // create index on "email", ascending

			System.out.println("Collection Names:" + db.getCollectionNames());

			// Drop collections
			// coll.drop();

			// Insert new records/documents
			BasicDBObject doc1 = new BasicDBObject();

			// Insert values into document/record
			doc1.put("name", "Rajiv");
			doc1.put("email", "rajivmca2004@yahoo.com");

			// Insert values in collection
			coll.insert(doc1);

			BasicDBObject doc2 = new BasicDBObject();

			// Insert values into document/record
			doc2.put("name", "Mohit");
			doc2.put("email", "mohit@yahoo.com");

			// Insert values in collection
			coll.insert(doc2);

			// Fetch only first doc
			DBObject mydoc = coll.findOne();
			System.out.println("Only First Doc :" + mydoc);

			// Insert and fetch multiple docs
			for (int i = 0; i < 10; i++) {
				coll.insert(new BasicDBObject().append("count", i));
			}

			// Total documents
			cursor = coll.find();
			System.out.println("Documents returned by find(): " + cursor.count());

			// Getting documents with a query; an empty query object matches every
			// document, so put a field/value pair into it to narrow the result
			BasicDBObject query = new BasicDBObject();
			cursor = coll.find(query);
			System.out.print("Getting A Single Document with A Query: \n");

			// Iterate over database results
			while (cursor.hasNext()) {
				System.out.println(cursor.next());
			}

			// Conditional query on a range: documents where 1 < count <= 30
			query = new BasicDBObject();
			query.put("count", new BasicDBObject("$gt", 1).append("$lte", 30)); // i.e. 1 < count <= 30

			cursor = coll.find(query);

			while (cursor.hasNext()) {
				System.out.println("Comparison =>" + cursor.next());
			}
			System.out.println("Total documents: " + coll.getCount());

			cursor.close();
			mongoconn.close();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
}



Designing Modern Enterprise Search Solution –Part 1

This article is a two-part series.

  1. Part 1 – 5 requirements that will force you to adopt an enterprise search engine.
  2. Part 2– Addressing Enterprise search requirements with open source search platforms.


Part 1 – 5 requirements that will force you to adopt an enterprise search engine.

  1. Search over diverse content sources – An enterprise may need search over diverse content sources including databases, document stores, emails, JIRA, SharePoint and content management systems. A smart search solution should be able to crawl these sources and make their content searchable.
  2. Modern search features – Nowadays Google has become the benchmark for advanced search features like autosuggest and “Did you mean AAA?”.
  3. Support for a powerful UI – To build a powerful search interface, faceting (filters for search results) is a must. It is better if the search engine provides responses in JSON.
  4. Performance – e.g. most searches should complete in under 100 ms.
  5. Full-text search – With full-text search, the engine does not just match the user-specified term with 100% precision the way a SQL LIKE does. Full-text search gives you the flexibility to tune down precision for better recall. Three full-text related concepts are tokenization, stemming and ranking. Tokenization is splitting a text into words, phrases and tokens; generally it is done based on the grammar of the language.
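
To make tokenization and stemming a little more concrete, here is a deliberately naive sketch in plain Java. It splits text on non-letter characters, lower-cases the tokens and strips a few English suffixes. The class name and the suffix rules are illustrative assumptions; real search platforms use language-aware analyzers rather than anything this simple.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: a toy analyzer, not what a real search engine does.
public class NaiveAnalyzer {

    // Tokenization: split on anything that is not a letter or digit.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Stemming: crude suffix stripping so that related word forms share one stem.
    public static String stem(String token) {
        for (String suffix : new String[] {"ing", "es", "ed", "s"}) {
            if (token.length() > suffix.length() + 2 && token.endsWith(suffix)) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    public static void main(String[] args) {
        for (String token : tokenize("Searching documents, searches and SEARCHED emails.")) {
            System.out.println(token + " -> " + stem(token));
        }
    }
}
```

Because the query and the indexed text go through the same steps, a query for "searches" can match a document that only contains "searching". That is the precision-for-recall trade described above.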





Implement Google Signer to implement Google Map Search

import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import org.springframework.stereotype.Service;
import com.test.gcrmws.core.context.GCRMWSContext;

// Base64 below is a helper class offering decode(String) and
// encodeToString(byte[], boolean); add the import for the one your project uses.
@Service
public class GoogleUrlSigner {

    public String getLicnsedGeoCoderURL(String address, String inputUrl, String inputKey, String clientId)
            throws InvalidKeyException, NoSuchAlgorithmException, URISyntaxException {
        String request = null;
        URL url = null;

        try {
            byte[] key = convertBase64toBinary(inputKey);
            url = new URL(inputUrl);

            // Sign only the path and query part of the URL
            request = signRequest(url.getPath(), url.getQuery(), key);
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return request;
    }

    private byte[] convertBase64toBinary(String keyString) {
        // Convert the key from 'web safe' base 64 to binary
        keyString = keyString.replace('-', '+');
        keyString = keyString.replace('_', '/');
        byte[] key = Base64.decode(keyString);
        return key;
    }

    private String signRequest(String path, String query, byte[] key)
            throws NoSuchAlgorithmException, InvalidKeyException,
            UnsupportedEncodingException, URISyntaxException {
        // Retrieve the proper URL components to sign
        String resource = path + '?' + query;
        // Get an HMAC-SHA1 signing key from the raw key bytes
        SecretKeySpec sha1Key = new SecretKeySpec(key, "HmacSHA1");
        // Get an HMAC-SHA1 Mac instance and initialize it with the HMAC-SHA1 key
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(sha1Key);

        // Compute the binary signature for the request
        byte[] sigBytes = mac.doFinal(resource.getBytes());
        // Base 64 encode the binary signature
        // String signature = Base64.encodeBytes(sigBytes);
        String signature = Base64.encodeToString(sigBytes, true);
        // Convert the signature to 'web safe' base 64
        signature = signature.replace('+', '-');
        signature = signature.replace('/', '_');
        return signature;
    }
}
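
For readers who do not have the same Base64 helper on the classpath, the signing steps above can also be sketched with nothing but the JDK (Java 8 or later). This is an alternative under stated assumptions, not the code used in the original project: the class name is hypothetical, java.util.Base64's URL-safe codec replaces the manual character swaps, and the key in main is made up for demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch only: same HMAC-SHA1 signing steps, using the JDK's own Base64 codec.
public class UrlSignerSketch {

    // pathAndQuery is the part of the URL that gets signed, e.g.
    // "/maps/api/geocode/json?address=New+York&client=clientID".
    // webSafeKey is the signing key in URL-safe Base64.
    public static String sign(String pathAndQuery, String webSafeKey) {
        try {
            byte[] key = Base64.getUrlDecoder().decode(webSafeKey);
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            byte[] sigBytes = mac.doFinal(pathAndQuery.getBytes(StandardCharsets.UTF_8));
            // The URL-safe encoder already emits '-' and '_' instead of '+' and '/'
            return Base64.getUrlEncoder().encodeToString(sigBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 signing failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up key for demonstration; a real key is issued with your client id.
        String demoKey = Base64.getUrlEncoder()
                .encodeToString("not-a-real-key".getBytes(StandardCharsets.UTF_8));
        System.out.println(sign("/maps/api/geocode/json?address=New+York&client=clientID", demoKey));
    }
}
```

The returned value is what goes into the signature query parameter of the request.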

Main class – How to call the method above:


public it.units.GoogleGeocoding.GeocodeResponse getLocation(
        StringBuilder geocodingIP, GoogleUrlSigner googleUrlSigner,
        @GWSSuppressLogging Map<String, String> grlSyncLogMap,
        RestConnector googleMapRestConnector, String addr, String... addressElements) {

    Map<String, String> queryParams = new LinkedHashMap<String, String>();
    GeocodeResponse geoResponse = null;
    String signature = null;
    StringBuilder address = new StringBuilder();

    // Build the URL-encoded address from its elements
    for (String addrelem : addressElements) {
        if (address.length() > 0) {
            // append a separator between address elements here
        }
        try {
            address.append(URLEncoder.encode(addrelem, "UTF-8").replace("+", "%20"));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }

    // Resolve the endpoint URL for the given geocoding host
    Map<Param, String> map = googleMapRestConnector.getConfig();
    String endpoint = java.text.MessageFormat.format(
            map.get(Param.ENDPOINT_URL), geocodingIP);
    map.put(Param.ENDPOINT_URL, endpoint);

    try {
        signature = googleUrlSigner.getLicnsedGeoCoderURL(
                address.toString(), endpoint,
                ...); // the signing key and client id arguments go here
    } catch (InvalidKeyException e) {
        e.printStackTrace();
    } catch (NoSuchAlgorithmException e) {
        e.printStackTrace();
    } catch (URISyntaxException e) {
        e.printStackTrace();
    }

    queryParams.put("sensor", "false");
    queryParams.put("address", addr);
    queryParams.put("client", grlSyncLogMap.get("google.geocoder.client"));
    queryParams.put("signature", signature);

    try {
        geoResponse = (it.units.GoogleGeocoding.GeocodeResponse) googleMapRestConnector.invoke(HttpMethod.GET, null, null, queryParams);

        /*
         * System.out.println("LocationType: "+
         * geoResponse.getResult().getGeometry().getLocationType());
         * System.out.println("FormattedAddress: "+
         * geoResponse.getResult().getFormattedAddress());
         * System.out.println("AddressTypes: "+
         * geoResponse.getResult().getAddressTypes());
         * System.out.println("Lat: "+
         * geoResponse.getResult().getGeometry().getLocation().getLat());
         * System.out.println("Lng: "+
         * geoResponse.getResult().getGeometry().getLocation().getLng());
         */
    } catch (NullPointerException ne) {
        ne.printStackTrace();
    } catch (RestException re) {
        re.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return geoResponse;
}


Git vs SVN – Advantages of GIT

  • Integration – Git integrates very flexibly with Stash (Git UI), Bamboo (build tool) and JIRA (Agile). We can associate a feature branch (a separate branch of code for a given story) with any JIRA story (part of a sprint) by creating the branch from Stash, so that all code changes/commits are easy to audit and track. Bamboo creates the build and shows its status (pass or fail) in Stash/Git and on the JIRA board.
  • Cloning – Git clones the complete repository, history included, to your local machine, so even a minor change can be made on its own local branch.
  • Easy switching between different branches (feature, develop, release, master, tag) within the same local folder. No need to keep a separate copy per branch as in SVN. Code merges and rollbacks are also easier and quicker.
  • Forking – A new project can be created outside of your project space by an external team. The other team can work in parallel and merge their code. For example, the XYZ team has its own developers and development process, yet they can work on the same code base and later merge and release using the same original remote repository.
  • Easy code review – We can do peer review in Git/Stash by creating a pull request. There we can set rules such as at least 2 approvals, 1 successful build, etc.
  • Managing multiple branches is easy, and so is merging code, using Git's own merge tool/console commands.
  • Lightweight and faster than SVN.

How HashMap works in Java

Note: Original article – http://javarevisited.blogspot.com/2011/02/how-hashmap-works-in-java.html


How HashMap works in Java, or sometimes how the get() method works in HashMap, is a common interview question nowadays. Almost everybody who has worked in Java knows what HashMap is, where to use it, and the difference between Hashtable and HashMap, so why is this interview question so special? Because of the breadth and depth it offers. It has become a very popular question in almost any senior or mid-senior level Java interview.

Questions start with a simple statement:

“Have you used HashMap before?” or “What is HashMap? Why do we use it?”

Almost everybody answers yes, and then the interviewee keeps talking about common facts about HashMap: HashMap accepts null while Hashtable doesn't, HashMap is not synchronized, HashMap is fast and so on, along with basics such as that it stores key and value pairs.

This shows that the person has used HashMap and is quite familiar with the functionality it offers, but the interview takes a sharp turn from here, and the next set of follow-up questions gets more detailed about the fundamentals involved in HashMap. The interviewer hears you out and comes back with questions like

“Do you know how HashMap works in Java?” or

“How does the get() method of HashMap work in Java?”

And then you get answers like: I don't bother, it's a standard Java API; you'd better look at the Java source code; I can find it on Google at any time; and so on.

But some interviewees definitely answer this and will say: “HashMap works on the principle of hashing. We have put() and get() methods for storing and retrieving data from a HashMap. When we pass an object to the put() method to store it in the HashMap, the HashMap implementation calls the hashCode() method on the key object and, by applying that hash code to its own hashing function, identifies a bucket location for storing the value object.” The important part here is that HashMap stores both key and value in the bucket, which is essential to understanding the retrieval logic. If people fail to recognize this and say it only stores the value in the bucket, they will fail to explain how any object stored in a HashMap is retrieved. This answer is very much acceptable and shows that the interviewee has a fair bit of knowledge of how hashing works and how HashMap works in Java.
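
As a rough sketch of the step from hash code to bucket location, the code below mixes the bits of hashCode() and masks the result down to a table index. It is modelled on the approach in OpenJDK 8's HashMap, but it is only an illustration: the class and method names are made up, and other Java versions differ in detail.

```java
// Sketch of how a key is mapped to a bucket; not the JDK source itself.
public class BucketIndexSketch {

    // Mix the high bits of hashCode() into the low bits.
    static int spread(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h ^ (h >>> 16);
    }

    // tableLength must be a power of two, so the mask keeps only the low bits.
    static int bucketIndex(Object key, int tableLength) {
        return spread(key) & (tableLength - 1);
    }

    public static void main(String[] args) {
        int tableLength = 16; // default initial capacity of java.util.HashMap
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> bucket " + bucketIndex(key, tableLength));
        }
    }
}
```

Only the key takes part in this calculation; the value simply travels with the key into whichever bucket is chosen.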

But this is just the start of the story. Going forward the depth increases a little bit, and you put the interviewee in scenarios that every Java developer faces on a day-to-day basis. So the next question is most likely about collision detection and collision resolution in Java HashMap, e.g.

“What will happen if two different objects have same hashcode?”

From here the confusion starts. Sometimes the interviewee will say that since the hash codes are equal, the objects are equal, and HashMap will throw an exception or not store the second one. Then you might want to remind them about the equals() and hashCode() contract: two unequal objects in Java can very well have equal hash codes. Some will give up at this point, and some will move ahead and say: “Since hashCode() is the same, the bucket location will be the same and a collision occurs in the HashMap. Since HashMap uses a linked list to store entries in a bucket, the value object will be stored in the next node of the linked list.” Great, this answer makes sense to me. Though other collision resolution methods exist, this is the simplest, and HashMap does follow it.
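
A quick way to see this for yourself is with two well-known colliding strings. "Aa" and "BB" are unequal but have the same hash code, and a HashMap still keeps both entries. The class name below is just for this demonstration.

```java
import java.util.HashMap;
import java.util.Map;

// Demonstrates that unequal keys with equal hash codes are both stored.
public class CollisionDemo {

    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println("same hash code: " + ("Aa".hashCode() == "BB".hashCode()));
        System.out.println("equal keys:     " + "Aa".equals("BB"));
        Map<String, String> map = build();
        System.out.println("entries stored: " + map.size());
        System.out.println("Aa -> " + map.get("Aa") + ", BB -> " + map.get("BB"));
    }
}
```

No exception is thrown and nothing is overwritten, because the keys are not equal even though their hash codes are.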

But the story does not end here, and the final question the interviewer asks is something like

“How will you retrieve the value if two different objects have the same hash code?”

The interviewee will say that we call the get() method, and HashMap uses the key's hash code to find the bucket location and retrieves the object. But then you need to remind them that two objects are stored in the same bucket, so they will talk about traversing the linked list until we find the value object. Then you ask how they identify the value object, because you don't have a value object to compare with. Until they know that HashMap stores both key and value in each linked list node, they won't be able to resolve this issue and will try and fail.

But those who remember this key piece of information will say that after finding the bucket location, we call the key's equals() method to identify the correct node in the linked list and return the associated value object for that key. Perfect, this is the correct answer.
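
The retrieval logic described above can be sketched with a toy map that chains nodes in each bucket. It is an illustration only, with a made-up class name, a fixed table of 16 buckets, no resizing and no null keys; java.util.HashMap is considerably more elaborate.

```java
// Toy chained hash map: each node holds key AND value, and get() uses equals().
public class TinyChainedMap<K, V> {

    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = (Node<K, V>[]) new Node[16];

    private int indexFor(K key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    public void put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) { // same key: replace the value
                n.value = value;
                return;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // new key: link a new node
    }

    public V get(K key) {
        // hashCode() picks the bucket, equals() picks the node inside it
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyChainedMap<String, String> map = new TinyChainedMap<>();
        map.put("Aa", "first");  // "Aa" and "BB" have the same hash code
        map.put("BB", "second");
        System.out.println(map.get("Aa") + ", " + map.get("BB"));
    }
}
```

Without the key stored in each node there would be nothing to call equals() on, which is exactly the point interviewees tend to miss.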

In many cases interviewees fail at this stage because they get confused between hashCode() and equals(), and between the key and value objects in a HashMap. This is understandable, because they have been dealing with hashCode() in all the previous questions, and equals() comes into the picture only when retrieving a value object from the HashMap.

Some good developers point out here that immutable, final objects with proper equals() and hashCode() implementations make perfect HashMap keys and improve the performance of a HashMap by reducing collisions. Immutability also allows caching the hash code of a key, which makes the overall retrieval process very fast. This suggests that String and the various wrapper classes provided by the JDK, e.g. Integer, are very good HashMap keys.
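
The flip side is easy to demonstrate: if a key is mutable and changes after it has been put into the map, the entry can no longer be found. The key class below is hypothetical and exists only for this demonstration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Shows why mutable objects make poor HashMap keys.
public class MutableKeyDemo {

    // Hypothetical mutable key: hashCode() and equals() depend on a field that can change.
    static final class MutableKey {
        String name;

        MutableKey(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Returns what the map finds when the key is mutated after insertion.
    static String lookupAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, "value");
        key.name = "b"; // hash code changes, the entry stays in its old bucket
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println("lookup after mutation: " + lookupAfterMutation());
    }
}
```

The entry is still inside the map, but it is filed under the old hash code, so a lookup with the changed key misses it.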

Now if you clear all these HashMap interview questions, you will be surprised by this very interesting one: “What happens in a Java HashMap if the size of the map exceeds a given threshold defined by the load factor?” Unless you know exactly how HashMap works, you won't be able to answer this question.
If the size of the map exceeds the threshold defined by the load factor, e.g. if the load factor is 0.75, the map resizes itself once it is 75% full. Java HashMap does that by creating a new bucket array twice the size of the previous one and then putting every old element into that new bucket array. This process is called rehashing because it also applies the hash function to find the new bucket location.
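
The arithmetic behind the threshold is simple enough to write down. The sketch below, with a made-up class name, multiplies capacity by load factor and shows how the threshold moves each time the table doubles. It assumes the default initial capacity of 16 and load factor of 0.75.

```java
// Sketch of the resize threshold arithmetic: threshold = capacity * loadFactor.
public class ResizeThresholds {

    // Number of entries beyond which a table of the given capacity is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // default load factor of java.util.HashMap
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println("capacity " + capacity
                    + " -> resize after " + threshold(capacity, loadFactor) + " entries");
        }
    }
}
```

So with the defaults, the first resize (from 16 to 32 buckets) happens when the map grows beyond 12 entries.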

If you manage to answer this question, you will be greeted by “Do you see any problem with resizing of HashMap in Java?” You might not be able to pick up the context, and then the interviewer will try to give you a hint about multiple threads accessing the HashMap, potentially looking for a race condition.

So the answer is yes, there is a potential race condition while resizing a HashMap in Java, if two threads find at the same time that the HashMap needs resizing and both try to resize it. During resizing, the elements of a bucket, which are stored in a linked list, get reversed in order during their migration to the new bucket, because HashMap doesn't append the new element at the tail; instead it inserts the new element at the head to avoid traversing to the tail. If the race condition happens, you can end up with an infinite loop. (This head-insertion behaviour describes HashMap before Java 8; later versions preserve the order during resize, but HashMap is still not thread-safe.) Though at this point you can reasonably ask the interviewer what on earth makes them think of using HashMap in a multi-threaded environment 🙂
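
If a map really does have to be shared between threads, one common alternative is ConcurrentHashMap. The sketch below is only an illustration with made-up names: several threads increment the same counter through merge(), which ConcurrentHashMap performs atomically, so no update is lost.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Several threads update one shared map safely via ConcurrentHashMap.
public class ConcurrentCountDemo {

    static int countWithThreads(int threads, int incrementsPerThread) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        System.out.println("total: " + countWithThreads(4, 1000));
    }
}
```

Doing the same with a plain HashMap and get()/put() gives no such guarantee, which is the interviewer's point.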

I like this question because of its depth and the number of concepts it touches indirectly. If you look at the questions asked during the interview, this HashMap question has verified:

Concept of hashing

Collision resolution in HashMap

Use of the equals() and hashCode() methods and their importance

Benefit of immutable objects

Race condition on HashMap in Java

Resizing of Java HashMap

Just to summarize, here are the answers which make sense for the above questions:

How HashMap works in Java

HashMap works on the principle of hashing. We have put() and get() methods for storing and retrieving objects from a HashMap. When we pass both key and value to the put() method to store them in the HashMap, it uses the key object's hashCode() method to calculate the hash code, and then by applying hashing to that hash code it identifies the bucket location for storing the value object.

While retrieving, it uses the key object's equals() method to find the correct key-value pair and returns the value object associated with that key. HashMap uses a linked list in case of collision, and the object is stored in the next node of the linked list.

Also, HashMap stores the key and value together in every node of the linked list.

What will happen if two different HashMap key objects have the same hash code?

They will be stored in the same bucket, but in the next node of the linked list. The key's equals() method will be used to identify the correct key-value pair in the HashMap.

In terms of usage, HashMap is very versatile, and I have mostly used it as a cache in the electronic trading applications I have worked on. Since the finance domain uses Java heavily and, for performance reasons, we need caching a lot, HashMap comes in very handy there.
