Sunday 10 July 2011

Wednesday 29 December 2010

End User Experience

In the preceding section on Monitoring, we discussed how monitoring is necessary to understand how our users are using our IT systems, and how it helps us develop a judgement about future needs.

Let's go back to our shopkeeper example. What else does the shopkeeper do apart from noting the times when customers come and go, and the types of fruits they buy? Have you seen how shopkeepers make small talk with customers who are in a queue, befriending them and making them feel like it is their own shop; or how some of the most interesting, oft-used and inexpensive items like chocolates & pens are stocked very near the till in a mall; or how often a new till is opened if a queue gets long? You would have seen how sales representatives in a shopping mall are well informed about the latest trends in cosmetics, fashion and electronics and can help the customer make the right choice. Very clearly, the shopkeeper is trying to improve the customers' feel-good factor in these cases. Why do they need to do it? Obviously, to keep their customers happy & to gain an edge over the competition. I have often wondered how much this really improves sales in the shop; but if shops do it, then it may well do so. After all, you'd obviously prefer a shop that does all these things over a shop that does not! :-)

So then, like all our examples so far, we should apply the same principle to our IT systems.
We should try to understand how our end users are finding the front end - does it respond fast enough for them to be happy customers? What are the times when customers like to use the IT systems most? In peak times, are we allowing for a sufficient number of concurrent threads, so that queuing delays within our system do not bring down the happiness quotient for our customers?

Do we know what the happiness quotient of our customers is - in short, do we know what SLAs form tolerable limits from our customers' viewpoint?
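
The two questions above (enough concurrent threads at peak, and tolerable SLA limits) can be put into rough numbers. The sketch below is only an illustration: it uses Little's Law (average requests in the system = arrival rate x average time in the system) to estimate concurrency, and a nearest-rank percentile to compare measured response times with an SLA limit. The arrival rate, response times and the 2-second limit are all made-up figures, and the function names are hypothetical.

```python
import math


def threads_needed(arrivals_per_second: float, seconds_per_request: float) -> int:
    """Little's Law: average requests in flight = arrival rate x time in system."""
    return math.ceil(arrivals_per_second * seconds_per_request)


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_sla(samples: list, limit_seconds: float, pct: float = 95.0) -> bool:
    """True if the chosen percentile of response times is within the SLA limit."""
    return percentile(samples, pct) <= limit_seconds


# Made-up peak-hour figures, purely for illustration.
peak_arrivals_per_second = 40.0
average_response_seconds = 0.5
response_times = [0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.6, 0.8, 1.1, 2.5]

print("threads needed at peak:", threads_needed(peak_arrivals_per_second, average_response_seconds))
print("95th percentile response time:", percentile(response_times, 95))
print("within a 2 second SLA:", meets_sla(response_times, 2.0))
```

Note that the average response time here looks comfortable, yet the slowest request breaks the 2-second limit; that is why an SLA is usually better stated as a percentile than as an average.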

Understanding which are the favourite transactions that customers perform, and using that knowledge to drive more business, is what business analytics tools help with nowadays. On a bank's website, for example, advertisements for new banking facilities or interest rates could be displayed on the pages where the customer is most likely to land during their most frequent transaction, such as checking the balance.

IT systems may be able to sometimes read the mood of a customer based on the data that is being thrown up to the customer and provide content-sensitive feedback/ advertisements/ pointers to the customer.

End user experience is actually a lot more than "response time". It's not just about deploying a robotic monitoring tool and reading response time data from it to figure out how well responses are going back to the tool, or about deploying network monitoring or passive client agents. It's about the layout of the website, making it easy to navigate, friendly to specially abled people, and making the customer feel comfortable, secure and at home; basically, making the IT application a joy to access and use. To me, response time, business intelligence, analytics and CRM need to be viewed together to really make a solid difference to the business & to improve the end user experience!

Tuesday 28 December 2010

Monitoring

So, let's say, we opened our fruit shop, and said "I would like to sell fruits to all the people in this locality"; and then went along to define "Volumes". How would we do it? We would make some assumptions about

  • Growing end-users year on year - Let's assume that there are 100 households around this place where we want to sell fruits. So, we'll say 100 households, with an average of around 4 people per household. Say we plan that we will expand to 150 households in year 2 and 300 households in year 3.
  • number of users using the system concurrently - Let's assume that the shop is in the middle of the locality, so roughly 50% of the households will pass this shop each day and need to restock fruits every 3rd day. That means about 50 shoppers coming over to buy fruits every 3 days. Since each home has roughly 4 people and the fruits need to last until the next visit, at roughly one fruit per person per day that means about 50*4*3 = 600 fruits will be used every 3rd day.
  • the pattern of peaks and troughs that is expected - People will come over either in the morning on their way to work or in the evening on their way back from work. On weekends, the number of people coming in will be higher. Let's assume an equal split between morning and evening, so 25% of the households come over in the morning and 25% in the evening, giving us a weekday peak of roughly 25 shoppers spread across perhaps 30 minutes in the morning. On weekends, say 70% of the entire community turns up with their families at random times across the day, giving us about 70*4 = 280 people; say 30% of these come in at once sometime during the day, so there would be around 84 people in the shop at once. That's about 21 households, fewer than the weekday peak, but still more people in the shop! :-)
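
The arithmetic behind these three assumptions can be written down as a small sketch, so that the same sums can be redone for year 2 and year 3. The percentages and household counts are the ones from the list above; the "one fruit per person per day" rate is my own assumption (it is what makes the 600 figure work out), and all the names are just illustrative.

```python
HOUSEHOLDS_BY_YEAR = {1: 100, 2: 150, 3: 300}  # planned growth from the first assumption
PEOPLE_PER_HOUSEHOLD = 4
PASSING_SHARE = 0.50             # households that shop with us
RESTOCK_INTERVAL_DAYS = 3        # fruits are bought every 3rd day
FRUITS_PER_PERSON_PER_DAY = 1    # assumption: implied by the 600-fruit estimate
WEEKDAY_PEAK_SHARE = 0.25        # households arriving in the morning (or evening) rush
WEEKEND_TURNOUT_SHARE = 0.70     # households turning up with their families on a weekend
WEEKEND_AT_ONCE_SHARE = 0.30     # share of weekend visitors present at the same time


def shop_volumes(households: int) -> dict:
    """Work out the volumetric estimates for a given number of households."""
    shoppers_per_cycle = households * PASSING_SHARE
    fruits_per_cycle = (shoppers_per_cycle * PEOPLE_PER_HOUSEHOLD
                        * FRUITS_PER_PERSON_PER_DAY * RESTOCK_INTERVAL_DAYS)
    weekday_peak_shoppers = households * WEEKDAY_PEAK_SHARE
    weekend_people = households * WEEKEND_TURNOUT_SHARE * PEOPLE_PER_HOUSEHOLD
    weekend_people_at_once = weekend_people * WEEKEND_AT_ONCE_SHARE
    estimates = {
        "shoppers_per_cycle": shoppers_per_cycle,
        "fruits_per_cycle": fruits_per_cycle,
        "weekday_peak_shoppers": weekday_peak_shoppers,
        "weekend_people": weekend_people,
        "weekend_people_at_once": weekend_people_at_once,
    }
    # Round to whole people / fruits; these are estimates, not exact counts.
    return {name: round(value) for name, value in estimates.items()}


for year, households in HOUSEHOLDS_BY_YEAR.items():
    print(year, shop_volumes(households))
```

For year 1 this reproduces the figures above (50 shoppers, 600 fruits, a weekday peak of 25 and 84 people at once on a weekend); years 2 and 3 simply scale with the household count, which is itself an assumption worth checking once the shop is open.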

Well, we made these assumptions; but only once we set up shop will we know the true pattern. And to take note of the true pattern, we need to "monitor".

It's easy to monitor as a shopkeeper: keep a pen and paper and keep noting the times when people come in and go out and what they buy, so that the quantities & varieties of fruits can be adjusted over days, months and years as the customer base grows. Over time, the shopkeeper learns more and more, builds a good judgement about the behaviour, likes and dislikes of customers, and is able to predict & judge how much he will need to stock when he opens a new shop or begins operating in a new town. Well, if a simple shopkeeper "capacity plans" in this way, why couldn't an IT system be capacity planned & managed using the same principles?

In its simplest form, this type of monitoring of users' usage of an IT system is done by writing database queries that run overnight in quiet times, fetching the timings of various user actions, and then doing a post-analysis to learn what users are actually doing. However, whilst a shopkeeper keeps track of which customers bought (and/or asked for) which fruits to determine how many fruits to stock, an IT system needs to convert the user transaction data into units of computational resources used; i.e. understand how users' requests consume computational resources like CPU processing power, RAM, paging, network bandwidth, storage space and so on, to be able to estimate how many IT resources will be required to support that user requirement.
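
As a rough illustration of such an overnight query, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The `user_actions` table, its columns, the sample rows and the CPU cost per action are all hypothetical; a real system would have its own audit or transaction tables, and the per-action cost would have to be measured rather than guessed.

```python
import sqlite3

# Hypothetical CPU cost of each action, in milliseconds (would be measured in practice).
CPU_MS_PER_ACTION = {"check_balance": 50, "transfer": 200}


def load_sample(conn: sqlite3.Connection) -> None:
    """Create a made-up table of user actions with the time each one started."""
    conn.execute("CREATE TABLE user_actions (action TEXT, started_at TEXT)")
    rows = [
        ("check_balance", "2010-12-27 09:05:00"),
        ("check_balance", "2010-12-27 09:40:00"),
        ("transfer", "2010-12-27 09:45:00"),
        ("check_balance", "2010-12-27 18:10:00"),
    ]
    conn.executemany("INSERT INTO user_actions VALUES (?, ?)", rows)


def actions_per_hour(conn: sqlite3.Connection) -> list:
    """First level of monitoring: what users did, counted per hour of the day."""
    query = """
        SELECT strftime('%H', started_at) AS hour, action, COUNT(*)
        FROM user_actions
        GROUP BY hour, action
        ORDER BY hour, action
    """
    return conn.execute(query).fetchall()


def cpu_ms_per_hour(rows: list) -> dict:
    """Second level: convert the action counts into CPU milliseconds used per hour."""
    totals = {}
    for hour, action, count in rows:
        totals[hour] = totals.get(hour, 0) + count * CPU_MS_PER_ACTION[action]
    return totals


conn = sqlite3.connect(":memory:")
load_sample(conn)
usage = actions_per_hour(conn)
print(usage)
print(cpu_ms_per_hour(usage))
```

The same idea extends to RAM, network bandwidth or storage: count what users did, then multiply by what each action costs.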

So, monitoring is required at various levels:

  • the first level is to know what users are doing, and how this aligns with the assumptions we made at the start when we opened the shop (i.e. website/application/service in IT terms);
  • the second level is to know how what users do impacts computational resource usage;
  • the third level is to know how our user base is growing, and to predict what that eventually means for our computational resource requirements.

In most real life IT scenarios; some of these three levels may be extremely mature and some could be missing. It's only when all three levels of monitoring are present that accurate and predictive capacity planning is possible.

One key point must not be missed: monitoring and tracking alone provide a view of reality and of how it aligns with the assumptions made. The moment a difference from assumptions or past trends is observed, it is reason enough to trigger a whole lot of questions to explain the causes behind the difference. This approach, when followed religiously, leads to amusing revelations about how systems work & where their bottlenecks are; but the key is to put the "appropriate" monitoring & tracking mechanisms in place upfront for all major IT systems, to measure & quantify variances between reality & assumption. For example, in our fruit seller example, if the seller did not note the number & types of fruits sold at various times of day and the choices of the people, and instead relied on noting the time when he ran out of his stock of apples or oranges each day, it would hardly help him stock enough and earn well (both the praise of the customers and money)! It is quite important to first make the right assumptions and then to measure the right things, in line with the assumptions made.
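
Quantifying the variance between assumption and reality can be as simple as the sketch below. The assumed figures come from the fruit shop estimates; the observed figures and the 20% tolerance are made up for illustration, and the function name is hypothetical.

```python
def variance_report(assumed: dict, observed: dict, tolerance_pct: float = 20.0) -> dict:
    """For each metric, return (deviation in %, whether it needs investigating)."""
    report = {}
    for metric, expected in assumed.items():
        actual = observed[metric]
        deviation_pct = (actual - expected) * 100 / expected
        needs_questions = abs(deviation_pct) > tolerance_pct
        report[metric] = (round(deviation_pct, 1), needs_questions)
    return report


# Assumptions made before opening the shop.
assumed = {"weekday_morning_shoppers": 25, "weekend_people_at_once": 84}
# Made-up observations from the shopkeeper's pen and paper.
observed = {"weekday_morning_shoppers": 40, "weekend_people_at_once": 80}

for metric, (deviation, flagged) in variance_report(assumed, observed).items():
    print(metric, f"{deviation:+}%", "investigate" if flagged else "ok")
```

Here the weekday morning rush is 60% above the assumption and gets flagged, while the weekend figure is within tolerance. The flag does not explain anything by itself; it only tells us where to start asking questions.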

Volumetric assumptions made at design stage & Monitoring in-life go hand in hand. Thus it is a very good idea to design the monitoring along with the system, during the design phase of an IT system; however this tends to happen only in a very mature performance engineering practice.

Friday 13 March 2009

Volumes!

As I read through my last blog on Performance Engineering just now, I realize that it was still not all that simplified anyway! I ended up using words and phrases like "capacity", "monitoring", "Volumes", "spikes of usage", "end user experience", "queuing principles", "queuing delay", "available"! Let me try to delve deeper into what these phrases mean.


In this edition let's focus on "Volumes".

When a software system is being built, it is designed with a specific set of end users in mind. Just as a fruit vendor decides that he will sell only apples and bananas, the designer of a software system builds the system to cater to the needs of a specific subset of end-users.

The fruit vendor has a choice: he could go door to door and sell, OR he could sit in a market and offer his fruits while customers come to buy them from him. Similarly, the software system could be a single-user application like Wordpad or Notepad, i.e. an application used by a single user at a time; OR it could be one that can be accessed by multiple users at once, for example a chat application or a website.


The fruit vendor could decide that he wanted to sell 20 kgs of fruits each day. Similarly, when we define volumes for a system, in a software performance engineering context, we really aim to define the number of end users that will use the software application.


The fruit vendor would typically know that most people buy fruits from him in the market while they are going to work, during lunch time, or later in the evening when they are returning home. Similarly, we define the usage pattern of end users. For example, if there are 10 users in a small office who will access the "attendance" system, it is very clear that the system will receive most of its usage when users come into the office in the morning and when they go home in the evening. Through the rest of the day the system is "idle".

Today, when everything is web based and the world is a global village in which 2 users of the same application sit in different corners of the world, it is imperative that usage is understood from a global perspective. Now let's suppose that this small office operates from 2 locations: 5 employees in the UK and 5 employees in India. Time zones come into the picture here, and the access pattern will look different: Indian employees will access the attendance system in the morning and evening as per India time, and employees in the UK will do likewise in UK time, thereby making the overall usage pattern quite different.
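
To see how the two offices change the pattern, the sketch below places both sets of peaks on a single UTC clock, which is roughly how a server would see them. The 09:00 arrival and 17:30 departure times are my own assumption, and the UK is taken at GMT (winter time) to keep the example simple; India is UTC+5:30.

```python
from datetime import date, datetime, time, timedelta, timezone

# Fixed offsets: the UK on GMT (winter time, an assumption) and India on IST.
UK = timezone(timedelta(hours=0), "GMT")
INDIA = timezone(timedelta(hours=5, minutes=30), "IST")

OFFICES = {"UK": (UK, 5), "India": (INDIA, 5)}   # office -> (time zone, employees)
LOCAL_PEAKS = [time(9, 0), time(17, 30)]         # assumed arrival and departure times


def peaks_in_utc(day: date) -> list:
    """Return (UTC time, office, employees) for every local peak, in time order."""
    peaks = []
    for office, (tz, employees) in OFFICES.items():
        for local_peak in LOCAL_PEAKS:
            local_moment = datetime.combine(day, local_peak, tzinfo=tz)
            utc_moment = local_moment.astimezone(timezone.utc)
            peaks.append((utc_moment.strftime("%H:%M"), office, employees))
    return sorted(peaks)


for utc_time, office, employees in peaks_in_utc(date(2009, 3, 13)):
    print(utc_time, "UTC:", employees, "users from", office)
```

Instead of two peaks of 10 users each, the system now sees four smaller peaks of 5 users spread across the day, and a shorter "idle" window in which to run maintenance.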


If the fruit vendor aims to build a chain of shops that he owns, he needs to understand how many people in different parts of the town would like to have his fruits! So then, taking this analogy forward, we aim to understand how the end users of a software application will grow. Going back to our attendance system as an example, maybe the office is aiming to expand into other parts of the world with more employees. So, what are these plans as per estimates given by the business?


As the total user base grows, the total number of users accessing the system concurrently also grows. Obviously, if the fruit vendor became popular he could expect more people in his store at once! Usually these are directly proportional quantities/metrics.


So then, to summarize, when we define requirements for a software system, "Volumes" now form an important measure to quantify; and this measure can be defined in terms of
  • growing end-users year on year
  • number of users using the software system concurrently and
  • the pattern of peaks and troughs that is expected
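
One way to write such a "Volumes" requirement down is sketched below, using the attendance system as the example. Only the 10 users in year 1 come from the example above; the growth figures for years 2 and 3, the 50% concurrency share and the peak hours are invented for illustration, and the class itself is hypothetical. Concurrent users are taken as directly proportional to the user base, as described above.

```python
import math
from dataclasses import dataclass


@dataclass
class Volumes:
    users_by_year: dict        # growing end-users year on year
    concurrent_share: float    # share of users on the system at the busiest moment
    peak_hours: list           # expected peaks (local hours); the rest are troughs

    def concurrent_users(self, year: int) -> int:
        """Concurrent users, assumed directly proportional to the user base."""
        return math.ceil(self.users_by_year[year] * self.concurrent_share)


attendance_system = Volumes(
    users_by_year={1: 10, 2: 25, 3: 60},   # years 2 and 3 are invented growth plans
    concurrent_share=0.5,                  # assumption: half the office clocks in together
    peak_hours=[9, 18],
)

for year in attendance_system.users_by_year:
    print("year", year, "concurrent users:", attendance_system.concurrent_users(year))
```

Rounding up is a deliberate choice here: when planning capacity, it is usually safer to overestimate concurrency slightly than to underestimate it.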

Tuesday 24 February 2009

Performance Engineering!

More often than not, I am posed this question by so many people "What do you do?"
And I reply "I am a Performance Engineer"
"Oh, Software Engineer?" pat comes the reply.
"No, Performance Engineering is different" goes the response in my head; but I seldom respond like that.


Rather, I end up saying "Yes"; just to avoid having to answer some tougher questions; or rather to avoid having to think up tough answers to simple questions! The problem is not that the questions are tough; neither are the answers tough, the problem is how do I make the answers simple enough for a layman to understand?

So, let me start my Techie Insight by trying to do just that - explain "Performance Engineering" as easily and simply as possible. In fact, it should not be that tough because we practice it all the time in our day to day lives, without knowing it. Really? Indeed, let's check this out.

  1. When we see the level of sugar in the sugar can going down, we intuitively know that we have to go and refresh the contents otherwise we will end up with no sugar for tea one day.
  2. In fact, when we expect guests at home, we often check all the provisions at home to see we have enough to serve our guests well and that we will not run out of items at the wrong time.
  3. We are upset if we are unable to extend warm hospitality to our guests.
  4. When we go to the bus stop, we usually check which queue is moving faster and join that queue.
  5. If there is an accident on the road, we see some delay while driving past that part of the road, because some part of the road is blocked off for investigations.
  6. We don't like to wait too long at the check-out counter in a super mall.
  7. We don't like it when fewer check-out terminals are "on" during the busy evening time at the super mall and/or if the vendor at the terminal is gossiping instead of servicing customers!
What's happening in all of the examples above is "Performance Engineering". We are engineering a better outcome for ourselves by taking either proactive or reactive steps to known and unknown problems.
Let's elaborate on every example above a little more:
  1. In example 1 above, we are managing the "capacity" of the provisions at home; by constantly "monitoring" the usage of the sugar in the container.
  2. In example 2 above, we are aware of the "volumes of usage" of provisions at home; we are aware that there will be "spikes in usage" when guests arrive and "plan proactively" for such spikes of usage.
  3. In example 3 above, we are concerned about "end user experience"
  4. In example 4 above, we are applying "queuing principles" and "entering the fastest moving queue" in an attempt to reach the head of the queue "fast" and avoid "longer queuing delay"
  5. In example 5 above, we are using the part of the road that is not blocked off for maintenance or investigation i.e. we are using the part of the road that is "available"
  6. In example 6, we know how we feel when we need to wait long - so again we are aware of "queuing delay"
  7. In example 7, we intuitively know that during a "peak period" in a mall, all terminals should be ON to avoid queues and promote quick service.
Well, the same principles mentioned above apply in "software performance engineering" and "network performance".
How? We'll see that in my next few blogs!

Sunday 25 January 2009

Techie Insight!

I've always felt the yearning to share technical tid-bits from experience and research.
I admire people who blog on latest technology hot topics and build opinion about upcoming advancements.

This is my dashboard to share my bit of techie insight with the world!

Welcome!
I hope to provide some good technically focussed blogs on Performance Engineering here!
Stay tuned and watch this space!