Thursday, 29 March 2012

Strategies For Testing Data Warehouse Applications


Introduction:

There is an exponentially increasing cost associated with finding software defects later in the development lifecycle. In data warehousing, this is compounded by the additional business cost of using incorrect data to make critical business decisions. Given the importance of early detection of software defects, let's first review the general goals of testing an ETL application.

The following sections describe the common strategies used to test a data warehouse system:
Data completeness: 

Ensures that all expected data is loaded into the target table.

1. Compare record counts between source and target, and check for any rejected records (see the sketch after this list).
2. Verify that data is not truncated in any column of the target table.
3. Verify that only unique values are loaded into the target; no duplicate records should exist.
4. Apply boundary value analysis (e.g., only data from year >= 2008 should load into the target).
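
As an illustration, the count, duplicate and boundary checks above can be automated with SQL issued from a small test harness. The sketch below uses Python with sqlite3 as a stand-in for the warehouse's database driver; the table and column names (src_sales, tgt_sales, sale_id, date_sid) are hypothetical.

import sqlite3  # stand-in for the warehouse's real DB driver

conn = sqlite3.connect("warehouse.db")  # hypothetical test database
cur = conn.cursor()

# 1. Record counts must match between source and target.
src_count = cur.execute("SELECT COUNT(*) FROM src_sales").fetchone()[0]
tgt_count = cur.execute("SELECT COUNT(*) FROM tgt_sales").fetchone()[0]
assert src_count == tgt_count, f"rejected/missing rows: {src_count - tgt_count}"

# 3. No duplicates on the business key.
dupes = cur.execute(
    "SELECT sale_id, COUNT(*) FROM tgt_sales GROUP BY sale_id HAVING COUNT(*) > 1"
).fetchall()
assert not dupes, f"duplicate keys found: {dupes}"

# 4. Boundary check: only year >= 2008 should be present.
out_of_range = cur.execute(
    "SELECT COUNT(*) FROM tgt_sales WHERE date_sid < 2008"
).fetchone()[0]
assert out_of_range == 0, f"{out_of_range} rows violate the >= 2008 rule"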

Data Quality:

1. Number check: If a numeric column in the source carries a prefix (e.g., xx_30) but the target expects only the number (30), validate that the value is loaded without the prefix (xx_).

2. Date check: Dates must follow the agreed date format, and it should be the same across all records (e.g., the standard format yyyy-mm-dd).

3. Precision check: The precision of numeric values in the target table should match the specification.

Example: the source value 19.123456 may need to appear in the target as 19.123, or rounded to 20, depending on the rule.

4. Data check: Based on business logic, records that do not meet certain criteria should be filtered out.
Example: only records where date_sid >= 2008 and GLAccount != 'CM001' should be loaded into the target table.

5. Null check: Certain columns should contain null based on business requirements.
Example: the Termination Date column should be null unless the employee's Active Status column is "T" or "Deceased".

A sketch of how checks 2, 3 and 5 might be automated appears below.
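
Continuing the illustration, here is a minimal Python/sqlite3 sketch of the date, precision and null checks; the table and column names (tgt_employee, hire_date, amount, termination_date, active_status) are hypothetical.

import re
import sqlite3

conn = sqlite3.connect("warehouse.db")  # stand-in for the warehouse connection
cur = conn.cursor()

# 2. Date check: every record must use the yyyy-mm-dd format.
date_pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")
bad_dates = [d for (d,) in cur.execute("SELECT hire_date FROM tgt_employee")
             if not date_pattern.match(d or "")]
assert not bad_dates, f"non-conforming dates: {bad_dates[:5]}"

# 3. Precision check: amounts should carry at most 3 decimal places.
bad_precision = cur.execute(
    "SELECT amount FROM tgt_employee WHERE ROUND(amount, 3) != amount"
).fetchall()
assert not bad_precision, f"precision violations: {bad_precision[:5]}"

# 5. Null check: termination_date must be null unless status is T/Deceased.
violations = cur.execute(
    "SELECT COUNT(*) FROM tgt_employee "
    "WHERE termination_date IS NOT NULL "
    "AND active_status NOT IN ('T', 'Deceased')"
).fetchone()[0]
assert violations == 0, f"{violations} rows break the null rule"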

Note: Data cleanness rules are decided during the design phase only.

Data cleanness:

Unnecessary columns should be deleted before loading into the staging area. The examples below are illustrated in the sketch that follows them.

1. Example: If a name column carries extra spaces, the spaces must be trimmed before loading into the staging area; in the mapping, an Expression transformation performs the trim.

2. Example: If the telephone number and the STD code arrive in separate columns but the requirement calls for a single column, an Expression transformation concatenates the two values into one column.
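
For illustration, here is a minimal Python sketch of the same two cleanups performed outside the ETL tool; the column names (name, std_code, telephone, phone) are hypothetical.

# Hypothetical pre-staging cleanup, mirroring what an Expression
# transformation would do inside the ETL tool.
def clean_row(row: dict) -> dict:
    cleaned = dict(row)
    # 1. Trim stray whitespace from the name column.
    cleaned["name"] = (row.get("name") or "").strip()
    # 2. Concatenate STD code and telephone number into one column.
    cleaned["phone"] = f'{row.get("std_code", "")}-{row.get("telephone", "")}'
    cleaned.pop("std_code", None)
    cleaned.pop("telephone", None)
    return cleaned

print(clean_row({"name": "  John  ", "std_code": "044", "telephone": "2345678"}))
# -> {'name': 'John', 'phone': '044-2345678'}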

Data Transformation: Verify that all business logic implemented through the ETL transformations is correctly reflected in the target data, as illustrated in the sketch below.
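
One common technique is to re-derive the transformed value from the source with SQL and compare it with what the ETL actually loaded. This sketch assumes a made-up rule (full_name = first_name || ' ' || last_name) and hypothetical tables (src_employee, tgt_employee):

import sqlite3

conn = sqlite3.connect("warehouse.db")  # hypothetical test database
cur = conn.cursor()

# Re-derive the transformed value from the source and compare it with the
# target, row by row; the full_name rule here is purely illustrative.
mismatches = cur.execute(
    """
    SELECT s.emp_id
    FROM src_employee s
    JOIN tgt_employee t ON t.emp_id = s.emp_id
    WHERE t.full_name != s.first_name || ' ' || s.last_name
    """
).fetchall()
assert not mismatches, f"transformation mismatches: {mismatches[:5]}"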

Integration testing:

Ensures that the ETL process functions well with other upstream and downstream processes.

Example:
1. Downstream: Suppose you change the precision of a column in one transformation. If "EMPNO" is a column whose data type has size 16, that data type precision should be the same in every transformation where the "EMPNO" column is used.

2. Upstream: If the source is SAP BW, extraction relies on ABAP code that acts as the interface between SAP BW and the mapping. Whenever an existing mapping whose source is SAP BW is modified, the ABAP code must be regenerated in the ETL tool (Informatica); if we don't, wrong data will be extracted because the ABAP code is out of date.

User-acceptance testing:

Ensures the solution meets users’ current expectations and anticipates their future expectations.
Example: Make sure that no values are hardcoded in the code.

Regression testing:

Ensures existing functionality remains intact each time a new release of code is completed.

Conclusion:

Taking these considerations into account during the design and testing portions of building a data warehouse will ensure that a quality product is produced and prevent costly mistakes from being discovered in production.

Thursday, 15 March 2012

Automation Tool Selection Recommendation


  • Overview
  • Information Gathering
  • Tools and Vendors
  • Evaluation Criteria
  • Tools Evaluation
  • Matrix
  • Conclusion
Overview
“Automated Testing” means automating the manual testing process currently in use. This requires that a formalized “manual testing process” currently exists in the company or organization. Minimally, such a process includes:

–        Detailed test cases, including predictable “expected results”, which have been developed from Business Functional Specifications and Design documentation.

–        A standalone Test Environment, including a Test Database that is restorable to a known constant, such that the test cases are able to be repeated each time there are modifications made to the application.

Information Gathering

The following are sample questions asked of testers who have been using some of these testing tools:

How long have you been using this tool and are you basically happy with it?

How many copies/licenses do you have and what hardware and software platforms are you using?

How did you evaluate and decide on this tool and which other tools did you consider before purchasing this tool?

How does the tool perform and are there any bottlenecks?

What is your impression of the vendor (commercial professionalism, on-going level of support, documentation and training)?

Tools and Vendors
  • Robot – Rational Software
  • WinRunner 7 – Mercury
  • QA Run 4.7 – Compuware
  • Visual Test – Rational Software
  • Silk Test – Segue
  • QA Wizard – Seapine Software
Tools Overview

Robot – Rational Software

–        IBM Rational Robot v2003 automates regression, functional and configuration testing for e-commerce, client/server and ERP applications. It is used to test applications constructed in a wide variety of IDEs and languages, and it ships with IBM Rational TestManager, which provides desktop management of all testing activities for all types of testing.

WinRunner 7 – Mercury

–        Mercury WinRunner is a powerful tool for enterprise wide functional and regression testing.

–        WinRunner captures, verifies, and replays user interactions automatically to identify defects and ensure that business processes work flawlessly upon deployment and remain reliable.

–        WinRunner allows you to reduce testing time by automating repetitive tasks and optimize testing efforts by covering diverse environments with a single testing tool.

QA Run 4.7 – Compuware

–        With QA Run, programmers get the automation capabilities they need to quickly and productively create and execute test scripts, verify tests and analyze test results.

–        Uses an object-oriented approach to automate test script generation, which can significantly increase the accuracy of testing in the time you have available.

Visual Test 6.5 – Rational Software

–        Based on the BASIC language and used to simulate user actions on a User Interface.

–        Is a powerful language providing support for pointers, remote procedure calls, working with advanced data types such as linked lists, open-ended hash tables, callback functions, and much more.

–        Includes a host of utilities: a tool for querying an application to determine how to access it with Visual Test, screen capture/comparison, a script executor, and a scenario recorder.

Silk Test – Segue

–        Is an automated tool for testing the functionality of enterprise applications in any environment.

–        Designed for ease of use, Silk Test includes a host of productivity-boosting features that let both novice and expert users create functional tests quickly, execute them automatically and analyze results accurately.

–        In addition to validating the full functionality of an application prior to its initial release, users can easily evaluate the impact of new enhancements on existing functionality by simply reusing existing test cases.

QA Wizard – Seapine Software

–        Completely automates the functional regression testing of your applications and Web sites.

–        It’s an intelligent object-based solution that provides data-driven testing support for multiple data sources.

–        Uses scripting language that includes all of the features of a modern structured language, including flow control, subroutines, constants, conditionals, variables, assignment statements, functions, and more.

Evaluation Criteria

  • Record and Playback
  • Object Mapping
  • Web Testing
  • Object Identity Tool
  • Environment Support
  • Extensible Language
  • Cost
  • Integration
  • Ease of Use
  • Image Testing
  • Database Tests
  • Test/Error Recovery
  • Data Functions
  • Object Tests
  • Support

Rating scale: 3 = Basic  2 = Good  1 = Excellent

Tool Selection Recommendation

Tool evaluation and selection is a project in its own right.

It can take between 2 and 6 weeks. It will need team members, a budget, goals and timescales.
There will also be people issues i.e. “politics”.

Start by looking at your current situation
– Identify your problems
– Explore alternative solutions
– Set realistic expectations for tool solutions
– Are you ready for tools?

Make a business case for the tool

–What are your current and future manual testing costs?
–What are initial and future automated testing costs?
–What return will you get on investment and when?

Identify candidate tools

– Identify constraints (economic, environmental, commercial, quality, political)
– Classify tool features into mandatory & desirable
– Evaluate features by asking questions to tool vendors
– Investigate tool experience by asking questions of other tool users
– Plan and schedule in-house demonstrations by vendors
– Make the decision

Choose a test tool that best fits the testing requirements of your organization or company.

An “Automated Testing Handbook” is available from the Software Testing Institute (www.ondaweb.com/sti), which covers all of the major considerations involved in choosing the right test tool for your purposes.

Wednesday, 7 March 2012

Performance Counters And Their Values For Performance Analysis


Performance Counters:
Performance counters are used to monitor system components such as processors, memory, network and I/O devices. Performance counters are organized and grouped into performance counter categories. For instance, the processor category contains all counters related to the operation of the processor, such as processor time, idle time and interrupt time. If performance counters are used in the application, they can publish performance-related data that can be compared against acceptable criteria.
The number of counter parameters to be considered by load testers and designers varies greatly with the type and size of the application under test. Some of the performance counters and the threshold values Hexaware uses for performance analysis are as follows:
Memory Counters:
Memory: Available MBytes – Describes the amount of physical RAM available to processes running on the system.
Threshold to watch for:
An Available MBytes value consistently below 20 to 25 percent of installed RAM is an indication of insufficient memory. Values below 100 MB may indicate memory pressure.
Note: This counter displays the last observed value only. It is not an average.
Memory: Pages/sec – Indicates the rate at which pages are read from or written to disk to resolve hard page faults.
Threshold to watch for:
Memory: Pages/sec consistently higher than 5 indicates a possible bottleneck.
Process: Private Bytes: _Total – Indicates the current allocation of memory that cannot be shared with other processes. This counter can be used to identify memory leaks in .NET applications.
Process: Working Set: _Total – The amount of physical memory being used by all processes combined. If the value for this counter is significantly below the value for Process: Private Bytes: _Total, it indicates that processes are paging too heavily. A difference of more than 10% is probably significant.
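
The native source for these counters is Windows PerfMon (e.g., via typeperf or the PDH API). As a rough, hedged illustration, the cross-platform psutil package can approximate the Available MBytes check in a few lines of Python:

import psutil  # approximation only; PerfMon is the authoritative source

mem = psutil.virtual_memory()
available_mb = mem.available / (1024 * 1024)
installed_mb = mem.total / (1024 * 1024)

# Flag the thresholds described above: less than 20-25% of installed RAM,
# or below the absolute floor of 100 MB.
if available_mb < 0.20 * installed_mb or available_mb < 100:
    print(f"WARNING: only {available_mb:.0f} MB available - memory pressure")
else:
    print(f"OK: {available_mb:.0f} MB of {installed_mb:.0f} MB available")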
Processor Counters:
% Processor Time: _Total instance – Percentage of elapsed time the CPU is busy executing a non-idle thread (an indicator of processor activity).
Threshold to watch for:
% Processor Time sustained at or over 85% may indicate that processor performance (for that load) is the limiting factor.
% Privileged Time – Percentage of time threads run in privileged mode (file or network I/O, or memory allocation).
Threshold to watch for:
% Privileged Time consistently over 75 percent indicates a bottleneck.
Processor Queue Length – The number of tasks ready to run but waiting for a processor.
Threshold to watch for:
Processor Queue Length greater than 2 indicates a bottleneck.
Note: High values may not necessarily be bad for % Processor Time. However, if other processor-related counters such as % Privileged Time or Processor Queue Length are increasing linearly, high CPU utilization may be worth investigating.
  • Less than 60% consumed = Healthy
  • 61% – 90% consumed = Monitor or Caution
  • 91% – 100% consumed = Critical or Out of Spec
System\Context Switches/sec – A context switch occurs when a higher priority thread preempts a lower priority thread that is currently running; a high rate can indicate that too many threads are competing for processor time. If processor utilization is low and very low levels of context switching are seen, it could indicate that threads are blocked.
Threshold to watch for:
As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, there is a constraint.
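
Again as a hedged sketch rather than the canonical PerfMon query, psutil can sample % Processor Time and approximate the context-switch rate from its cumulative counter:

import time
import psutil

# Sample % Processor Time over one second.
cpu_pct = psutil.cpu_percent(interval=1.0)

# Approximate System\Context Switches/sec from the cumulative counter.
before = psutil.cpu_stats().ctx_switches
time.sleep(1.0)
switches_per_sec = psutil.cpu_stats().ctx_switches - before
per_processor = switches_per_sec / psutil.cpu_count()

if cpu_pct >= 85:
    print(f"CPU sustained at {cpu_pct:.0f}% - possible processor bottleneck")
if per_processor > 15000:
    print(f"{per_processor:.0f} context switches/sec/processor - constraint")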
Disk Counters:
Physical Disk (instance)\Disk Transfers/sec
To monitor disk activity, we can use this counter. When the measurement goes above 25 disk I/Os per second, disk response time is poor (which may well translate into a potential bottleneck). To further uncover the root cause, we use the next counter.
Physical Disk (instance)\% Idle Time
This counter measures the percentage of time the hard disk is idle during the measurement interval. If it falls below 20%, read/write requests are likely queuing up for a disk that cannot service them in a timely fashion. In that case it is time to upgrade to faster disks or scale out the application to better handle the load.
Avg. Disk sec/Transfer - The number of seconds it takes to complete one disk I/O.
Avg. Disk sec/Read - The average time, in seconds, of a read of data from the disk.
Avg. Disk sec/Write - The average time, in seconds, of a write of data to the disk.
Less than 10 ms – very good
Between 10 – 20 ms – okay
Between 20 – 50 ms – slow, needs attention
Greater than 50 ms – serious I/O bottleneck
Note: The three counters listed above should consistently have values of approximately 0.020 (20 ms) or lower and should never exceed 0.050 (50 ms).
Source: Microsoft
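
As an illustrative approximation (the exact counter semantics vary by OS, and on Windows the PerfMon counters above remain authoritative), psutil's cumulative disk I/O counters can be turned into an average milliseconds-per-transfer figure and rated against the table above:

import time
import psutil

# Derive an approximate Avg. Disk sec/Transfer from cumulative counters.
a = psutil.disk_io_counters()
time.sleep(5.0)
b = psutil.disk_io_counters()

transfers = (b.read_count + b.write_count) - (a.read_count + a.write_count)
busy_ms = (b.read_time + b.write_time) - (a.read_time + a.write_time)

if transfers:
    ms_per_transfer = busy_ms / transfers
    # Apply the rating table above.
    if ms_per_transfer < 10:
        rating = "very good"
    elif ms_per_transfer <= 20:
        rating = "okay"
    elif ms_per_transfer <= 50:
        rating = "slow, needs attention"
    else:
        rating = "serious I/O bottleneck"
    print(f"{ms_per_transfer:.1f} ms/transfer - {rating}")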
Network Counters:
Network Interface: Output Queue Length – The number of packets queued waiting to be sent. A sustained average of more than two packets in the queue indicates a bottleneck that needs to be resolved.
Threshold to watch for:
If greater than 3 for 15 minutes or more, NIC (Network Interface Card) is bottleneck.
Network Segment: %Network Utilization - % of network bandwidth in use on this segment.
Threshold to watch for:
For Ethernet networks, if the value is consistently above 50%–70%, this segment is becoming a bottleneck.
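
%Network Utilization can be approximated by comparing the observed byte rate with the link speed. Here is a hedged psutil sketch; the interface name "eth0" is a placeholder, so pick yours from psutil.net_if_stats():

import time
import psutil

NIC = "eth0"  # placeholder interface name

speed_mbps = psutil.net_if_stats()[NIC].speed  # link speed in Mbit/s (0 if unknown)
a = psutil.net_io_counters(pernic=True)[NIC]
time.sleep(5.0)
b = psutil.net_io_counters(pernic=True)[NIC]

bits = ((b.bytes_sent + b.bytes_recv) - (a.bytes_sent + a.bytes_recv)) * 8
if speed_mbps:
    utilization = 100 * bits / (speed_mbps * 1_000_000 * 5.0)
    if utilization > 50:
        print(f"{utilization:.1f}% utilization - segment may become a bottleneck")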
Conclusion: These values may not depict exact threshold limits, but they provide a useful baseline to consider during performance analysis.

LoadRunner Runtime Settings – Multithreading Options


Performance testers are confronted with a classic dilemma when they execute a script in LoadRunner: should the Vuser run as a thread or as a process?

1.1  Difference between a thread and a process 

A Process

  • Let us consider a process as an independent entity or unit that has an exclusive virtual address space for itself.
  • A process can interact with another process only through IPC (inter-process communication). More than one process can run at any given time, but no two processes share the same memory address space.
E.g., when we open an application, say Notepad, in Windows, a notepad.exe process appears in Task Manager under the Processes tab. If we open another Notepad, a new notepad.exe process appears. Each process has its own virtual address space.

A Thread

  • Threads are contained inside a process. More than one thread can exist within the same process and can share the memory space between them.
  • The advantage here is that multiple threads can share the same memory space; when one thread is idle, another thread can utilize the resource, so a faster execution rate is achieved (the sketch after this list illustrates the difference between the two models).
  • A memory space can be accessed by another thread if one thread remains idle for a long time.
  • Threads can also access common data structures if required.
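
The distinction can be demonstrated in a few lines of Python, used here purely as an illustration (LoadRunner itself drives Vusers through its own driver process): a thread sees and updates the parent's memory, while a child process works on its own copy.

from multiprocessing import Process
from threading import Thread

counter = 0  # module-level variable in the parent's address space

def bump():
    global counter
    counter += 1

if __name__ == "__main__":
    # A thread shares the parent's address space: the update is visible.
    t = Thread(target=bump)
    t.start()
    t.join()
    print("after thread:", counter)   # -> 1

    # A child process gets its own copy: the parent's value is unchanged.
    p = Process(target=bump)
    p.start()
    p.join()
    print("after process:", counter)  # -> still 1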

1.2  Multithreading

While defining the runtime settings in LoadRunner, we have to choose between running the Vuser as a thread or a process. “The Controller uses a driver program (such as mdrv.exe or r3Vuser.exe) to run your Vusers. If you run each Vuser as a process, then the same driver program is launched (and loaded) into the memory again and again for every instance of the Vuser.” – LoadRunner User Guide. The driver program mentioned is nothing but a process that runs when we generate a Vuser load.

Runtime Settings


1.3  Run Vuser as a process – Disable Multithreading

  • If we choose the first option and run 'n' Vusers as processes, we will see 'n' mmdrv.exe processes running on the load generator machine. Each of these processes consumes its own memory space.
  • When this option is selected, each Vuser process establishes at least one connection with the web/app server.

1.4  Run Vuser as a thread – Enable Multithreading

  • We can choose to run Vusers as threads if we want to go easy on resources. This way, more Vusers can be generated with the same amount of available load generator memory.
  • When this option is selected, the Vuser threads can share open connections between them (connection pooling). Opening and maintaining a connection for each Vuser process is resource consuming, and with connection pooling the time a user must wait to establish a connection to the database is also reduced. This is surely an advantage, right? Not necessarily. The argument is that this is not an accurate replication of the user load: in a real scenario a single connection is created for each user, and to achieve this we have to run each Vuser as a process. There are other factors, such as thread safety, to consider. When we run a large number of Vusers as a single multithreaded process, the Vusers run as threads that share the same memory locations, so one thread may interfere with or modify the data elements of another, posing serious thread safety concerns. Before selecting either option, we need to assess the load generator's capacity (available system resources and memory space) as well as the thread safety of the protocols used.

Common Problems & Solutions For Performance Testing Flex Applications Using LoadRunner


This article lists the common problems & solutions that performance engineers come across when testing flex applications.

Problem #1 : An overlapped transmission error occurs when a Flex script is run for the first time from the Controller, but the same script works fine in VuGen.

Error -27740: Overlapped transmission of request to “www.url.com” for URL“http://www.url.com/ProdApp/” failed: WSA_IO_PENDING.

Solution : The transmission of data to the server failed; it could be a network, router, or server problem. The word "Overlapped" refers to the way LoadRunner sends data in order to get a Web Page Breakdown. To resolve this problem, add the following statement at the beginning of the script to disable the breakdown of the "First Buffer" into server and network time.

web_set_sockets_option("OVERLAPPED_SEND", "0");


Problem #2 : During script replay, VuGen crashes with an mmdrv error: "mmdrv has encountered a problem and needs to close." Additional details: "mmdrv.exe caused a Microsoft C++ Exception in module kernel32.dll at 001B:7C81EB33, RaiseException() +0082 byte(s)".

Solution : The cause of this issue is unknown. HP released a patch that can be downloaded from their site.

Problem #3 : AMF error: Failed to find request and response

Solution : The LoadRunner web protocol has a mechanism to prevent large request bodies from appearing in the action files by keeping the body in lrw_custom_body.h. In the AMF and Flex protocols, LoadRunner cannot handle these values and fails to generate the steps. Follow these steps to fix the problem:

1. Go to the “generation log”
2. Search for the highest value for “Content-Length”
3. Go to <LoadRunner installation folder>/config
4. Open vugen.ini
5. Add the following:
[WebRecorder]
BodySize=<size found in step (2)>
6. Regenerate the script

Problem #4 : There are duplicate AMF calls in the recording log as well as in the generated code.

Solution : The capture level may be set to "Socket and WinInet". Under Recording Options –> Network –> Port Mapping, make sure the capture level is set to WinInet (only).

Problem #5 : For a Flex script that contains Flex_AMF and Flex_RTMP calls, replay shows a mismatch in the tree view between the request and the response. The replay log shows that the correct calls are being made, but they are displayed incorrectly in the tree view (only the tree-view replay is wrong). Sometimes the previous or next Flex_AMF call is shown in place of the Flex_RTMP call.

Solution : This issue has been identified by R&D as a bug in LR 9.51 and LR 9.52. R&D issued a new flexreplay.dll that resolves the issue; it will be included in the next service pack.

Problem #6 : Flex protocol script fails with “Error: Encoding of AMF message failed” or “Error: Decoding of AMF message failed”

Solution : The cause of this error is the presence of special characters (&gt;, &lt;, &amp;, etc.) in the Flex request. Send the request enclosed in CDATA. Example: change <firmName>XXXXXXX &amp; CO. INC.</firmName> in the script to <firmName><![CDATA[XXXXXXXXXXXX &amp; CO. INC.]]></firmName>

Problem #7 : When creating a multi-protocol script that contains the FLEX and WEB protocols, VuGen sometimes closes automatically without any warning or error message. This happens when the Web protocol is set to HTML mode; in URL mode the crash does not occur. There is no error code, only a generic Windows message stating that VuGen needs to close.

Solution : This issue can be seen on machines running Windows XP and using Mfc80.dll. Refer to the Microsoft KB article at the link below, which provides a solution; Microsoft released a hotfix for the Windows-specific issue that can cause VuGen to close.
http://support.microsoft.com/kb/961894

Problem #8 : When recording a FLEX script, RTMP calls are not being captured correctly so the corresponding FLEX_RTMP_Connect functions are not generated in the script.

Solution : First set the port mapping (Recording Options –> Network –> Port Mapping –> set Capture Level to 'Socket level and WinINet level data'). If this doesn't help, record a FLEX + Winsock script: in the Port Mapping section, set the Send-Receive buffer size threshold to 1500 under Options; create a new entry and select Service ID SOCKET; enter the Port (such as 2037, or whatever port the FLEX application uses for its connection); set Connection Type to Plain and Record Type to Proxy; Target Server can keep the default value (Any Server).

Problem #9 : Replaying a Flex script containing a flex_rtmp_send() that has an XML argument string may result in the mmdrv process crashing with a failure in a Microsoft library.

Solution : The VuGen script generation functionality does not handle the XML parameter string within the function correctly, which causes the mmdrv process to crash during replay. On version 9.51, installing a specific patch (flex9.51rup.zip) or Service Pack 2 resolves the problem.

Problem #10 : During the test executions in controller, sometimes the scripts throw an error ‘Decoding of AMF message failed. Error is: Externalizable parsing failed’.

Solution : This is mostly due to a file transfer problem. It is advisable to place the JAR files in a shared path common to all load agents.

Other Flex Supported Load Testing Tools


There are other commercial and open-source tools available that support Flex application testing. Some tools (for example, NeoLoad) have considerably better RTMP support than LoadRunner. The way these tools test a Flex application is quite similar: each tool has its own AMF/XML conversion engine, which serializes the binary data into a readable XML format.
Open Source
  • Data Services Stress Testing Framework
  • JMeter
Commercial Tools
  • Silk Performer by Borland
  • NeoLoad by Neotys
  • WebLOAD by RadView
Performance Improvement Recommendations


When it comes to performance improvement of an application, our first concern would be to enhance the scalability for a specified hardware & software configuration.
  • In the case of Flex, scalability issues derive from the fact that BlazeDS is deployed in a conventional Java servlet container, so the performance and scalability of BlazeDS depend on the number of concurrent connections supported by the server (Tomcat, WebSphere, WebLogic, etc.). BlazeDS runs in a servlet container, which maintains a thread pool.
  • Each thread is assigned to a client request and returns to the reusable pool after the request is processed. When a client request holds a thread for a long time, that thread stays locked to the client until the request completes. So the number of concurrent users BlazeDS supports depends on the number of threads the servlet container can hold.
  • While BlazeDS is preconfigured with just 10 simultaneous connections, this can be increased to several hundred; the actual number depends on the server's threading configuration, the CPU and the size of the JVM heap. It is also affected by the number of messages the server processes per unit of time and the size of those messages.
  • Tomcat or WebSphere can support up to several hundred users; with a servlet container that supports Servlet 3.0, BlazeDS can be used in more demanding applications that require support for thousands of concurrent users.
Based on our project experience in performance testing Flex applications with LoadRunner, we have pointed out some of the common problems that might arise, along with solutions to troubleshoot the errors if they occur; this should save you a lot of time.
