Friday, 24 August 2012

Emerging DB Technology – Columnar Database


Today’s Top Data-Management Challenge:

Businesses today are challenged by the ongoing explosion of data. Gartner predicts data growth will exceed 650% over the next five years. Organizations capture, track, analyze and store everything from mass quantities of transactional, online and mobile data to growing amounts of machine-generated data. In fact, machine-generated data, with sources ranging from web, telecom network and call-detail records to online gaming, social networks, sensors, computer logs, satellites, financial transaction feeds and more, represents the fastest-growing category of Big Data. High-volume websites can generate billions of data entries every month.

As volumes expand into the tens of terabytes and even the petabyte range, IT departments are being pushed by end users to provide enhanced analytics and reporting against these ever-increasing volumes of data. Managers need to be able to understand this information quickly, but, all too often, extracting useful intelligence can be like finding the proverbial 'needle in a haystack'.

How do columnar databases work?

The defining concept of a column-store is that the values of a table are stored contiguously by column. Thus the classic supplier table from the suppliers-and-parts database would be stored on disk or in memory something like:

    S1 S2 S3 S4 S5
    20 10 30 20 30
    London Paris Paris London Athens
    Smith Jones Blake Clark Adams



This is in contrast to a traditional row-store which would store the data more like this:
    S1 20 London Smith
    S2 10 Paris Jones
    S3 30 Paris Blake
    S4 20 London Clark
    S5 30 Athens Adams
From this simple concept flows all of the fundamental differences in performance, for better or worse, between a column-store and a row-store. For example, a column-store will excel at aggregations like totals and averages, but inserting a single row can be expensive, while the inverse holds true for row-stores. This should be apparent from the layouts above.
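To make the trade-off concrete, here is a minimal Python sketch (illustrative only; a real column-store works with compressed disk pages rather than in-memory lists) showing the two layouts and their access patterns:

    # The same supplier table stored row-wise and column-wise.
    row_store = [
        ("S1", 20, "London", "Smith"),
        ("S2", 10, "Paris",  "Jones"),
        ("S3", 30, "Paris",  "Blake"),
        ("S4", 20, "London", "Clark"),
        ("S5", 30, "Athens", "Adams"),
    ]

    column_store = {
        "sno":    ["S1", "S2", "S3", "S4", "S5"],
        "status": [20, 10, 30, 20, 30],
        "city":   ["London", "Paris", "Paris", "London", "Athens"],
        "sname":  ["Smith", "Jones", "Blake", "Clark", "Adams"],
    }

    # Aggregation: the column-store reads one contiguous sequence...
    avg_status = sum(column_store["status"]) / len(column_store["status"])

    # ...while the row-store must touch every row to extract one field.
    avg_status_rows = sum(r[1] for r in row_store) / len(row_store)

    # Insertion: the row-store appends once; the column-store must append
    # to every column, which is why single-row inserts are expensive.
    row_store.append(("S6", 40, "Rome", "Verdi"))
    for col, val in zip(("sno", "status", "city", "sname"),
                        ("S6", 40, "Rome", "Verdi")):
        column_store[col].append(val)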

The Ubiquity of Thinking in Rows:

Organizing data in rows has been the standard approach for so long that it can seem like the only way to do it. An address list, a customer roster, inventory information: you can just envision the neat rows of fields and data going from left to right on your screen.

Databases such as Oracle, MS SQL Server, DB2 and MySQL are the best known row-based databases.
Row-based databases are ubiquitous because so many of our most important business systems are transactional.
Example data set: see below the contents of a data set of 20 columns × 50 million rows.


[Image: Example Data Set]
Row-oriented databases are well suited for transactional environments, such as a call center where a customer’s entire record is required when their profile is retrieved and/or when fields are frequently updated.

Other examples include:
• Mail merging and customized emails
• Inventory transactions
• Billing and invoicing

Where row-based databases run into trouble is when they are used to handle analytic loads against large volumes of data, especially when user queries are dynamic and ad hoc.

To see why, let's look at a database of sales transactions with 50 days of data and 1 million rows per day. Each row has 30 columns of data, so this database has 30 columns and 50 million rows. Say you want to see how many toasters were sold in the third week of this period. A row-based database would return 7 million rows (1 million for each day of the third week) with 30 columns for each row, or 210 million data elements. That is a lot of data elements to crunch just to find out how many toasters were sold that week. As the data set increases in size, disk I/O becomes a substantial limiting factor, since a row-oriented design forces the database to retrieve all column data for any query.
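The arithmetic behind that 210-million figure, as a quick sketch:

    # Data elements a row-store must touch for the one-week toaster query.
    rows_per_day = 1_000_000
    days_in_week = 7
    columns = 30

    elements_scanned = rows_per_day * days_in_week * columns
    print(f"{elements_scanned:,}")  # 210,000,000 elements for a 2-column question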

As we mentioned above, many companies try to solve this I/O problem by creating indices to optimize queries. This may work for routine reports (e.g. you always want to know how many toasters you sold in the third week of a reporting period), but there is a point of diminishing returns: load speed degrades because indices need to be recreated as data is added. In addition, users are severely limited in their ability to run quick ad-hoc queries (e.g. how many toasters did we sell through our first Groupon offer? Should we do it again?) that cannot depend on indices to optimize results.


Pivoting Your Perspective: Columnar Technology

Column-oriented databases allow data to be stored column-by-column rather than row-by-row. This simple pivot in perspective—looking down rather than looking across—has profound implications for analytic speed. Column-oriented databases are better suited for analytics where, unlike transactions, only portions of each record are required. By grouping the data together this way, the database only needs to retrieve columns that are relevant to the query, greatly reducing the overall I/O.

Returning to the example in the section above, we see that a columnar database would not only eliminate 43 days of data, it would also eliminate 28 columns of data. Returning only the columns for toasters and units sold, the columnar database would return only 14 million data elements, or 93% less data. By returning so much less data, columnar databases are much faster than row-based databases when analyzing large data sets. In addition, some columnar databases (such as Infobright®) compress data at high rates because each column stores a single data type (as opposed to rows, which typically contain several data types), allowing compression to be optimized for each particular data type. Row-based databases mix multiple data types with a limitless range of values, making compression less efficient overall.
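For comparison, here is the columnar arithmetic, together with a generic run-length-encoding illustration (Infobright's actual compression scheme is proprietary; RLE is used here only to show why a single-typed column compresses well):

    from itertools import groupby

    # Columnar scan for the same toaster query: 7 days x 1 million rows,
    # but only the 2 relevant columns (product and units sold).
    elements_scanned = 7 * 1_000_000 * 2    # 14,000,000 vs. 210,000,000 row-store
    print(f"{1 - 14 / 210:.0%} less data")  # ~93% less data

    # Run-length encoding of a single-typed, low-cardinality column.
    city_column = ["London", "London", "Paris", "Paris", "Paris", "Athens"]
    rle = [(value, len(list(run))) for value, run in groupby(city_column)]
    print(rle)  # [('London', 2), ('Paris', 3), ('Athens', 1)]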

Thanks for reading this blog. View more: BI Analytics

Performance Center Best Practices


For performance testing we have started using HP Performance Center because of the many advantages it provides to the testing team. Listed below are some best practices to follow when using Performance Center.

Architecture – Best Practices

  • Hardware Considerations
    • CPU, Memory, Disk sized to match the role and usage levels
    • Redundancy added for growth accommodation and fault-tolerance
    • Never install multiple critical components on the same hardware
  • Network Considerations
    • Localization of all PC server traffic - Web to Database, Web to File Server, Web to Utility Server, Web to Controllers, Controller to Database, Controller to File Server, Controller to Utility Server.
    • Separation of operational and virtual user traffic – PC operational traffic should not share same network resources as virtual user traffic – for optimal network performance.
  • Backup and Recovery Considerations
    • Take periodic backups of the Oracle database and the file system (\\<fileserver>\LRFS)
    • Backups of PC servers and hosts are optional
  • Monitoring Considerations
    • Monitoring services (e.g. SiteScope) should be employed to manage the availability and responsiveness of each PC component

Configuration – Best Practice

  • Set the ASP upload buffer to the maximum size of a file that you will permit to be uploaded to the server.
    • Registry key: HKLM\SYSTEM\CurrentControlSet\Services\w3svc\Parameters
  • Modify MaxClientRequestBuffer under that key
    • Create it as a DWORD if it does not exist (see the sketch after this list)
    • e.g. 2097152 = 2 MB
  • Limit access to the PC File System (LRFS) for security
    • Performance Center User (IUSR_METRO) needs “Full Control”
  • We recommend two LoadTest web servers when
    • Running 3 or more concurrent runs
    • 10 or more users are viewing tests
  • Load balancing requires an external, web-session-based load balancer
  • In Internet Explorer, set “Check for newer versions of stored pages” to “Every visit to the page”
    • NOTE: This should be done on the client machines that are accessing the Performance Center web sites
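As a convenience, here is a minimal sketch of the MaxClientRequestBuffer change from the list above, using Python's standard winreg module. It assumes it is run as Administrator on the PC web server, and the 2 MB value is simply the example from the list, not a universal recommendation:

    import winreg

    # Create (or open) the w3svc Parameters key and write the DWORD.
    # 2097152 bytes = 2 MB, matching the example above.
    KEY_PATH = r"SYSTEM\CurrentControlSet\Services\w3svc\Parameters"

    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "MaxClientRequestBuffer", 0,
                          winreg.REG_DWORD, 2097152)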

Script Repository – Best Practice

  • Use VuGen integration for direct script upload
  • Ensure dependent files are included within the zip file
  • Re-configure scripts with optimal run-time settings (RTS)
  • Validate script execution on PC load generators
  • Establish a meaningful script naming convention
  • Clean up the script repository regularly

Monitor Profile – Best Practice

  • Avoid information overload
    • Min-Max principle – minimum metrics for maximum detection (a collector sketch follows this list)
  • Consult performance experts and developers for relevant metrics
    • Standard Process Metrics (CPU, Available Memory, Disk Read/Write Bytes, Network Bandwidth Utilization)
    • Response Times / Durations (Avg. Execution Time)
    • Rates and Frequencies (Gets/sec, Hard Parses/sec)
    • Queue Lengths (Requests Pending)
    • Finite Resource Consumption (JVM Available Heap Size, JDBC Pool’s Active Connections)
    • Error Frequency (Errors During Script Runtime, Errors/sec)
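As an illustration of the Min-Max principle, here is a minimal collector sketch using the third-party psutil package (illustrative only, not a replacement for SiteScope; the metric names are our own):

    import psutil  # third-party: pip install psutil

    # Minimum metrics for maximum detection: one number per resource class.
    def sample_metrics():
        disk = psutil.disk_io_counters()
        net = psutil.net_io_counters()
        return {
            "cpu_percent": psutil.cpu_percent(interval=1),
            "available_memory_mb": psutil.virtual_memory().available / 2**20,
            "disk_read_bytes": disk.read_bytes,
            "disk_write_bytes": disk.write_bytes,
            "network_bytes_sent": net.bytes_sent,
        }

    print(sample_metrics())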

Load Test – Best Practice

  • General
    • Create a new load test for any major change in scheduling logic or script types
    • Use versioning (by naming convention) to track changes
  • Scripts
    • When scripts are updated with new run-logic settings, remove and reinsert the updated script in the load test
  • Scheduling
    • Each ramp-up queries the Licensing (Utility) Server and the LRFS file system; do not ramp at intervals of less than 5 seconds
    • Configure the ramp-up quantity per interval to match the available load generators (see the sketch after this list)
    • Do not run (many, or ideally any) users on the Controller
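A small sketch of the ramp-up arithmetic referenced above (the counts are hypothetical; the 5-second floor comes from the guidance in the list):

    # Hypothetical ramp plan: spread vusers across generators and keep the
    # ramp interval at or above the 5-second floor recommended above.
    total_vusers    = 500
    load_generators = 5
    ramp_interval_s = 15                       # >= 5 seconds
    vusers_per_step = load_generators * 10     # e.g. 10 vusers per generator

    steps = -(-total_vusers // vusers_per_step)  # ceiling division
    print(f"{steps} steps, one every {ramp_interval_s}s "
          f"-> full load in {steps * ramp_interval_s}s")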

Timeslots – Best Practice

  • Scheduling
    • Always schedule time slots in advance of load test
    • Always schedule extra time (10-30 minutes) for large or critical load tests
    • Allow for gaps between scheduled test runs (in case of emergencies)
  • Host Selection
    • Use automatic host selection whenever possible
    • Reserve manual hosts only when specific hosts are needed (because of runtime configuration requirements)

Following the above practices will help you use Performance Center without issues and will save you considerable time by avoiding problems that arise when they are skipped.

Thanks for reading this blog. Want to know more? Visit: Performance Center Best Practices

Wednesday, 22 August 2012

Job: Peoplesoft Tester In Chennai

Title: Peoplesoft Tester
Categories: India
Grade: G4
Skill: Peoplesoft, HRMS Testing, Payroll
Start Date: 21-08-2012
Location: Chennai

Job Information

3-5 years of experience in ERP-related product testing.

Knowledge of the complete testing life cycle and different testing methodologies.

Min. 2-3 years of hands-on experience with PeopleSoft HRMS.

Min. 1 year of experience writing test scripts for the PS Payroll module.

Good knowledge of HP QC.

Strong analytical and troubleshooting skills.

Unit: 10

Apply Now

Friday, 10 August 2012

Short-term contracts give mid-cap IT cos new lease of life

With the duration of outsourcing deals getting shorter, deals worth nearly USD 85 billion are up for renegotiations this year, reports CNBC-TV18’s Shreya Roy.

Shreya Roy, Reporter, CNBC TV18

Midcap IT players may get a new lease of life.

Over the last few years, uncertain times have forced IT companies to go in for more short-term contracts. For mid-cap IT companies, this may have been a blessing in disguise.

Data from outsourcing advisory firm TPI shows that around 700 contracts will be up for renegotiation this fiscal year, compared with 530 last year.

“There is a significant reduction in the tenure of contracts as they were originally signed. Compared to 10 years ago, when 500 of these were being done, there are 1000 a year. The tenure has gone down to five years instead of seven, so a lot of deals are naturally coming back to the market as renewals. In itself, this is a very large opportunity,” said Siddharth Pai, partner and MD at TPI India.

For many IT players, this may be just what the doctor ordered. After all, renewals account for almost 65% of the outsourcing market. Advisory firm Everest estimates that by October 2013, deals worth nearly USD 85 billion will be up for renewal.

These include a contract between HP and Bank of America, a mega deal from the Shell group which is currently with AT&T, HP and T-Systems, a Blue Cross Blue Shield deal with Dell, and Manulife's deal with IBM.

Many of these contracts are expected to be broken up into smaller chunks, as outsourcers look increasingly towards multi-sourcing. Analysts say this could work in favour of the smaller players, especially those like Mindtree and Hexaware, which have been focusing on developing niche capabilities to differentiate themselves from larger players.



Tuesday, 7 August 2012

Hexaware Technologies: Riding High! – Nirmal Bang

Riding High!
Hexaware Technologies Limited (HTL) is a mid-sized IT company mainly catering to the capital markets (BFSI) and airline (transportation) sectors. It also focuses on enterprise software from PeopleSoft and Oracle. Recent large client wins have brought focus back to this company, which has good expertise in these niche areas.


Investment Rationale

• Improved revenue visibility due to large wins in the past 5 quarters

The deal wins of over $625 mn that HTL has secured in the past 5 quarters are commendable. HTL's efforts at mining existing clients during the gloomy days are now paying off, reflected in the incremental revenue streams it has earned. These long-term deals give enough revenue visibility for CY12. In addition, HTL is negotiating almost four deals above $25 mn that are in the pipeline.

• Margins moving northwards – room for further improvement
EBITDA margins have improved 812 basis points over the past 5 quarters, led by drastic control of operating costs. The company has also used its offshorability lever to its advantage, moving almost 14% of work offshore during the same period. Currently the onsite:offshore mix stands at 53:47 and utilization is in the early 70s, and plans to hire freshers should further aid margins going forward. We expect HTL to report EBITDA margins of 20%+ in CY12E and CY13E.

• Proficiency in niche segments paying off
HTL earns 60% of its revenues from the capital markets and travel industries, and almost 30% of revenues come from enterprise solutions in terms of its service lines. Within enterprise solutions, 60-65% of revenues are from PeopleSoft, an area where other software vendors focus less.

• Guidance revised to 20% USD revenue growth for CY12E

On the back of the good deals won recently, the company has revised its revenue guidance in USD terms to 20%. We feel this is a little conservative, and the company can easily beat the guidance for CY12E.
Valuation & Recommendation

We expect HTL's revenues to grow at a CAGR of 25% and adjusted profits to grow at a CAGR of 21% over CY11-CY13E. Margin improvement will remain in focus, and we expect HTL's EBITDA margins to improve by 313 bps to 21.2% in CY13E from 18.03% in CY11. At CMP, the stock is trading at 10.4x and 8.6x earnings for CY12E and CY13E respectively. On the back of improved financials and good revenue visibility, we recommend a BUY on the stock, assigning a target multiple of 11x CY13E EPS with a price target of Rs. 147, a potential 28% upside.
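A quick sanity check of the quoted multiples (a sketch; the EPS figure is implied by the numbers above, not taken from the report):

    # Target = 11x CY13E EPS = Rs. 147, so implied EPS ~ 13.4;
    # at 8.6x the same EPS, CMP ~ Rs. 115, and 147/115 - 1 ~ 28% upside.
    target_multiple, current_multiple, target_price = 11.0, 8.6, 147.0

    implied_eps = target_price / target_multiple   # ~13.36
    implied_cmp = implied_eps * current_multiple   # ~114.9
    upside = target_price / implied_cmp - 1        # ~0.28
    print(round(implied_eps, 2), round(implied_cmp, 1), f"{upside:.0%}")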

Risks to our Rationale:

• Concentration in discretionary spending revenues

Hexaware derives more than 50% of its revenues from enterprise solutions and business intelligence and analytics, which could be affected in an economic downturn. However, the recent deal wins reaffirm the company's revenue visibility for CY12E.

• Industry risks of wage pressures, rupee appreciation and competition
Rupee depreciation has acted in favor of the company and the industry per se. Any severe reversal of the rupee trend would affect the firm's prospects.

• Exposure to the European region
The company has 28.4% exposure to the European region, and a few of its major deals have been signed with clients there. Given the current economic scenario in the Euro zone, any delay in the commencement of these deals, or outright cancellation, may impact margins severely.
