Monday, November 16, 2009

Things to Consider When Buying Data Warehouse Tools

Summary: Data warehousing tools support data warehouse functions at different stages. They are essentially software applications for the ETL (Extract, Transform, Load) process: they extract and transform data from operational systems and then load that data into the data warehouse, which in turn helps an organization's managers and users make business decisions. Because the ETL process involves very complex activities, it is important to employ the right data warehouse tools.
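The three ETL stages can be sketched in Python using the built-in sqlite3 module as a stand-in warehouse. The source rows, field names, and cleansing rules below are hypothetical, a minimal illustration rather than any particular tool's behavior:

```python
import sqlite3

# Hypothetical operational data; a real pipeline would read from source systems.
source_rows = [
    {"customer": "  Acme Corp ", "amount": "1200.50", "date": "2009-11-01"},
    {"customer": "Widgets Ltd", "amount": "n/a", "date": "2009-11-02"},  # bad amount
    {"customer": "Beta Inc", "amount": "315.00", "date": "2009-11-03"},
]

def extract():
    """Extract: pull raw records from the operational system."""
    return source_rows

def transform(rows):
    """Transform: cleanse and convert types, discarding unusable rows."""
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # skip records that fail cleansing
        clean.append((r["customer"].strip(), amount, r["date"]))
    return clean

def load(rows):
    """Load: write the cleansed rows into the warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn

warehouse = load(transform(extract()))
print(warehouse.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone())
# (2, 1515.5)
```

Note how the record that fails cleansing is dropped during the transform stage, so only clean rows ever reach the warehouse.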

Companies planning a data warehouse may buy data warehouse tools from third parties or develop their own, often engaging in-house programmers for the job. However, when the data transformation requirements are challenging, companies are better served by third-party tools. When buying data warehouse management tools, a few things should be kept in mind. Below are a few aspects that need your attention:

- Functional capability: The tools you choose should be able to handle both the transformation and the cleansing parts of a data warehousing project. If a tool has strong capability for both tasks, it is worth considering.

- Ability to read directly from the data source: A data warehouse gets its data from varied sources, and a tool that can read directly from each source makes processing faster and more efficient.

- Metadata support: A warehouse management tool must be able to handle metadata. This is very important because a data warehouse's metadata is used to map the source data to its destination.
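The mapping role of metadata can be illustrated with a small sketch; the field names and conversion rules below are invented for the example:

```python
# Hypothetical metadata: maps each source field to its warehouse destination
# column and target type. Real tools store richer metadata than this.
metadata = {
    "cust_nm": {"target": "customer_name", "type": str},
    "ord_amt": {"target": "order_amount", "type": float},
}

source_record = {"cust_nm": "Acme Corp", "ord_amt": "99.95", "legacy_flag": "Y"}

def map_record(record, metadata):
    """Use the metadata to route and convert each source field to its destination."""
    return {
        rule["target"]: rule["type"](record[field])
        for field, rule in metadata.items()
        if field in record
    }

print(map_record(source_record, metadata))
# {'customer_name': 'Acme Corp', 'order_amount': 99.95}
```

Fields with no metadata entry (here, `legacy_flag`) simply never reach the warehouse, which is one reason complete metadata matters.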

There are plenty of data warehouse management tool developers in the market. To make your search for ETL tools easier, here is some information about popular ETL tools:

- IBM WebSphere DataStage is an ETL tool formerly known as Ardent DataStage and Ascential DataStage. It is part of the IBM WebSphere Information Integration suite and the IBM Information Server, and its visual interface makes it easy to use. The tool is available in several versions, including the Server Edition and the Enterprise Edition.

- Business Objects is a French company known for its enterprise software products. Its Data Integrator, a data integration and ETL tool previously known as Acta, is a popular product. The tool features the Data Integrator Job Server and the Data Integrator Designer.

- Ab Initio is software developed by Ab Initio Software Corporation: a fourth-generation, GUI-based, parallel-processing ETL tool for data analysis, data manipulation, and batch processing. It comes as a suite of products that includes the Co-Operating System, the Component Library, the Graphical Development Environment, the Enterprise Meta Environment, and the Data Profiler.

- SQL Server Integration Services (SSIS), a component of Microsoft SQL Server, is an ETL tool that provides a good platform for building data integration, workflow, and data warehouse management applications.

This article was written by Brian May, who has worked with companies that offer data warehousing consulting. He truly understands the value that data warehousing can offer.

Article Source: http://EzineArticles.com/?expert=Brian_May

Online Analytical Processing (OLAP) for Data Warehousing

Summary: Data warehouses have played a very important role in organizational settings in recent times. They can power sophisticated enterprise intelligence systems that process the queries required to discover trends and analyze critical factors in the marketplace. These systems are known as online analytical processing (OLAP) systems. OLAP leads designers to organize data in the warehouse distinctively: data in a data warehouse is organized differently than in a traditional transaction processing database.

OLAP systems are designed to handle the queries an organization needs to discover trends and critical factors, and such queries typically scan large amounts of data. OLAP data is organized into multidimensional cubes; an OLAP structure created from operational data is called an OLAP cube. The cube is created from a star schema of tables. In this type of schema, the fact table is placed at the center and linked to numerous dimension tables. The fact table contains the core facts that make up the query, while the dimension tables indicate how the aggregations of relational data can be analyzed.
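A star schema can be sketched with Python's built-in sqlite3 module; the table and column names below are hypothetical, chosen to match the hardware-store example that follows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension tables describe how the facts can be analyzed.
conn.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT)")
# The fact table sits at the center and references each dimension.
conn.execute("""CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product,
    store_id INTEGER REFERENCES dim_store,
    units INTEGER, revenue REAL)""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Hammer"), (2, "Drill")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                 [(1, "Boston"), (2, "Denver")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 10, 100.0), (2, 1, 3, 450.0), (1, 2, 5, 50.0)])

# An OLAP-style query: aggregate the core facts along one dimension.
rows = conn.execute("""
    SELECT p.name, SUM(f.units), SUM(f.revenue)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(rows)
# [('Drill', 3, 450.0), ('Hammer', 15, 150.0)]
```

The fact table holds only measures and foreign keys; every descriptive attribute lives in a dimension table, which is exactly what makes the schema a star.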

The multidimensional cube structure gives better performance for OLAP queries than data organized in relational tables. The basic unit of a multidimensional cube is the measure: measures are the units of data being analyzed. Take the example of a corporation that operates hardware stores and wants to analyze revenue and discounts for the different products it sells. In this case, the measures would be the number of units sold, the revenue, and the sum of any discounts. These measures are organized along dimensions; a three-dimensional cube in this example would have time, store, and product as its three dimensions.

Further, each dimension is divided into units called members, and the members of a dimension are typically organized into a hierarchy, with similar members grouped together as a level of the hierarchy. For example, the top level of a time dimension's hierarchy can be years, with months at the next level, then weeks, days, and finally hours at the bottom. At each intersection of the three dimensions, the values of the measures that match those three dimension values are recorded.
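The cube idea above can be sketched in plain Python: each intersection of dimension members holds the measures for that cell. The records and dimension values below are invented to match the hardware-store example:

```python
from collections import defaultdict

# Hypothetical sales records: (year, store, product, units, revenue, discount).
records = [
    ("2009", "Boston", "Hammer", 10, 100.0, 5.0),
    ("2009", "Boston", "Drill",   3, 450.0, 0.0),
    ("2009", "Denver", "Hammer",  5,  50.0, 2.5),
]

# The cube: at each (time, store, product) intersection, accumulate the
# three measures [units sold, revenue, discounts].
cube = defaultdict(lambda: [0, 0.0, 0.0])
for year, store, product, units, revenue, discount in records:
    cell = cube[(year, store, product)]
    cell[0] += units
    cell[1] += revenue
    cell[2] += discount

# Slicing the cube: total units sold in Boston across all products.
boston_units = sum(c[0] for key, c in cube.items() if key[1] == "Boston")
print(boston_units)  # 13
```

A real OLAP engine would also pre-aggregate along each hierarchy level (hours up to years, stores up to regions), which is what makes cube queries fast.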

The specific dimensions and measures chosen for the cubes in an OLAP system depend on the kinds of analysis the system must support. An OLAP system operates on OLAP data in data warehouses, and the reason for using OLAP in data warehousing is speed: OLAP systems provide rapid access to large amounts of performance data from different viewpoints in order to assist business analysts and managers throughout an enterprise.

There are three types of OLAP, each with certain benefits: multidimensional OLAP (MOLAP), relational OLAP (ROLAP), and hybrid OLAP (HOLAP). MOLAP uses a summary database and creates the required schema as a dimensional set of both base data and aggregations. ROLAP utilizes relational databases: the base data and the dimension tables are stored as relational tables, and new tables are created to hold the aggregation information. HOLAP uses relational tables to hold the base data and multidimensional tables to hold the aggregations.

This article was written by Brian May, who has worked with companies that offer data warehousing design. He truly understands the value that data warehouse consulting can offer.

Article Source: http://EzineArticles.com/?expert=Brian_May

Online Analytical Processing (OLAP)

Summary: Online Analytical Processing is a methodology for giving end users rapid access to large amounts of data, to assist with deductions based on investigative reasoning. OLAP uses multidimensional data representations, known as cubes, to provide rapid access to data stored in data warehouses. In a data warehouse, cubes model the data in the dimension and fact tables in order to provide sophisticated query and analysis capabilities to client applications. The software used in OLAP offers real-time analysis of data stored in a data warehouse. Generally, the OLAP server is a separate component containing specialized algorithms and indexing tools that enable the processing of data mining tasks with minimal impact on database performance.

Online analytical processing is an integral part of business: it supports the analysis and decision-making of an organization. For example, IT organizations often face the challenge of delivering systems that allow knowledge workers to make strategic and tactical decisions based on corporate information. These decision support systems are OLAP systems that let knowledge workers intuitively, quickly, and flexibly explore operational data for analytical insight. Usually, OLAP systems are designed to:

- Support the complex analysis requirements of decision-makers.
- Analyze the data from a number of different perspectives (business dimensions).
- Support complex analysis against large input (atomic-level) data sets.

OLAP systems are generally designed around one of two architectures: multidimensional OLAP (MOLAP) and relational OLAP (ROLAP). The MOLAP architecture utilizes a multidimensional database to provide analysis, while the ROLAP architecture accesses data directly from data warehouses. MOLAP architects hold that OLAP is best implemented by storing data multidimensionally, whereas ROLAP architects believe OLAP capabilities are best provided directly against the relational database. Comparing the two architectures makes the trade-offs clear:

- Since the ROLAP architecture is neutral to the amount of aggregation on the database, it leaves the design trade-off between query response time and batch processing requirements to the system designer. MOLAP, by contrast, usually requires databases to be pre-compiled to achieve acceptable query performance, which increases the batch processing requirements.

- ROLAP is suitable for dynamic consolidation of data for decision support analysis, while MOLAP is often favored for batch consolidation of data.

- ROLAP can scale to a large number of business analysis perspectives or dimensions, while MOLAP can generally perform efficiently with ten or fewer dimensions.

- ROLAP supports OLAP analysis against large volumes of input (atomic-level) data. But, MOLAP provides adequate performance only when the input data set is small (fewer than five gigabytes).

Online Analytical Processing is an interactive tool for analytic processing and data retrieval in large databases. It allows rapid access to performance data from different viewpoints, to assist business analysts and managers throughout an enterprise.

This article was written by Brian May, who has worked with companies that offer data warehousing consulting. He truly understands the value that data warehousing can offer.

Article Source: http://EzineArticles.com/?expert=Brian_May

Efficiencies and Cost Saving Realized by Cloud Computing

Companies have built business plans around virtualizing, over time, all layers of their architecture for maximum benefit. Virtualizing resources, even gradually, can create unprecedented cost savings and flexibility. It is also why virtualization is a prerequisite technology for cloud computing.

Organizations taking advantage of these benefits are seeing real cost savings and efficiency gains. Server savings have reached upwards of 70% of total cost: hardware costs have reportedly been reduced by up to 70%, and maintenance costs by up to 50%. Server utilization can climb from the typical 10-30% range all the way up to 65-70%, storage utilization can increase by up to 30%, and the annual cost of acquiring new storage capacity can fall by 25%. There is also a positive environmental and financial impact from reduced heating and cooling. Results will vary with the size, complexity, and maturity of the specific infrastructure, but these are figures for companies to weigh as they consider virtualizing their infrastructure.

Leveraging Virtualization To Deliver Cost Savings And Flexibility

There are three primary uses of virtualization. First, create a single point of control for consolidation and simplification. Second, unify separate resources so they operate together as a single resource. Third, dynamically adjust resource allocation to optimize the infrastructure you already have. Companies are widely advised to choose the consolidation efforts that make the most sense for their own infrastructure. Most companies that have done at least server consolidation have begun to concentrate on storage optimization. Technology consultants typically advise IT organizations to focus not only on the existing environment and the servers in their infrastructure, but also on the target environments where the consolidation will happen.

Many companies start with consolidation that does not involve any migration, which reduces the planning needed for the target environment. This type of consolidation is beneficial, but companies should also consider a "cross-pollination" approach, which provides an enhanced quality of service for the actual workload that will be placed on the organization's servers.

Challenges Companies Face To Virtualize Their Environment

While cloud computing is not a new concept for some companies, a large number of organizations, large and small, are not yet versed in this type of technology. Many companies see the huge potential of data center virtualization services, but they tend to struggle with concerns about not being able to effectively implement and manage a virtualized environment. Also, not all workloads perform well in a virtualized environment, so again, you have to strategize effectively so that problems do not occur in your infrastructure.

Information security and reliability are major issues due to the ubiquitous presence of data in people's daily lives in the emerging global economy. Data is quickly becoming extremely precious and sensitive, ranging from financial information and personal identifiers to customer relationship records. Staying on the cutting edge of the latest reliability and security methods can be a major strain on an existing IT department.

Realize The Full Potential And Benefits Of Cloud Computing

To accelerate your virtualization infrastructure and realize its full potential and benefits, look for a company that specializes in consolidation and virtualization solutions. Make sure it matches your existing and new requirements to the right platform and optimizes your infrastructure for both efficiency and performance. Many technology companies can give you examples of how they have achieved great benefits, and a large return on investment, by putting a strategic data center virtualization plan into place.

Steve Bulmer is CTO and an author for Consonus, a provider of data center virtualization services. They offer cloud computing to suit any business need.

Article Source: http://EzineArticles.com/?expert=Steve_R._Bulmer