Thursday, 23 May 2013

MOLAP, ROLAP, and HOLAP in DWH

What is OLAP
OLAP, On Line Analytical Processing

It is of Mainly three types
1)MOLAP
2)ROLAP
3)HOLAP
and DOLAP 

MOLAP-
Multidimensional OLAP
The MOLAP environment consists of a multi-dimensional cube with all possible perspectives precalculated. Therefore MOLAP applications do have a high performance. At the same time the need a lot of disk capacity to store all the precalculated figures.

Advantages:
  • Excellent performance: MOLAP cubes are built for fast data retrieval, and is optimal for slicing and dicing operations.
  • Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.
Disadvantages:
  • Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself.
  • Requires additional investment: Cube technology are often proprietary and do not already exist in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed.
ROLAP

The ROLAP envirenment consist of a relational database with a OLAP query engine. Every information need is calculated on the fly by the OLAP query engine. therefore ROLAP mosttimes has a slower performance. Never the less, it needs less disk space to store the data.
This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.
Advantages:
  • Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount.
  • Can leverage functionalities inherent in the relational database: Often, relational database already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.
Disadvantages:
  • Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large.
  • Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are therefore traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building into the tool out-of-the-box complex functions as well as the ability to allow users to define their own functions.
HOLAP

The HOLAP environment consists of both a relation database and a preconfigured multi-dimension cube. Most times you can be flexible in keeping information in the relational database and/or the preconfigured multi-dimensional cube.
HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.

Row Denormalization in pentaho (PDI) Kettle

What is Denormalization in Data base Why we should we do this..?

=> This is a attempt to optimize the read performance of a database by adding redundant data or by grouping data.In some cases.

As normalized Form of data is Stored in different tables so retrieval of desired data based on certain condition Takes more logical resource ..
 so  to reduce the logical load we need to do demoralize the data.  

 pros    retrieval is fast
 cons   Takes more Disk Space and having data redundancy .

Example with pentaho PDI
---------------------------------------------------

select Row De-normalizer under Transformation Tab of PDI

now Feed Data to this step using table input or XLS Input
  Now
1)select key Field from drop down
2)press Get field Button To get group Fields
3)Slect Target Column,
4)Select Value Column,
5) Key Value

Now You can dump the desired Denormalized  data in in table or xls..
:)








Wednesday, 22 May 2013

How To Copy or Move Different type of Files From a Folder to Diffrernt Folders According to Type of files


For This :-

You will need a directory named sampleFiles containing a set of files with different extensions, including .txt and .xls. You will also need three destination directories,
Named relevant  to Type of files..
txtFiles, xlsFiles and OtherFiles.

Carry out the following steps:
1. Create a new job and drop a Start job entry into the canvas.

2. Add a Copy Files job entry. In this entry, you will add the directions for copying the files into the three available destination folders. Double-click on the entry to open it.

3. In the File/Folder source textbox, type or browse for the sampleFiles folder. In the File/Folder destination, type or browse for the txtFiles folder. Also, type .*\.txt in the Wildcard (regExp) textbox. Click on the Add button.

4. In the File/Folder source textbox, type or browse for the sampleFiles folder. In the File/Folder destination, type or browse for the xlsFiles folder. Also, type .*\.xls in the Wildcard (regExp) textbox. Click on the Add button.

5. In the File/Folder source textbox, type or browse for the sampleFiles folder. In
the File/Folder destination, type or browse for the OtherFiles folder. Also, type
.+(?<!(txt|xls))$ in the Wildcard (regExp) textbox. Click on the Add button.

6. Assuming that all folders are inside the directory where you have your job, the Files/ Folders grid will look like the following screenshot:
















Tuesday, 21 May 2013

Creating and Loading a parent-child table Structure in pentaho PDI

 parent-child table is a table in which there is a self-referencing relationship. In other words,there is a hierarchical relationship among its rows.

 A typical example of this is a table with employees, in which one of the columns contains references to the employee that is above each employee in the hierarchy.
Here By this Example you will load the parent-child table of employees of Steel Wheels (Get it in pentaho Sample DB).
The hierarchy
of roles in Steel Wheels is as follows:

1) A sales representative reports to a sales manager
2)A sales manager reports to a vice-president
3) A vice-presidents reports to the president
4) The president is the highest level in the hierarchy. There is a single employee
 with this role.

Now Start Doin...

Go to your Database And create "employee" table

Create a transformation that inserts the record for the president who is the first in the hierarchy, and doesn't report to anyone. The transformation should read the file, filter he record with JOBTITLE=President, and insert the data into the employees table.
2. Create another transformation to load the rest of the employees. Define a named parameter named LEVEL that will represent the role of the employees being loaded.
3. Use a Text file input step to read the file of employees.
4. Use a Get Variables step to add the variable LEVEL as a new field named
level.
5. Add a Filter rows step to filter the employees to load based on their role. In order to do that, enter the following condition: JOBTITLE REGEXP level.
6. Add a Database lookup to find out the employee number of the employee who is
one above in the hierarchy: In the upper grid add a row with the condition EMAIL = REP_TO. Use the lower grid to get the field EMPLOYEENUMBER and rename it to
REPORTSTO.
7.Add a Table Output step, and use it to insert the records in the table employees.Your final transformation looks like this:





 











8. Finally create a job to put all together. Drag to the work area a START entry, and four Transformation job entries. Link all of them in a row.
9. Use the first Transformation entry to execute the transformation that loads
the president.
10. Double-click the second Transformation entry and configure it to run the
transformation that loads the other employees. Under the Parameters tab, add a
parameter named LEVEL with value VP.*.






Working with Databases



11. Repeat step 10 for the third Transformation entry, but this time type
.*Manager.* as the value for the LEVEL parameter.


12. Repeat step 10 for the fourth Transformation entry, but this time type Sales
Rep.* as the value for the LEVEL parameter.

13. Save and run the job. The table should have all employees loaded,

 For more detail Ref:- Pentaho Data integration cook book
or wrire to me spectrumon@gmail.com








Tuesday, 14 May 2013

File not found error in pentaho excel report

It is due to very long blank space (Extra Size) of excel sheet , so reduce it up to its column only.
...
steps==>
1)Go to File -------> Page set up.

2)Go to Custom Width ...

3)Now Reduce it to size of ur last column...
example 1000 or 1500 .. depends on ur page size.

4) All Done Enjoy

Monday, 13 May 2013

How to use Slowly Changing Dimensions (SCD) in pentaho

What is SCD ?

Slowly changing dimensions are the dimensions in which the data changes slowly, rather than changing regularly on a time basis.

SCD Type 1: Overwrite
---------------------------------------

A type 1 slowly changing dimension is the most basic one and doesn’t
require any special modeling or additional fields.
SCD type 1 columns just get overwritten with new values when they come into the data warehouse.
Example
--------------------------------------------------------------------------------------------------------------------------
Customer _key        Customer _id     Customer _Name       Customer _City
    1                               22321                Abhishek                      Delhi
--------------------------------------------------------------------------------------------------------------------------

After applying SCD Type 1

---------------------------------------------------------------------------------------------------------------------------
 Customer _key          Customer _id      Customer _Name      Customer _City
       1                                22321                Abhishek                Banglore
-----------------------------------------------------------------------------------------------------------------------------


SCD Type 2: Add Row
---------------------------------------
keep historical Data as well as current data by adding one new row.

This method tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys and/or different version numbers.
Unlimited history is preserved for each insert.
 -----------------------------------------------------------------------------------------------------------------------
Supplier_Key  Supplier_Code Supplier_Name Supplier_State Version.
123   ABC Acme Supply Co CA 0
124    ABC Acme Supply Co IL 1
 ------------------------------------------------------------------------------------------------------------------------

SCD Type 3: Add Column
----------------------------------------------

This method tracks changes using separate columns and preserves limited history. 
The Type II preserves unlimited history as it's limited to the number of columns designated for storing historical data.
The original table structure in Type I and Type II is the same but Type III adds additional columns. In the following example, an additional column has been added to the to record the supplier's original state -
only the previous history is stored.

 -------------------------------------------------------------------------------------------------------------------------------

Supplier_Key Supplier_Code Supplier_Name    Original_Supplier_State Effective_Date    Current_Supplier_State
123 ABC Acme Supply Co          CA 22-Dec-2004              IL

 ---------------------------------------------------------------------------------------------------------------------------------






Pentaho Bi server configration on redhat linux and Mysql /oracle


A Guide On How To Setup


Pentaho BI Server 4.0 community edition
with MySQL & Linux(redhat) And Manual Deployment

By Abhishek Shankar
Pentaho BI Server 4.0 With MySQL And Linux


Overview
Introduction
This is a guide on how to install Pentaho Business Intelligence Server 4.0 community Edition with MySQL on a Linux
machine. This guide assumes you have some basic knowledge of Linux and MySQL.
This version of the guide is for a local setup:
Access to the database can only be made from the server that it is located on.
Access to the BI Server can be made from both the server and other PC's.
Access to the Administrator Console can be made from both the server and other PC's.
If you would like a guide for a remote setup click here, the remote guide is for:
Access to the database can be made from the server that it is located on and other PC's.
Access to the BI Server can be made from both the server and other PC's.
Access to the Administrator Console can be made from both the server and other PC's.
Tips
Make sure when performing steps that you are always logged in as root:
ubuntu:/# sudo su
fedora:/# su –
redhat:/# su -

Contact And Resources
You can contact me via:
Skype: abhishek.spectrum
Email: spectrumon@gmail.com

Google Talk: spectrumon@gmail.com
Pentaho Official Wik i
Pentaho Forums


Definitions
When I first started using Pentaho BI Server I was confused with what other packages it was made up by.
Here is a list of of packages which are relevant to this tutorial and how Pentaho BI Server uses them:
Apache Tomcat
Description: The web server which provides a platform for the Pentaho BI Server front end.
Port: 8080 (default)
URL access: http://localhost:8080/
Homepage: http://www.apache-tomcat.com/
Hibernate
Description: A Java package which helps the relational database work with Pentaho
Homepage: http://www.hibernate.org/
Jetty
Description: The web server which provides a platform for the Administrator Console front end
Port: 8099 (default)
URL access: http://localhost:8099/
Homepage: http://www.jetty-server.com/
MySQL
Description: The database server which Pentaho will be using
Port: 3306 (default)
Java URL access: jdbc:mysql://localhost/your_db_name
Quartz
Description: The job/report scheduling system which Pentaho uses
Homepage: http://www.opensymphony.com/quartz/
This is a very high level look however the team at Pentaho have started a Pentaho Terminology Project
which goes into most Pentaho terms in a lot more detail.

Preparation
You should already have installed and configured both the latest Java JDK and JRE and MySQL server.
Java JDK And JRE
You can check if you have the latest Java JDK and JRE by typing at the terminal prompt java -version,
you should get the following output:
root@pentaho:/# java -version
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Client VM (build 10.0-b23, mixed mode, sharing)
If you don't get an output similar to above, follow these guides to setting up the latest Java package on your
Linux distribution: Ubuntu, Fedora, Suse, CentOS and RedHat.
If you are using Windows you can type at the command prompt java -version you should get the
following output:
C:\>java -version
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) Client VM (build 11.0-b16, mixed mode, sharing)
If you don't get an output similar to above, follow this guide to setting up the latest Java package on Windows
Server, XP or Vista.
We will also need to check if you have set-up your environment variable correctly, once again at the terminal
prompt type echo $JAVA_HOME, you should get an output similar to below:
root@pentaho:/# echo $JAVA_HOME
/usr/lib/jvm/java-6-sun
If you didn't receive this output you have not set-up the JAVA_HOME environment variable correctly please
review the links above to make sure you installed Java JDK and JRE correctly.
To check if you have set-up your environment variable correctly in Windows, type at the command prompt
echo %JAVA_HOME%, you should get an output similar to below:
C:\>echo %JAVA_HOME%

MySQL Server
You can check if MySQL is installed by trying to login into the server, at the terminal prompt login by using
the mysql -u root -ppassword syntax, you should get an output similar to below:
Note!
Make sure you replace password with your actual MySQL server root's password.
root@pentaho:/#mysql -u root -ppassword
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 64
Server version: 5.0.51a-3ubuntu5.4 (Ubuntu)
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql>
If you don't get a similar output as above, follow these links to setting up the latest MySQL Server on your
Linux distribution: Ubuntu, Fedora, Suse, CentOS and RedHat.
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Downloads
Before you start any configuration you will need to download two packages:
1. Pentaho Business Intelligence Server 4.0
2. Sample Database MySQL 5 Script
Pentaho Business Intelligence Server 4.0 Stable
You can get this file two ways, the first way is to go to directly to the Pentaho SourceForge page and
download a copy, click here for a direct link, make sure you download the biserver-ce-4.0.0.stable.tar.gz
file! The other option is to just run the following command at your terminal prompt:
root@pentaho:/# wget
http://internode.dl.sourceforge.net/sourceforge/pentaho/biserver-ce-
4.0.0.stable.tar.gz
Once you have downloaded this file you will need to extract it by using the following command:
root@pentaho:/# tar xvzf biserver-ce-2.0.0-stable.tar.gz
After this operation is complete you should be able to enter a biserver-ce/ directory:
root@pentaho:/# ls
biserver-ce
root@pentaho:/# cd biserver-ce/
root@pentaho:/biserver-ce#
Sample Database MySQL 5 Script
You can also get this file two ways, the first way is to go directly to a I have hosted with the file in it.
Create a user id to get it from below link

The other option is to just run the following command at your terminal prompt:
Note!
To make life easier make sure you download or move the sampledata .SQL script into the biserverce/
data/mysql5/ directory.
root@pentaho:/# cd biserver-ce/data/mysql5
root@pentaho:/biserver-ce/data/mysql5# wget
http://www.prashantraju.com/pentaho/downloads/sampledatamysql5.sql
7
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Load The Sample SQL Scripts
We will need to load 5 sample sql scripts into our MySQL database, these are located in the biserverce/
data/mysql5 directory. You will need to load the scripts in the following order:
1. create_repository_mysql.sql
2. create_quartz_mysql.sql
3. create_sample_datasource_mysql.sql
4. migration.sql
5. sampledatamysql5.sql
To load these files first log into your MySQL server:
root@pentaho:/biserver-ce/data/mysql5# mysql -u root -ppassword
Then run the following commands:
mysql> source create_repository_mysql.sql;
...output
mysql> source create_quartz_mysql.sql;
...output
mysql> source create_sample_datasource_mysql.sql;
...output
mysql> source migration.sql;
...
mysql> source sampledatamysql5.sql;
...output
To check if you have created all the necessary databases at the MySQL terminal prompt type:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hibernate |
| mysql |
| quartz |
| sampledata |
+--------------------+
5 rows in set (0.01 sec)
You should have the hibernate, quartz and sampledata databases, these are what you just created.
8
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Configure pentaho-solutions/
You will need to modify three files, they are underlined below:
pentaho-solutions/
system/
applicationContext-acegi-security-jdbc.xm l
applicationContext-acegi-security-hibernate.properties
hibernate/
hibernate-settings.xm l
mysql5.hibernate.cfg.xm l
applicationContext-acegi-security-jdbc.xml
What Does This File Do?
Sets up a JDBC authentication for the Spring Security system used on the Pentaho BI Server.
To edit this file enter the /biserver-ce/pentaho-solutions/system/ directory:
root@pentaho:/# cd /biserver-ce/pentaho-solutions/system/
root@pentaho:/biserver-ce/pentaho-solutions/system#
Now open up the applicationContext-acegi-security-jdbc.xml file with your text editor, in this
example I'm using vim:
root@pentaho:/biserver-ce/pentaho-solutions/system# vim
applicationContext-acegi-security-jdbc.xml
Scroll down until you see the snippet of code below, the highlighted areas are the parts which need
modification:
<bean id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.hsqldb.jdbcDriver" />
<property name="url" value="jdbc:hsqldb:hsql://localhost:9001/hibernate" /
>
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>
The configuration options of this file are:
driverClassName: The class of the database driver i.e. com.mysql.jdbc.Driver.
url: The url to access the database i.e. jdbc:mysql://localhost:your_mysql_port/database_name.
username: The username which has access to the specified database.
password: The password of the username above.
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
After modification your code should look similar to this:
<bean id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url" value="jdbc:mysql://localhost:3306/hibernate" />
<property name="username" value="root" />
<property name="password" value="password" />
</bean>
Warning!
Once again I have just used the root username and password, for a proper and secure installation you
should create a separate user who has similar access as a root user!
Save the file and exit it.
applicationContext-acegi-security-hibernate.properties
What Does This File Do?
Sets the properties for the Spring security to enable a database connection to the Hibernate conection.
This file is located under the same directory as the directory before:
root@pentaho:/# cd biserver-ce/pentaho-solutions/system
root@pentaho:/biserver-ce/pentaho-solutions/system#
Open up the applicationContext-acegi-security-hibernate.properties file with your text
editor, in this example I'm using vim:
root@pentaho:/biserver-ce/pentaho-solutions/system# vim applicationContext-acegisecurity-
hibernate.properties
At the top you should see the snippet of code below, the highlighted areas are the parts which need
modification:
jdbc.driver=org.hsqldb.jdbcDriver
jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.HSQLDialect
The configuration options of this file are:
jdbc.driver: The JDBC driver i.e. com.mysql.jdbc.Driver.
jdbc.url: The url to access the database i.e.
jdbc:mysql://your_mysql_ip_address:your_mysql_port/database_name.
jdbc.username: The username which has access to the specified database.
jdbc.password: The password of the username above.
hibernate.dialect: The database language or dialect that Hibernate needs to use.
After modification your applicationContext-acegi-security-hibernate.properties file should
similar to this:
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/hibernate
jdbc.username=root
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.MySQLDialect
Save the file and exit it.
hibernate-settings.xml
What Does This File Do?
Sets up the basic Hibernate settings i.e. what database back-end to use.
This file is located under the /hibernate directory:
root@pentaho:/# cd biserver-ce/pentaho-solutions/system/hibernate
root@pentaho:/ biserver-ce/pentaho-solutions/system/hibernate#
Open up the hibernate-settings.xml file with your text editor, in this example I'm using vim:
root@pentaho:/biserver-ce/pentaho-solutions/system# vim hibernatesettings.
xml
Scroll down until you see the snippet of code below, the highlighted areas are the parts which need
modification:
<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>
This is currently referencing the hsql hibernate configuration file, you will need to change this to the MySQL
hibernate configuration file; mysql5.hibernate.cfg.xml.
After modification, your hibernate-settings.xml file should look similar to this:
<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>
Save the file and exit it.
mysql5.hibernate.cfg.xml
What Does This File Do?
Configures the MySQL connection for the Hibernate database, this allows managed connections
within the BI Server.
We do not need to change directories as this file is also located under the /hibernate directory, open up
the mysql5.hibernate.cfg.xml file with your text editor, in this example I'm using vim:
root@pentaho:/biserver-ce/pentaho-solutions/system# vim
mysql5.hibernate.cfg.xml
Scroll down until you see the snippet of code below, the highlighted areas are the parts which need
modification:
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property
name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>
<property
name="dialect">org.pentaho.platform.repository.hibernate.MySQL5InnoDBD
ialect</property>
<property name="connection.username">hibuser</property>
<property name="connection.password">password</property>
<property name="connection.pool_size">10</property>
<property name="show_sql">false</property>
<property name="hibernate.jdbc.use_streams_for_binary">true</property>
The core configuration options of this file are:
connection.driver_class: The class of the database driver i.e. com.mysql.jdbc.Driver.
url: The URL to access the database i.e. jdbc:mysql://localhost:port_for_mysql/database_name.
username: The username which has access to the specified database.
password: The password of the username above.
After modification your code should look similar to this:
<property name="connection.driver_class">com.mysql.jdbc.Driver</property>
<property
name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>
<property
name="dialect">org.pentaho.platform.repository.hibernate.MySQL5InnoDBD
ialect</property>
<property name="connection.username">root</property>
<property name="connection.password">password</property>
<property name="connection.pool_size">10</property>
<property name="show_sql">false</property>
<property name="hibernate.jdbc.use_streams_for_binary">true</property>
Warning!
Once again I have just used the root username and password, for a proper and secure installation you
should create a separate user who has similar access as a root user!
Save the file and exit it.

Pentaho BI Server 4.0 With MySQL And Linux Local Setup

Configure tomcat/webapps/pentaho/
You will need to modify two files, they are underlined below:
tomcat/
webapps/
pentaho/
WEB-INF/
web.xm l
META-INF/
context.xm l
web.xml
What Does This File Do?
The main configuration file for Tomcat and Pentaho i.e. sets up all the .JSP's and various other
Pentaho files.
To edit the first file enter the tomcat/webapps/pentaho/WEB-INF/ directory:
root@pentaho:/# cd biserver-ce/tomcat/webapps/pentaho/WEB-INF/
Now open up the web.xml file with your favourite text editor, in this example I'm using vim.
root@pentaho:/biserver-ce/tomcat/webapps/pentaho/WEB-INF/# vim web.xml
Scroll down until you see the snippet of code below, the highlighted areas are the parts which need
modification:
<context-param>
<param-name>base-url</param-name>
<param-value>http://localhost:8080/pentaho/</param-value>
</context-param>
This is currently telling you that the Pentaho installation can only be accessed on the server that is currently
running it. If you would like to access the Pentaho installation from other computers you will need to change
this so that the localhost is replaced with the IP address of the server hosting the Pentaho Installation.
Your web.xml file should now look similar to this (where the xxx.xxx.xxx.xxx is the IP address of your
server):
<context-param>
<param-name>base-url</param-name>
<param-value>http://localhost:8080/pentaho/</param-value>
</context-param>
---or---
<context-param>
<param-name>base-url</param-name>
<param-value>http://xxx.xxx.xxx.xxx:8080/pentaho/</param-value>
</context-param>
Save the file and exit it.
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
context.xml
What Does This File Do?
Configuration file for setting up both Hibernate and Quartz.
To edit the first file you must enter the tomcat/webapps/pentaho/META-INF/ directory:
root@pentaho:/biserver-ce/tomcat/webapps/pentaho/WEB-INF/# cd /biserverce/
tomcat/webapps/pentaho/META-INF
Now open up the context.xml file with your text editor, in this example I'm using vim.
root@pentaho:/biserver-ce/tomcat/webapps/pentaho/META-INF/# vim
context.xml
Scroll down until you see the snippet of code below, the highlighted areas are the parts which need
modification:
<Context path="/pentaho" docbase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container"
type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="hibuser" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/
hibernate" validationQuery="select count(*) from
INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />
<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/
quartz" validationQuery="select count(*) from
INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />
</Context>
The core configuration options of this file are:
url – The url to access the Hibernate and Quartz database i.e.
jdbc:mysql://localhost:port_for_mysql/database_name
driverClassName – The class of the database driver for Hibernate and Quartz i.e.
com.mysql.jdbc.Driver
username – The username which has access to the for Hibernate and Quartz database
password – The password of the username above
validationQuery – This is a query to check if the database is present, for now I'm going to just
remove this line, if you want you can type in your own select statement in here i.e. SELECT 1
14
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
After modification your code should look similar to this:
<Context path="/pentaho" docbase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container"
type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="root" password="password"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/hibernate" />
<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="root" password="password"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/quartz" />
</Context>
---or---
<Context path="/pentaho" docbase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container"
type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="root" password="password"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/hibernate" validationQuery="SELECT 1" />
<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20"
maxIdle="5" maxWait="10000" username="root" password="password"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/quartz" validationQuery="SELECT 1"/>
</Context>
Save this file and exit it.
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Configure administration-console/
No configuration is needed for the administrator console with the stable release of Pentaho BI Server 2.0.
16
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Start The BI Server & Administrator Console
We will need to first start the BI Server and then the Administrator Console.
To start the BI Server, enter the /tomcat/bin directory which is within the biserver-ce/ directory.
root@pentaho:/biserver-ce/# cd /tomcat/bin
root@pentaho:/biserver-ce/tomcat/bin/#
To start the Tomcat web server all you need to do is run startup.sh from within the /tomcat/bin
directory:
root@pentaho:/biserver-ce/tomcat/bin/# ./startup.sh
To stop the Tomcat web server all you have to do is run shutdown.sh from within the /tomcat/bin
directory:
root@pentaho:/biserver-ce/tomcat/bin/# ./shutdown.sh
After running either command you should get the following output:
Using CATALINA_BASE: /home/pgraju/biserver-ce/tomcat
Using CATALINA_HOME: /home/pgraju/biserver-ce/tomcat
Using CATALINA_TMPDIR: /home/pgraju/biserver-ce/tomcat/temp
Using JRE_HOME: /usr/lib/jvm/java-6-sun
Before starting up the Administrator Console make sure that Tomcat and the BI Server has successfully
started, to do this open up a web browser and type in http://localhost:8080/ or http://ip_address:8080/ (this is
the IP address you have set in your web.xml file).
You should now see the Tomcat welcome homepage:



Pentaho BI Server 2.0 With MySQL And Linux Local Setup
If Tomcat was successful, try and launch the BI Server, after your URL add pentaho i.e. http://localhost:8080/
pentaho or http://your_ip_address:8080/pentaho.
You should now see the Pentaho login screen:


Close or minimize your browser and open up a new instance of the terminal and enter the
/administrator-console directory:
root@pentaho:/biserver-ce/# cd administrator-console
root@pentaho:/biserver-ce/administrator-console/#
To start the Administrator Console you will need to run start.sh from within the administrator-console/
directory:
root@pentaho:/biserver-ce/administrator-console/# ./start.sh
To stop the Administrator Console you will need to run stop.sh from within the administratorconsole/
directory:
root@pentaho:/biserver-ce/administrator-console/# ./stop.sh
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
After running this command you should get the following output:
2009-01-27 01:00:56.800::INFO: Logging to STDERR via
org.mortbay.log.StdErrLogJan 27, 2009 1:00:56 AM
org.pentaho.pac.server.JettyServer startServerINFO: Console is starting
2009-01-27 01:00:56.055::INFO: jetty-6.1.2
2009-01-27 01:00:56.133::INFO: Started SocketConnector @ 0.0.0.0:8099
Jan 27, 2009 1:00:56 AM org.pentaho.pac.server.JettyServer startServer
INFO: Console is now started. It can be accessed using
or http://161.117.117.13:8099
Now lets try and launch the Administrator Console, open up your browser and visit http://localhost:8099/ or
http://ip_adresss:8099/ (this is the IP address you set in your console.xml file).
You will be prompted to login, use the username admin and the password password to login.
Once logged in you should see the Administrator Console homepage.



Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Configure The SampleData Data Source
While the Administrator Console is running you will need to edit the sample database data source, with the
Administrator Console this is very simple. First click on the Administration link on the left hand side.
The next step is to click on the Data Sources tab which is on the right hand side:
This should bring up list of current data sources if you have a fresh install of the BI Server you should only
see the SampleData data source, click on the name to populate the fields on the right hand side.




Pentaho BI Server 2.0 With MySQL And Linux Local Setup
You will need to edit 5 fields for the SampleData data source:
Under the General section:






Driver Class: This is the driver class of the data source connection, at the moment it is set to use a
hsql driver class, change this to com.mysql.jdbc.Driver.
User Name: Change this to a user who has the correct privileges to access the sampledata
database.
Password: The password to the user you specify above.
URL: This is the URL which you can access the database on, at the moment this is set to a hsql
URL, change this to jdbc:mysql://localhost:3306/sample.data
Under the Advanced section:
Validation Query: Delete any SQL statements which are present and leave it blank or enter a
MySQL query which is valid i.e. SELECT 1



Once you are done, click on the Test button and you should receive a message letting you know that it has
made a successful connection.
You have successfully set-up your SampleData data source to use your MySQL database.
Before this is reflected in the BI Server you will need to restart Tomcat and the Administrator console.
To stop Tomcat, enter the /biserver-ce/tomcat/bin/ directory and run shutdown.sh:
root@pentaho:/# cd /biserver-ce/tomcat/bin
root@pentaho:/biserver-ce/tomcat/bin/# ./shutdown.sh
Next, stop the Administrator console, enter the /administrator-console/ directory and run stop.sh or
CTRL+C if you are in the terminal which is outputting a log of the Administrator Console:
root@pentaho:/biserver-ce/tomcat/bin/# cd ../../administrator-console
root@pentaho:/administrator-console/# ./stop.sh
Now start Tomcat back up and then the Administrator Console.
To start Tomcat, enter the /biserver-ce/tomcat/bin/ directory and run startup.sh:
root@pentaho:/# cd /biserver-ce/tomcat/bin
root@pentaho:/biserver-ce/tomcat/bin/# ./startup.sh
Pentaho BI Server 2.0 With MySQL And Linux Local Setup
Next, start the Administrator console, enter the /administrator-console/ directory and run start.sh:
root@pentaho:/biserver-ce/tomcat/bin/# cd ../../administrator-console
root@pentaho:/administrator-console/# ./start.sh
Login into the BI Server as any of the sample users and then try and run a sample report, if all went well then
you have successfully setup Pentaho Business Intelligence Server RC1 with MySQL on Linux!
Pentaho BI Server 2.0 With MySQL And Linux Local Setup



Common Errors
Here is a list of common errors which I and others encountered on setting up the BI Server with MySQL on
Linux and of course the solutions!
Pentaho Initialization Error
Pentaho Initialization Error
The following errors were detected
One or more system listeners failed. These are set in the pentaho.xml under the node
<system-listeners>.
PentahoSystem.ERROR_0014 – Error while trying to execute startup sequence for
org.pentaho.platform.scheduler.QuartzSystemListener
Please see the server console for more details on each error detected.
This error can occur with both Quartz and Hibernate, in this example it is with Quartz.
Most of the time this error occurs because you have not setup your context.xml file correctly, please read
over this part of the guide again and make sure you have not made any typos or entered incorrect settings.
Can't Run Sample Reports
If you are getting an error when trying to run a sample report you have not configured the sample data data
source correctly within the Administrator Console, read over this part of the guide to make sure you have not
entered incorrect settings.
Manual Deployment of Pentaho on Jboss with Mysql
  1. Download and unpack Download the necessary files from http://sourceforge.net/projects/pentaho/files
  2. biserver-ce-3.9.zip(win) or tar (Linux)
  3. biserver-manual-ce-3.9.zip(win) or tar (Linux)
  4. download and install ant
  5. set class path of ant in environment (windows)
  6. set class path in Linux using .profile file

Unpack the zip/tar files.
after extracting you will get two directories: biserver-ce and biserver-manual-ce.
  1. Build customized war file In order to build an war file that is customized for Jboss/MySQL, go to the biserver-manual-ce folder. Run "ant -p" to view the project help information. The build targets are shown. Build the JBoss/MySQL target without portal: "ant war-pentaho-jboss-mysql-no-portal". This results in the file build/pentaho-ears/jboss/no-portal/mysql5/pentaho.war.
  2. deploy pentaho.war file in deploy directory …\jboss4\jboss-4.2.1.GA\server\default\deploy
  3. deoloy pentaho style war in deploy directoy (0ptional)
  4. copy pentaho-solution from manual deployment in ….\jboss-4.2.1.GA\
  5. open web.xml from ….\jboss4\jboss-4.2.1.GA\server\default\deploy\pentaho.war\WEB-INF
  6. edit line number 10 and put absolute path of solution folder
  7. solutionpath should look like..
  8. solution-path

….\jboss4\jboss-4.2.1.GA\biserver-ce\pentaho-solutions
  1. Paste mysql jar file in

…..\jboss4\jboss-4.2.1.GA\lib 11.Create User in mysql database root identified by password password
12.And run all db script manually
13.Give execute permition to all file
14.Start jboss …..batch or shell (.bat or .sh)
15.Request http://localhost:8080/pentaho/Login from web browser
Detail configuration
Change databse and dialect in following files
1)hibernate-settings.xml …biserver-ce\pentaho-solutions\system\hibernate\hibernate-settings.xml
2)applicationContext-spring-security-hibernate.properties …. biserver-ce\pentaho-solutions\system\applicationContext-spring-security-hibernate.properties
3)change in systemListeners.xml Comment versionCheckerSystemListener in pentaho-solutions\system\systemListeners.xml
mail me in case of any issue