Wednesday 14 August 2019

How to load and retrieve data in Neo4j using Pentaho

Neo4j and Pentaho (PDI)
This document explains how to connect Neo4j with PDI, how to load and retrieve data through PDI, and how to load data from a CSV file into Neo4j.
Neo4j prerequisites
a)     Neo4j should be up and running.
b)     All required credentials, including username and password, should be available.
c)     Working copies of the examples in this post, and a sample CSV file, can be downloaded from the links listed further down, before the Cypher commands.




How to Connect to Neo4j from PDI
1. Get the JDBC driver
d)     Get the JDBC driver from the location below. This driver is precompiled and ready to use. It has been tested with Pentaho PDI 8.2 and Neo4j Desktop 3.5.6.
e)     http://dist.neo4j.org/neo4j-jdbc/neo4j-jdbc-2.0.1-SNAPSHOT-jar-with-dependencies.jar
f)     Copy the driver to the <PDI installation directory>/data-integration/lib folder.
g)     Restart Spoon.
2. Create the connection through Pentaho
1. Open Spoon.
2. Open a new transformation.
3. Select the Table input step.
4. Choose to create a new connection.



5. Create the new connection with the settings below:
·         Connection type as Generic database
·         Custom connection URL as jdbc:neo4j://localhost:7474
·         Custom driver class name as org.neo4j.jdbc.Driver


6. Test the connection.
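Beyond the built-in connection test, a trivial statement in a Table input step can confirm that the driver really forwards Cypher to Neo4j. This is only a sketch of such a smoke test; the column alias ok is an arbitrary name chosen here.

```cypher
// Minimal smoke test: returns a single row with one column named "ok".
RETURN 1 AS ok
```

If the preview shows one row, the connection, driver class, and credentials are all working together.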




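3. Load Data to Neo4j using Pentaho (PDI)
1. Select the "Execute SQL script" step.
2. Put the script into the "Execute SQL script" step. Although the step name says SQL, the statement passed through the Neo4j JDBC driver is written in Cypher. Refer to the screenshot below.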


Run the transformation:
After a successful run, this loads one record, "SHANKAR", into Neo4j.
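As a minimal sketch, a statement along these lines would create such a record. The property names and values match the record shown in the retrieval log of this post; the Person label is an assumption and is not taken from the original transformation.

```cypher
// Assumed label: Person. Properties mirror the record in the retrieval log.
CREATE (:Person { name: 'Shankar', born: 1982 })
```

Running the transformation twice with a plain CREATE produces two nodes. Using MERGE instead of CREATE avoids duplicates when the transformation may be re-run.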



4. Retrieve Data from Neo4j using Pentaho (PDI)
1. Select the "Table input" step.
2. Put the Cypher query into the "Table input" step. Refer to the screenshot below.



Run the transformation:
After a successful run, one record is retrieved from Neo4j.
Below is a snippet of the log:
2019/08/14 14:01:20 - Write to logs.0 - ------------> Linenr 1------------------------------
2019/08/14 14:01:20 - Write to logs.0 - ====Data retrived from neo4j========
2019/08/14 14:01:20 - Write to logs.0 -
2019/08/14 14:01:20 - Write to logs.0 - Shankar = {"born":1982,"name":"Shankar"}
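A query of roughly this shape would yield a log line like the one above. It is a sketch, not the query from the original transformation: the Person label is an assumption, and the variable is named Shankar only because the returned variable name becomes the field name in PDI.

```cypher
// Assumed label: Person. The variable name becomes the output field name.
MATCH (Shankar:Person { name: 'Shankar' })
RETURN Shankar
```

Returning individual properties instead, for example RETURN Shankar.name AS name, Shankar.born AS born, gives PDI separate typed fields rather than one JSON-like value.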



Load Data from a CSV File to Neo4j

Copy your CSV file into the import folder of the Neo4j installation directory, or point directly to an HTTP, HTTPS, or FTP location instead.


<Installation directory>\.Neo4jDesktop\neo4jDatabases\database-39ba8418-e334-4730-b8b2-1434f4d6db48\installation-3.5.6\import\desktop-csv-import\<csv file name>
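The two ways of locating the file can be sketched as follows. With the default configuration, a file:/// URL is resolved relative to the import folder, so only the part of the path below import appears in the statement. The remote URL is hypothetical and only illustrates the form; run each statement separately.

```cypher
// Local file: path is relative to the import folder of the installation.
LOAD CSV FROM 'file:///desktop-csv-import/NeotestCSV2.csv' AS line
RETURN line
LIMIT 5;

// Remote file: hypothetical URL, replace it with a real location.
LOAD CSV FROM 'https://example.com/data/NeotestCSV2.csv' AS line
RETURN line
LIMIT 5;
```

Returning a few rows first, before creating any nodes, is a cheap way to check that the file is found and parsed as expected.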





Download working copies of the above example (two files):
https://drive.google.com/file/d/1FgJRNbRogl4OhmPPHLBQVFtecyoqE88R/view?usp=sharing
https://drive.google.com/file/d/15Y1ySRDYpzYu3L-vzxFX5xKyjsEKowia/view?usp=sharing

Load data to Neo4j from CSV: download the working copy and the example CSV file:
https://drive.google.com/file/d/19C-91CvUW3bv9UanSBbfID9kmFOVTBDv/view?usp=sharing
https://drive.google.com/file/d/1500NY0LKUovBexM3dS7P4wmwJtq_XjLF/view?usp=sharing

Some useful Cypher commands:

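1. Load data from a CSV file into Neo4j without headers. Columns are addressed by zero-based index, so line[1] is the second column.

LOAD CSV FROM 'file:///desktop-csv-import/NeotestCSV2.csv' AS line
 CREATE (:Artist2 { Test: line[1], Name: line[2] })

2. Load data from a CSV file into Neo4j with headers. Columns are addressed by header name; the names Test and Name are assumed here, so use the names from your file's header row.

LOAD CSV WITH HEADERS FROM 'file:///desktop-csv-import/NeotestCSV2.csv' AS line
 CREATE (:Artist2 { Test: line.Test, Name: line.Name })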

1. Check the count of loaded records.
 MATCH (p:Artist2)
 RETURN count(p)

2. Select records by label (a label in Neo4j plays a role similar to a table).
 MATCH (p:Artist2)
 RETURN p

3. Get the queryId with the command below.
 CALL dbms.listQueries()

4. Kill a running query in Neo4j.
Example:
 CALL dbms.killQuery('query-685')
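When many queries are running, it can help to narrow the listing to the columns needed to pick the right queryId. The sketch below assumes the Neo4j 3.5 procedure output, which includes the queryId, query, and elapsedTimeMillis columns; other versions may expose different columns.

```cypher
// List running queries, longest-running first (column names as in Neo4j 3.5).
CALL dbms.listQueries()
YIELD queryId, query, elapsedTimeMillis
RETURN queryId, query, elapsedTimeMillis
ORDER BY elapsedTimeMillis DESC
```

The queryId value from the first row can then be passed to dbms.killQuery.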
