OLS ready to install on 19c using Docker + source code at GitHub

Some time ago we released three scripts for installing Scotas OLS using the Oracle Docker images for the 19c release; here are step-by-step instructions for installing OLS on Oracle 19c.

The procedure is similar to Oracle XE. First, check if you have a Docker image for 19c:

$ docker image ls|grep oracle
oracle/database                                   19.3.0-ee           a8bf34972c5c        5 months ago        6.5GB

If not, check the Oracle README at the GitHub repo to build the image above.

Once you have a Docker image for the 19.3.0-ee release, create three Docker Swarm config files to drive the OLS installation during the first startup.

$ curl -s https://raw.githubusercontent.com/scotas/docker-images/master/ols-scripts-r5/00-unzip-ols.sh | docker config create 00-unzip-ols.sh -
$ curl -s https://raw.githubusercontent.com/scotas/docker-images/master/ols-scripts-r5/01-ols-ins.sh | docker config create 01-ols-ins.sh -
$ curl -s https://raw.githubusercontent.com/scotas/docker-images/master/ols-scripts-r5/02-clean-up-ols-files.sh | docker config create 02-clean-up-ols-files.sh -
$ docker config ls
ID                          NAME                       CREATED          UPDATED
k8ntw4p1w06wr5bifbx6w3ztz   00-unzip-ols.sh            53 minutes ago   53 minutes ago
zj9link1pqi04ddgxtnwu81j4   01-ols-ins.sh              51 minutes ago   51 minutes ago
pnf61s3l0vxmhqoj5twfrw688   02-clean-up-ols-files.sh   51 minutes ago   51 minutes ago

Finally, create a docker-compose.yml file for your Oracle OLS RDBMS deployment; the service definition for the OLS DB looks like:

[Screenshot: OLS service definition]

and configs and external storage:
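
Since the screenshots are not reproduced here, below is a minimal sketch of what the service definition, configs and external storage might look like (the host path and password are illustrative, and /opt/oracle/scripts/setup is the directory the Oracle images use for first-startup scripts; the real file is in the GitHub repo):

version: '3.3'
services:
  db:
    image: oracle/database:19.3.0-ee
    ports:
      - "1521:1521"
    environment:
      - ORACLE_PWD=Oracle_2019       # illustrative SYS/SYSTEM password
    volumes:
      - /home/data/db/ols:/opt/oracle/oradata   # must be owned by 54321:54321
    configs:
      - source: 00-unzip-ols.sh
        target: /opt/oracle/scripts/setup/00-unzip-ols.sh
      - source: 01-ols-ins.sh
        target: /opt/oracle/scripts/setup/01-ols-ins.sh
      - source: 02-clean-up-ols-files.sh
        target: /opt/oracle/scripts/setup/02-clean-up-ols-files.sh
configs:
  00-unzip-ols.sh:
    external: true
  01-ols-ins.sh:
    external: true
  02-clean-up-ols-files.sh:
    external: true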

The other services in the stack definition are there for development purposes; remember that Scotas OLS is at the GitHub repo and free as in beer.

Before creating your RDBMS with OLS installed, check that your persistent Oracle database data directory has UID/GID 54321:54321. If that is already done, just create your Docker Swarm stack using:

$ docker stack deploy -c docker-compose.yml ols
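
If the data directory is not prepared yet, something like this fixes the ownership before deploying (the host path is illustrative):

$ mkdir -p /home/data/db/ols
$ sudo chown 54321:54321 /home/data/db/ols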

Once the DB is created and OLS installed, you will see log output like:

$ docker service logs -f ols_db
……..
ols_db.1.a37moycgrhtv@pocho | #########################
ols_db.1.a37moycgrhtv@pocho | DATABASE IS READY TO USE!
ols_db.1.a37moycgrhtv@pocho | #########################

Simple enough, no? Have fun using Scotas OLS on 19c!

Push Connector 2.0 for Oracle XE 18c

At the end of 2018 Oracle finally released the long-awaited 18c XE for Linux, and a few days ago the Windows version.

This new release includes native support for JSON inside the Oracle Database, which is good news because:

  • it standardizes the API (no more dependency on the PLJson library or APEX_JSON)
  • performance: this support is implemented entirely in C
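
As a trivial illustration of the native API, JSON can be generated and queried directly from SQL (illustrative statements):

SQL> SELECT JSON_OBJECT('id' VALUE 1, 'msg' VALUE 'hello') FROM dual;
SQL> SELECT JSON_VALUE('{"id":1,"msg":"hello"}', '$.msg') FROM dual;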

Using this new feature we completely rewrote the Scotas Push Connector for Solr on top of the new API, and since XE is free, the Scotas Push Connector is also delivered as an open source version with the source at GitHub!

Deploying this new version is as simple as deploying a new Docker stack on your Swarm cluster or a single server; a sample stack is in the docker-compose.yml file. Here are the main components:

[Screenshot: Solr service definition]

The first part of the stack starts a new Solr instance to back the Push Connector's searching facilities; remember that the Push Connector implements its search functionality using an outside instance of Apache Solr.
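
A minimal sketch of what that service might look like, based on the official Solr image (the tag, port and host path are illustrative; the exact definition is in docker-compose-pc.yml):

  solr:
    image: solr:7.7.2          # illustrative tag
    ports:
      - "8983:8983"
    volumes:
      - /home/data/solr:/opt/solr/server/solr   # optional persistent storage, illustrative path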

[Screenshot: Oracle XE service definition]

The second service is an Oracle RDBMS instance with two important parts: external storage for the database data, as the Oracle Docker image suggests (you can do the same with the persistent storage of the Apache Solr image), and environment variables defining the initial SYS password for the XE instance (mandatory during PC installation) and the hostname and port of the Solr instance.
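
A sketch of that service under the same assumptions (the ORACLE_PWD value matches the log output below; the Solr host/port variable names are hypothetical, check the sample file for the real ones):

  db:
    image: oracle/database:18.4.0-xe   # illustrative tag
    ports:
      - "1521:1521"
    environment:
      - ORACLE_PWD=Oracle_2018   # initial SYS/SYSTEM password, mandatory for PC installation
      - SOLR_HOST=solr           # hypothetical variable name for the Solr hostname
      - SOLR_PORT=8983           # hypothetical variable name for the Solr port
    volumes:
      - /home/data/db/xe:/opt/oracle/oradata   # external storage for the database data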

Finally, we provide in the stack the startup scripts for installing Scotas Push Connector 2.0.0; these scripts are automatically called just after the Oracle XE instance is created.

[Screenshot: startup scripts attached as Docker configs]
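
A sketch of how those scripts might be attached to the db service (the config names match the ones created below; the setup directory is the convention used by the Oracle Docker images):

    configs:
      - source: 00-unzip-pc
        target: /opt/oracle/scripts/setup/00-unzip-pc.sh
      - source: 01-pc-ins
        target: /opt/oracle/scripts/setup/01-pc-ins.sh
      - source: 02-clean-up-files
        target: /opt/oracle/scripts/setup/02-clean-up-files.sh
configs:
  00-unzip-pc:
    external: true
  01-pc-ins:
    external: true
  02-clean-up-files:
    external: true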

To deploy this stack, either on a Swarm cluster or a single node with Docker installed, just proceed with these steps:

  • Checkout GitHub scripts

$ git clone https://github.com/scotas/docker-images.git
$ cd docker-images/pc-18cXE

  • Create configs for Oracle XE startup scripts

$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/00-unzip-pc.sh|docker config create 00-unzip-pc -
$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/01-pc-ins.sh|docker config create 01-pc-ins -
$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/02-clean-up-files.sh|docker config create 02-clean-up-files -

  • Deploy sample stack

$ cd ../sample-stacks
$ docker stack deploy pc -c docker-compose-pc.yml
$ docker service logs -f pc_db
pc_db.1.y7826xns4ig6@pocho | ORACLE PASSWORD FOR SYS AND SYSTEM: Oracle_2018
….
pc_db.1.y7826xns4ig6@pocho | old 33: insert into PC.bg_process (bg_process_name,host_name,port) values ('Default &connector Server','&newhost',&newport);
pc_db.1.y7826xns4ig6@pocho | new 33: insert into PC.bg_process (bg_process_name,host_name,port) values ('Default solr Server','solr',8983);
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | PL/SQL procedure successfully completed.
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | Elapsed: 00:00:10.63
pc_db.1.y7826xns4ig6@pocho | Disconnected from Oracle Database 18c Express Edition Release 18.0.0.0.0 – Production
pc_db.1.y7826xns4ig6@pocho | Version 18.4.0.0.0
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | /opt/oracle/runOracle.sh: running /opt/oracle/scripts/setup/02-clean-up-files.sh
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | DONE: Executing user defined scripts
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | The Oracle base remains unchanged with value /opt/oracle
pc_db.1.y7826xns4ig6@pocho | #########################
pc_db.1.y7826xns4ig6@pocho | DATABASE IS READY TO USE!
pc_db.1.y7826xns4ig6@pocho | #########################


Testing your Solr Push connector

Once you have the stack deployed and your XE ready to use, a simple test can be executed by connecting to Oracle and creating a scott account, for example:

$ /opt/sqlcl/bin/sql
Username? (''?) sys/Oracle_2018@localhost:1521/XEPDB1 as sysdba
Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
SQL> create user scott identified by tiger
2 default tablespace users
3 temporary tablespace temp
4* quota unlimited on users;
User SCOTT created.

SQL> grant connect,resource,pcuser to scott;

Grant succeeded.


Preparing Solr to index a sample table

Unlike Scotas OLS, which runs Apache Solr inside the RDBMS, the Push Connector technology relies on an external Solr instance. To prepare the Solr mappings for indexing a sample table, just do these steps in a shell inside the Solr instance:

$ docker exec -ti pc_solr.1.tn78o19x9qov3635yhkyyea6r /bin/bash
solr@solr:/opt/solr$ bin/solr create_core -c source_big_pidx
....
Created new core 'source_big_pidx'
solr@solr:/opt/solr$ curl -X POST -H 'Content-type:application/json' --data-binary '{
> "add-field":{
> "name":"rowid",
> "type":"string",
> "indexed":true,
> "stored":true,
> "required":true,
> "multiValued":false
> },
> "add-field":{
> "name":"solridx",
> "type":"string",
> "indexed":true,
> "stored":true,
> "required":true,
> "multiValued":false
> }
> }' http://localhost:8983/solr/source_big_pidx/schema
{
"responseHeader":{
"status":0,
"QTime":336}}
solr@solr:/opt/solr$ sed -i 's/<uniqueKey>id/<uniqueKey>rowid/g' /opt/solr/server/solr/source_big_pidx/conf/managed-schema
solr@solr:/opt/solr$ curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=source_big_pidx"
{
"responseHeader":{
"status":0,
"QTime":1253}}
solr@solr:/opt/solr$ curl -X POST -H 'Content-type:application/json' --data-binary '{
> "add-field":{
> "name":"title",
> "type":"text_general",
> "indexed":true,
> "stored":true,
> "multiValued":true
> },
> "add-copy-field" :{
> "source":"*",
> "dest":"_text_"
> }
> }' http://localhost:8983/solr/source_big_pidx/schema
{
"responseHeader":{
"status":0,
"QTime":188}}
solr@solr:/opt/solr$ curl http://localhost:8983/solr/source_big_pidx/schema/uniquekey?wt=json
{
"responseHeader":{
"status":0,
"QTime":0},
"uniqueKey":"rowid"}


Index your first table

Once Apache Solr is ready to receive the Push Connector indexing information, we can do this:

SQL> conn scott/tiger@localhost:1521/XEPDB1
Connected.
SQL> set long 10000 lines 140 pages 50 timing on echo on
SQL> set serveroutput on size 1000000
SQL> create table test_source_big as (
2 select owner,name,type,line,text from (select rownum as ntop_pos,q.* from
3 (select * from all_source) q)
4 where ntop_pos>=0 and ntop_pos<=5000
5 );

Table TEST_SOURCE_BIG created.

Elapsed: 00:00:05.460
SQL> CREATE INDEX SOURCE_BIG_PIDX ON TEST_SOURCE_BIG(TEXT)
2 INDEXTYPE IS PC.SOLR
3 parameters('{LogLevel:"INFO",
4 Updater:"solr@8983",
5 Searcher:"solr@8983",
6 SolrBase:"solr/source_big_pidx",
7 SyncMode:"OnLine",
8 BatchCount:5000,
9 CommitOnSync:true,
10 LockMasterTable:false,
11 IncludeMasterColumn:false,
12 DefaultColumn:"text",
13 HighlightColumn:"title",
14 ExtraCols:"\"text\" value text,\"title\" value substr(text,1,256),\"line_i\" value line,\"type_s\" value type"}');

Index SOURCE_BIG_PIDX created.

Elapsed: 00:00:05.923

Note that the ExtraCols parameter uses the latest json_object column syntax.
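
Conceptually, each row is projected into a Solr document using the same 'key' VALUE expression pairs that json_object accepts; this illustrative query shows the shape of one such document:

SQL> SELECT JSON_OBJECT('text' VALUE text,
  2                     'title' VALUE substr(text,1,256),
  3                     'line_i' VALUE line,
  4                     'type_s' VALUE type)
  5  FROM test_source_big WHERE rownum = 1;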


Doing some queries

If you look at the logs of the Solr instance during indexing you will see something like:

$ docker service logs -f pc_solr

pc_solr.1.tn78o19x9qov@pocho | 2019-01-28 13:00:50.281 INFO (qtp735937428-21) [ x:source_big_pidx] o.a.s.u.DirectUpdateHandler2 end_commit_flush
pc_solr.1.tn78o19x9qov@pocho | 2019-01-28 13:00:50.285 INFO (qtp735937428-21) [ x:source_big_pidx] o.a.s.u.p.LogUpdateProcessorFactory [source_big_pidx] webapp=/solr path=/update/json params={waitSearcher=true&ident=on&commit=true&softCommit=false&wt=json&expungeDeletes=false}{add=[AAAR79AAMAAAADjAAA (1623909145638862848), AAAR79AAMAAAADjAAB (1623909145757351936), AAAR79AAMAAAADjAAC (1623909145761546240), AAAR79AAMAAAADjAAD (1623909145765740544), AAAR79AAMAAAADjAAE (1623909145769934848), AAAR79AAMAAAADjAAF (1623909145772032000), AAAR79AAMAAAADjAAG (1623909145775177728), AAAR79AAMAAAADjAAH (1623909145778323456), AAAR79AAMAAAADjAAI (1623909145781469184), AAAR79AAMAAAADjAAJ (1623909145785663488), … (5000 adds)],commit=} 0 6174

It tells you that Apache Solr is receiving batches of 5000 rows from the XE instance for indexing.

After that, here are some sample outputs of querying Oracle with the Scotas Push Connector domain index functionality. Counting hits:

SQL> select SolrPushConnector.countHits('SOURCE_BIG_PIDX','*:*') from dual;

SOLRPUSHCONNECTOR.COUNTHITS('SOURCE_BIG_PIDX','*:*')
----------------------------------------------------
5000

Elapsed: 00:00:00.030

query with highlighting:

SQL> select sscore(1),shighlight(1) from test_source_big where scontains(text,'"procedure java"~10',1)>0 order by sscore(1) desc;

SSCORE(1)
----------
SHIGHLIGHT(1)
--------------------------------------------------------------------------------------------------------------------------------------------
1.3910314
{"title":[" -- <em>procedure</em> executing the recursive statement via DBMS_SQL, a <em>Java</em> stored\n"]}
Elapsed: 00:00:00.347

domain index pagination:

SQL> select /*+ DOMAIN_INDEX_SORT */ sscore(1) sc,text from test_source_big where scontains(text,'rownum:[1 TO 10] AND function',1)>0 order by sscore(1) asc;
1.3612658
member function getBlobVal(csid IN number, pflag IN number, indent IN number) return BLOB deterministic parallel_enable
…………
1.5529456
If it is based on a user defined type, this function will return
10 rows selected.

Elapsed: 00:00:00.112

faceting and pivot:

SQL> declare
2 obj f_info;
3 BEGIN
4 obj := SolrPushConnector.facet('SOURCE_BIG_PIDX',null,'facet.field=type_s&facet.limit=5&facet.pivot=type_s,line_i');
5 dbms_output.put_line(obj.fields.to_string);
6 dbms_output.put_line(obj.pivots.to_string);
7 END;
8 /
{"type_s":["PACKAGE",13493,"TYPE",6438,"FUNCTION",64,"PROCEDURE",5]}
{"type_s,line_i":[{"field":"type_s","value":"PACKAGE","count":13493,"pivot":[{"field":"line_i","value":1,"count":20},{"field":"line_i","value":2,"count":19},{"field":"line_i","value":3,"count":19},{"field":"line_i","value":4,"count":19},{"field":"line_i","value":5,"count":19}]},{"field":"type_s","value":"TYPE","count":6438,"pivot":[{"field":"line_i","value":1,"count":456},{"field":"line_i","value":2,"count":429},{"field":"line_i","value":3,"count":346},{"field":"line_i","value":4,"count":295},{"field":"line_i","value":5,"count":283}]},{"field":"type_s","value":"FUNCTION","count":64,"pivot":[{"field":"line_i","value":1,"count":14},{"field":"line_i","value":2,"count":14},{"field":"line_i","value":3,"count":10},{"field":"line_i","value":4,"count":6},{"field":"line_i","value":5,"count":5}]},{"field":"type_s","value":"PROCEDURE","count":5,"pivot":[{"field":"line_i","value":1,"count":1},{"field":"line_i","value":2,"count":1},{"field":"line_i","value":3,"count":1},{"field":"line_i","value":4,"count":1},{"field":"line_i","value":5,"count":1}]}]}
PL/SQL procedure successfully completed.

Elapsed: 00:00:00.080

More samples are in the source, in the testSourceBig.sql script and the tutorial directory.

Scotas OLS official Docker image build scripts

We uploaded to GitHub two official Docker build image scripts for Scotas OLS 2.0.1 (RDBMS 12c R1) and OLS 2.0.2 (RDBMS 12c R2).

Both scripts are based on the official Oracle RDBMS 12c Docker build image scripts, and they make the installation of Oracle plus Scotas OLS very easy.

They basically start from your own Oracle 12c Docker image and install OLS's binaries and license. Follow these steps to get your OLS up and running on Linux/Mac/Windows:

First check that you have the 12c image ready, for example:

[mochoa@localhost ols-12cr2]$ docker image ls|grep oracle
….
oracle/database                12.2.0.1-ee                 1a870e15682d                6 weeks ago            14.8 GB
….

Then check out our scripts from GitHub using:

[mochoa@localhost tmp]$ git clone https://github.com/scotas/docker-images.git
Cloning into ‘docker-images’…
remote: Counting objects: 39, done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 39 (delta 14), reused 32 (delta 10), pack-reused 0
Unpacking objects: 100% (39/39), done.
Checking connectivity… done.
[mochoa@localhost tmp]$ cd docker-images/ols-12cr2/
[mochoa@localhost ols-12cr2]$ ./buildDockerImage.sh

.. Docker build steps here (requires internet connection)..

Once you have the Scotas OLS image ready in your local Docker images repository, you can install OLS at the first startup of your Docker container using:

docker run --name <container name> \
-p <host port>:1521 -p <host port>:5500 \
-e ORACLE_SID=<your SID> \
-e ORACLE_PDB=<your PDB name> \
-e ORACLE_PWD=<your database passwords> \
-e ORACLE_CHARACTERSET=<your character set> \
-v [<host mount point>:]/opt/oracle/oradata \
ols-official:2.0.2
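
For example, a first run with concrete (illustrative) values could be:

docker run --name ols \
-p 1521:1521 -p 5500:5500 \
-e ORACLE_SID=ORCLCDB \
-e ORACLE_PDB=ORCLPDB1 \
-e ORACLE_PWD=Oracle_2017 \
-e ORACLE_CHARACTERSET=AL32UTF8 \
-v /home/data/db/ols12c:/opt/oracle/oradata \
ols-official:2.0.2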

Important: unlike the official Oracle Docker image script, the first run requires the ORACLE_PWD argument; it doesn't work with a randomly generated password. During this first startup your RDBMS will create all the datafiles in the <host mount point> directory and unpack and install Scotas OLS; you will see something like this:

ODCI patched
Installing OLS…
OLS installed see /home/oracle/install-OLS.log and /home/oracle/install-OLS.err files for details
OLS Installed OK….
#########################
DATABASE IS READY TO USE!
#########################

After a successful database creation you can start/stop the container, or rm and run it again, without the ORACLE_PWD argument.

Note: the CDB and PDB$SEED are patched with an updated version of the ODCI.jar library; the library included with RDBMS 12c R1 and R2 is outdated and doesn't have correct wrapper files for the latest ODCI library, see this post for more details.

Note: during first startup you will see an ora_m00n_pid process consuming a lot of CPU. This is because once you start using Java inside the database, a JIT compilation phase starts converting Java code to native code; it covers the complete Java/Oracle stack and the Scotas OLS code. After a week of OLS usage all the Java code will have been compiled and this process will take only some CPU; obviously, once all the code is compiled your indexing and query times will be faster.

Installing Scotas OLS at 12c/Docker

Scotas OLS is Solr Lucene + Oracle

Docker is the new way of doing DevOps operations, especially during the development process.

It is possible to run Scotas OLS with Oracle 12c on Docker following the post On docker, Ubuntu and Oracle RDBMS.

Assuming that there is a Docker – Oracle 12c template ready for use:

# docker images
REPOSITORY TAG
oracle-12102 latest
solr/ols latest
alvinhenrick/hadoop-nn-dn latest
alvinhenrick/hadoop-dn latest
alvinhenrick/hadoop-base latest
alvinhenrick/serf latest
java openjdk-6-jdk
ubuntu latest
ubuntu trusty
oraclelinux 6

Once we have the image ready, the first step is to prepare a shared directory on the host machine to be published into a test container; here are the steps:

# mkdir -p /var/lib/docker/db/test
# chown 54321:54321 /var/lib/docker/db/test
# cp /var/lib/docker/dockerfiles/build-oracle-12102/db_install-simple.dbt /var/lib/docker/db/test/db_install.dbt

db_install.dbt must have at least these two features enabled: JSERVER and XDB_PROTOCOLS. The first is obviously required because Scotas OLS is entirely written in Java (Apache Solr); the second because Scotas OLS storage can be exported through the XMLDB repository (WebDAV/FTP).
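
In a standard DBCA template the relevant entries look roughly like this (a sketch, assuming the usual <option> format of .dbt files):

<option name="JSERVER" value="true"/>
<option name="XDB_PROTOCOLS" value="true"/>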

The first docker run operation will create an empty database named test, for example:

# docker run --privileged=true --ipc=host --volume=/var/lib/docker/db/test:/u01/app/oracle/data --name test --hostname test --detach=true --publish=1521:1521 --publish=9099:9099 oracle-12102

It will take a long time depending on your hardware; in the meantime the log can be checked using:

# docker logs -f test

Once the Oracle RDBMS is created, a good option is to stop and remove the container, then start it again to see that everything is fine:

# docker stop test
# docker rm test

Note that the docker rm operation will not remove your created DB; it only removes the changes made to the container during the run. You can check that by inspecting the shared directory:

# cd /var/lib/docker/db/test
# ls
DATABASE_IS_SETUP db_install.dbt test

Now, to install Scotas OLS we have to copy the install zip file, the license, and the patched ODCI.jar, as described in the post Upgrading system libraries…, for example:

# cp /tmp/lucene-ols-bin-2.0.1.zip /var/lib/docker/db/test
# cp /tmp/ODCI.jar /var/lib/docker/db/test
# cp /tmp/valid.ols /var/lib/docker/db/test

These files copied on the host machine will be available inside the Docker container during the next run:

# docker run --privileged=true --ipc=host --volume=/var/lib/docker/db/test:/u01/app/oracle/data --name test --hostname test --detach=true --publish=1521:1521 --publish=9099:9099 oracle-12102
# docker exec -ti test bash
[root@test oracle]# id
uid=0(root) gid=0(root) groups=0(root)

The last step attaches interactively to the running container using a bash command; according to the oracle-12102 template image, the shell is connected as root.

The first step is to log in as the oracle owner and check SQL*Net connectivity and the SYS password; the oracle-12102 template leaves the SYS password defined as oracle:

[root@test oracle]# su -l oracle
[oracle@test ~]$ tnsping test
TNS Ping Utility for Linux: Version 12.1.0.2.0 – Production on 01-SEP-2015 13:49:19
Copyright (c) 1997, 2014, Oracle. All rights reserved.
Used parameter files:

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = test)))
OK (0 msec)
[oracle@test ~]$ sqlplus "sys/oracle@test as sysdba"
SQL*Plus: Release 12.1.0.2.0 Production on Tue Sep 1 13:49:30 2015
Copyright (c) 1982, 2014, Oracle. All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

If everything is fine we can start installing Scotas OLS: first patching ODCI.jar, then updating the Scotas OLS license, and finally running the installer:

[oracle@test ~]$ cd /tmp
[oracle@test tmp]$ unzip /u01/app/oracle/data/lucene-ols-bin-2.0.1.zip
Archive: /u01/app/oracle/data/lucene-ols-bin-2.0.1.zip
… lots of messages here …
[oracle@test tmp]$ cp /u01/app/oracle/data/ODCI.jar ols/
[oracle@test tmp]$ cp /u01/app/oracle/data/valid.ols ols/bin/com/scotas/license/valid.ols
[oracle@test tmp]$ cd ols
[oracle@test ols]$ loadjava -f -r -v -s -g public -u sys/oracle@test ODCI.jar
…. lots of messages here ….
[oracle@test ols]$ ant -Ddb.str=test -Ddb.tbl=SYSAUX -Ddba.usr=sys -Ddba.pwd=oracle
…. lots of output here ….

The last step starts the Scotas OLS installer using the sys user, the oracle password, and the target database test with the SYSAUX tablespace (created by default during installation).

At this point Scotas OLS should be installed; you can start using it by shutting down and starting up your database or, if that is not possible, by manually enabling the OLS parallel shared server processes:

[oracle@test ols]$ ant -Ddb.str=test -Ddba.usr=sys -Ddba.pwd=oracle enable-jobs
… some SQLPlus messages here ….
BUILD SUCCESSFUL
Total time: 1 second

Now we are ready to test Scotas OLS running in a Docker container. There are some tests prepared to run as scott, but the scott user is not included by default in the RDBMS installation; we can create it with these SQL commands (logged in as SYS):

SQL> create user scott identified by tiger
default tablespace SYSAUX
temporary tablespace TEMP
quota unlimited on SYSAUX;
SQL> grant connect,resource,luceneuser to scott;

The SYSAUX tablespace is not the right place to hold scott's data, but it is the tablespace available in a default RDBMS installation.

With this user we can run some tests using the Ant tool, for example the Scotas OLS tutorial, by editing the file db/sqlunit.properties:

[oracle@test ols]$ cat db/sqlunit.properties
# SQLUnit test
sqlunit.driver = oracle.jdbc.driver.OracleDriver
sqlunit.url = jdbc:oracle:oci:@test
sqlunit.user = scott
sqlunit.password = tiger
[oracle@test ols]$ ant test-tutorial

One or more SQLUnit tests may fail (see sqlunit-test-tutorial.txt for details); this is sometimes caused by a different timezone configuration, but most of the tests will pass.

Another test is test-parallel-ols; it simulates a heavy multi-tier application which inserts, updates, deletes, searches, and counts hits on a simple table. The parameters for it should be changed directly in the build.xml file, for example:

<target name="test-parallel-ols" description="Run Test on Domain Index parallel functionality OLS">
<junit printsummary="yes" fork="no" haltonfailure="yes" showoutput="yes">
<sysproperty key="db.usr" value="scott"/>
<sysproperty key="db.pwd" value="tiger"/>
<sysproperty key="db.str" value="test"/>
<sysproperty key="idx.upd" value="0"/>
<sysproperty key="idx.sch" value="0,1"/>
<sysproperty key="db.online-mode" value="true"/>
<sysproperty key="db.commitOnSync" value="false"/>
<sysproperty key="db.bufferedQueue" value="true"/>
<sysproperty key="db.batch-size" value="200"/>

Sample test output:

[oracle@test ols]$ ant test-parallel-ols
Buildfile: build.xml

test-parallel-ols:
[junit] Running com.scotas.solr.index.TestDBIndexParallel
[junit] Table created: T1
[junit] Index altered: B$SIT1
[junit] OnLine: true CommitOnSync: false
[junit] Avg time on Solr deletes: 2
[junit] Avg time on Solr deletes: 1
[junit] Avg time on Solr updates: 2
[junit] Avg time on Solr updates: 2
[junit] Avg time on Solr updates: 2
[junit] Avg time on Solr deletes: 16
[junit] Avg time on Solr inserts: 0
[junit] Avg time on Solr inserts: 0
[junit] Avg time on Solr inserts: 0
[junit] Avg time on Solr count hits: 26
[junit] Avg time on Solr search: 33
[junit] Avg time on Solr search: 33
[junit] Avg time on Solr count hits: 27
[junit] Avg time on Solr search: 32
[junit] Avg time on Solr count hits: 27
[junit] Index droped: B$SIT1
[junit] Table droped: T1
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 310.189 sec

BUILD SUCCESSFUL
Total time: 5 minutes 10 seconds

And that's all: Scotas OLS works perfectly using Oracle 12c on Docker.

Apache Solr tutorial in a relational way

Many of the questions that come up when starting with Scotas products stem from a lack of knowledge about how to implement typical Solr operations in a SQL way, or vice versa.

The idea of this tutorial is to do the same things explained in the Apache Solr tutorial, but with relational concepts, using Scotas OLS.

Test Transparent Data Encryption on Scotas’s OLS

This example shows how Scotas OLS helps you protect your data. Following this guide, sensitive data is never exposed in Solr/Lucene and your application stays compliant with the Health Insurance Portability and Accountability Act (HIPAA) and Payment Card Industry (PCI) data security standards.

– First, enable TDE on the Oracle database (the wallet configuration should be set up first), connected as SYS:

SQL> ALTER SYSTEM SET ENCRYPTION WALLET CLOSE IDENTIFIED BY test;
SQL> ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY test;
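
The wallet configuration mentioned above typically lives in sqlnet.ora; a sketch, with an illustrative wallet directory:

ENCRYPTION_WALLET_LOCATION =
  (SOURCE = (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = /u01/app/oracle/admin/orcl/wallet)))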

– Check the data file locations to create a new small one:

SQL> SELECT RPAD(tablespace_name,18)||' '||RTRIM(file_name) Tablespace FROM dba_data_files ORDER BY tablespace_name;

TABLESPACE
——————————————————————————–
EXAMPLE /u01/app/oracle/oradata/orcl/example01.dbf
SYSAUX /u01/app/oracle/oradata/orcl/sysaux01.dbf
SYSTEM /u01/app/oracle/oradata/orcl/system01.dbf
UNDOTBS1 /u01/app/oracle/oradata/orcl/undotbs01.dbf
USERS /u01/app/oracle/oradata/orcl/users01.dbf


SQL> CREATE TABLESPACE test
               DATAFILE '/u01/app/oracle/oradata/orcl/test01.dbf' SIZE 128K
               AUTOEXTEND ON NEXT 64K;


SQL> CREATE USER test identified by test
               temporary tablespace temp
               default tablespace test
               quota unlimited on test;

– Connect and test non-secure storage by checking for text stored unencrypted in the datafile.

SQL> GRANT connect,resource,luceneuser TO test;

SQL> CONN test/test@orcl

SQL> CREATE TABLE tde_test (
               id NUMBER(10),
               data VARCHAR2(50)
               )
              TABLESPACE test;

SQL> INSERT INTO tde_test (id, data) VALUES (1, 'This is a secret!');

SQL> commit;

SQL> CREATE index tde_idx on tde_test(id)
               indextype is Lucene.SolrIndex
               parameters('LogLevel:INFO;IncludeMasterColumn:false;ExtraCols:data;HighlightColumn:DATA');

SQL> SELECT count(*) from tde_test where scontains(id,'secret')>0;
             COUNT(*)
             ————
                     1

– The table and index were created; we can connect as SYS, flush the memory structures to disk, and verify the string using the grep utility.

SQL> conn / as sysdba
SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;
              System altered.
SQL> !grep "This is a secret" /u01/app/oracle/oradata/orcl/test01.dbf
             Binary file /u01/app/oracle/oradata/orcl/test01.dbf matches

– Drop the user and datafile and start the test again.


SQL> ALTER SYSTEM SET ENCRYPTION WALLET CLOSE IDENTIFIED BY test;

SQL> ALTER SYSTEM SET ENCRYPTION WALLET OPEN IDENTIFIED BY test;

SQL> drop user test cascade;

SQL> DROP TABLESPACE test INCLUDING CONTENTS AND DATAFILES;

-------------------- recreate here the tablespace test and the user --------------------

– Connect again using the new user but now using TDE on the column and the index.

SQL> CONN test/test@orcl

SQL> CREATE TABLE tde_test (
               id NUMBER(10),
               data VARCHAR2(50) ENCRYPT
               )
              TABLESPACE test;

SQL> INSERT INTO tde_test (id, data) VALUES (1, 'This is a secret!');

SQL> COMMIT;

SQL> SELECT * FROM tde_test;


ID DATA
———- ————————————————–
1 This is a secret!


SQL> create index tde_idx on tde_test(id)
               indextype is Lucene.SolrIndex
               parameters('PopulateIndex:false;IncludeMasterColumn:false;ExtraCols:data;HighlightColumn:DATA');

SQL> select count(*) from tde_test where scontains(id,'secret')>0;

COUNT(*)
———-
0

– PopulateIndex:false was used so as not to fill the Solr/Lucene structures until encryption is enabled on the LOB column which stores the Lucene files.

SQL> alter table tde_idx$t modify lob(data) (encrypt using '3des168');

SQL> alter index tde_idx rebuild parameters('LogLevel:INFO');

SQL> select count(*) from tde_test where scontains(id,'secret')>0;

COUNT(*)
———-
1


SQL> select sscore(1),shighlight(1) from tde_test where scontains(id,'secret','score desc',1)>0;

SSCORE(1)  SHIGHLIGHT(1)
---------- --------------------------------------------------
.306852818 This is a <em>secret</em>!

– Now connect again and verify that the sensitive text is not visible in the table data or the index structures.

SQL> conn / as sysdba

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE;

System altered.

– Now the grep utility can't find any matches on the complete text or parts of it (keywords):

SQL> !grep "This is a secret" /u01/app/oracle/oradata/orcl/test01.dbf

SQL> !grep "secret" /u01/app/oracle/oradata/orcl/test01.dbf

Conclusion: sensitive data is never exposed in Solr/Lucene and your application stays compliant with the Health Insurance Portability and Accountability Act (HIPAA) and Payment Card Industry (PCI) data security standards.

How to…? – Scotas and Apex

This example shows how easily you can add typical free-text searching features such as faceting, more-like-this, and auto-complete, among others, using the Scotas Push Connector.

The example was tested with the following assumptions:

  • You already have installed Scotas PC.
  • You have created one user, in our case “scott”.
  • You have created a table FILM_ADMIN_TEST (download here).
  • Solr instance running in localhost, port 8983.

Step one – Indexing

  • As sysdba, grant the PCUSER role to the user scott:

                      SQL> grant PCUSER to scott;

  • As the scott user, create a PC index on the FILM_ADMIN_TEST table:

SQL> CREATE INDEX FILM_ADMIN_TEST_SIDX ON SCOTT.FILM_ADMIN_TEST(TEXT) INDEXTYPE IS PC.SOLR PARAMETERS('{Updater: "localhost@8983", Searcher: "localhost@8983", CommitOnSync:true, SyncMode:OnLine, LockMasterTable:false, IncludeMasterColumn:false, HighlightColumn:" \"text_t\"", DefaultColumn:"text", ExtraCols:"TEXT \"text_t\",GENRE \"genre_s\",COUNTRY \"country_s\",LANGUAGE \"language_s\",DIRECTED_BY \"directed_by_s\""}');

Note: this statement creates an index over the field "text"; we also included other fields to get facets.

Step two – Apex. The demo application should be ready to use on the Oracle XE installation.

  • Edit the Apex application, logging in at

http://127.0.0.1:8080/apex/

  • Select Application Builder -> Create
  • Enter a name for the application.

[Screenshot: Step 01]

  • Create a new Page

Select a Report form type, and use a SQL query as the Page Source. Insert the SQL query with the Scotas facets.

Query:

SELECT QRYTEXT AS LANGUAGE, HITS AS HITS FROM TABLE( facet_to_tbl( json_list( json( SOLRPUSHCONNECTOR.FACET('FILM_ADMIN_TEST_SIDX', '*', 'facet.field=language_s').FIELDS).get(1))))

Note: the Push Connector returns facets in JSON format. To use the facets as a table, we provide the function facet_to_tbl.

  • Select add page

Select the button "Add Page" to create this page. After that, you should see the page in the list.

  • Create tabs
  • Shared components
  • Application Attributes
  • User interface
  • Create Application

Step three – Run application

Run the application using your username/password, in our case scott/tiger.

The resulting page shows the language facets with their hit counts; from here you can, for example, make the results clickable to select a language, or filter by ranges.

Conclusion: this is only a simple example of using Scotas features with Apex. From now on you can make use of the best of each technology.

Scotas OLS and Oracle BigData architecture

Following the Oracle Information Architecture: An Architect's Guide to Big Data, I modified figure 3 to show how the Scotas OLS product fits into this reference architecture.

As you can see, Scotas OLS search integration sits at the decision layer, providing:

  • Rich free text searching a la Google
  • Faceted Search and Filtering
  • Scalable index maintenance, no index fragmentation, NRT data changes
  • Advanced, Configurable Text Analysis
  • A Real Data Schema, with Numeric Types, Dynamic Fields, Unique Keys
  • Powerful Extensions to the Lucene Query Language
  • Geospatial Search

With these additions, decision makers have at hand a natively SQL-integrated Solr solution which can be used from any Oracle analytic tool.

5 minutes tutorial with ElasticSearch Push Connector

Continuing with the 5 minutes tutorial started in the previous post, this one is about how to start working with the Scotas ElasticSearch Push Connector technology.

After you install the Push Connector and your ElasticSearch (ES) server, you can do a simple test using the examples included in the script located in your distribution directory – the db/tutorial/esTutorial.sql script.

Assuming the ES server is installed with defaults, we can start it using the -f flag to see the log output on the console, for example:

[mochoa@localhost elasticsearch-0.19.4]$ export JAVA_HOME=/usr/local/jdk1.6
[mochoa@localhost elasticsearch-0.19.4]$ export PATH=$JAVA_HOME/bin:$PATH
[mochoa@localhost elasticsearch-0.19.4]$ bin/elasticsearch -f
[2012-06-22 11:00:40,089][INFO ][node ] [Grim Hunter] {0.19.4}[4061]: initializing …

[2012-06-22 11:00:46,899][INFO ][http ] [Grim Hunter] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.100.55:9200]}
[2012-06-22 11:00:46,900][INFO ][node ] [Grim Hunter] {0.19.4}[4061]: started
[2012-06-22 11:00:46,907][INFO ][gateway ] [Grim Hunter] recovered [0] indices into cluster_state

As you see above, the ES server is listening on TCP/IP port 9200, which is the default value used by the Push Connector (PC).

First, on another console, we need to connect as a user with the DBA role to grant the PCUSER role to scott. Then we are able to start working with the connector examples using the traditional scott user; let's see:

-bash-4.2$ sqlplus "/ as sysdba"

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jun 22 11:01:10 2012

Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> grant PCUSER TO scott;

Grant succeeded.

Now we are ready to start creating a table to store a simple tweet structure as is shown in many ES examples.

CREATE TABLE ES_TUTORIAL (
USER_ID VARCHAR2(30) PRIMARY KEY,
MESSAGE VARCHAR2(140),
POST_DATE TIMESTAMP);

and populate it with some rows:

insert into ES_TUTORIAL values (
'kimchy',
'trying out Elastic Search',
TO_TIMESTAMP('2009-11-15T14:12:12','YYYY-MM-DD"T"HH24:MI:SS'));

insert into ES_TUTORIAL (
select 'kimchy-'||rownum,'trying out Elastic Search '||rownum,TO_TIMESTAMP(sysdate+rownum,'YYYY-MM-DD"T"HH24:MI:SS') from dual connect by rownum <= 100);

insert into ES_TUTORIAL (
select 'kimchy-'||(rownum+100),'trying out Elastic Search '||rownum,TO_TIMESTAMP(sysdate+rownum,'YYYY-MM-DD"T"HH24:MI:SS') from dual connect by rownum <= 100);

insert into ES_TUTORIAL (
select 'kimchy-'||(rownum+200),'trying out Elastic Search '||rownum,TO_TIMESTAMP(sysdate+rownum,'YYYY-MM-DD"T"HH24:MI:SS') from dual connect by rownum <= 100);

commit;

Note that we are using TO_TIMESTAMP SQL function to format an Oracle timestamp value to a Java compatible timestamp format.

Now we are ready to index the above table:

[mochoa@localhost pc]$ sqlplus scott/tiger@test

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jun 22 11:38:21 2012

Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> set define off timing on serverout on
SQL> CREATE INDEX TWEET_ESIDX ON ES_TUTORIAL(MESSAGE) INDEXTYPE IS PC.ES
PARAMETERS('{Settings:{"number_of_shards" : 1},SyncMode:OnLine,LockMasterTable:false,IncludeMasterColumn:false,ExtraCols:"user_id \"user\",to_char(post_date,"YYYY-MM-DD\"T\"HH24:MI:SS") \"post_date\",message \"message\""}');

Index created.

Elapsed: 00:00:00.74

we can see the output at the ES console telling us that a new index has been created:

[2012-06-22 11:38:47,837][INFO ][cluster.metadata ] [Grim Hunter] [scott.tweet_esidx] creating index, cause [api], shards [1]/[1], mappings []
[2012-06-22 11:38:48,404][INFO ][cluster.metadata ] [Grim Hunter] [scott.tweet_esidx] update_mapping [row] (dynamic)

Note that the index was created at the ES server with the syntax [schema].[index_name] in lower case.

Here is an explanation of the parameters used at index creation time:

  • Settings, any ES settings parameter; in the example, the sharding configuration.
  • SyncMode, OnLine means Near Real Time (NRT) mode for inserts/updates and Real Time (RT) mode for deletes; if you need information about this configuration please read the Scotas OLS White Paper, section Working modes, replacing Solr by ES.
  • LockMasterTable, false means that during the table scan for selecting rows there is no row-level locking; when this parameter is true, rows are selected in "for update" mode.
  • IncludeMasterColumn, false means that the values of the master column (defined in the create index syntax, MESSAGE) will not be included in the index structure; please see the next parameter.
  • ExtraCols, the list of columns, formats, and field names used in the ES index structure. The example above shows that the ES index will include three fields: user, post_date and message. Note that fields are case sensitive and here we are lower-casing them.

Now the index is ready to use and has been populated with the 301 rows in the table. To check that all rows were properly indexed we can query the rowid field; this field is always indexed and works as the id in the ES index:

SQL> select ElasticSearch.countHits('TWEET_ESIDX','rowid:*') from dual;
301
Elapsed: 00:00:00.04

We can also select rows that match a specific string, for example searching for the string 53:

SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:53',1)>0;
2.327857                      trying out Elastic Search 53
2.327857                      trying out Elastic Search 53
2.327857                      trying out Elastic Search 53
Elapsed: 00:00:00.14

a few comments about the above syntax for querying ES:

  • econtains, the operator used to query ES through the Push Connector:
    – first argument: the master column of the index, the one included in the create index syntax;
    – second argument: the query string, using the ES q parameter of the _search URL;
    – third argument: the ancillary operator identifier (used by escore(), see the next item);
    – return value: >0 when the row matches the input query.
  • escore, used to get the score computed by the econtains() operator for the specific row; note that the input parameter must be correlated with the last argument of econtains().

Now let's explain what NRT means with a few queries; let's see what happens during an update:

SQL> update ES_TUTORIAL set message=message||'-mod' where rownum<=10;
10 rows updated.

The RDBMS tells us that 10 rows were updated, but NRT defers this update to the ES server until we commit the transaction; we can easily check this by using the above select to see if it returns mod as a positive hit:

SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
no rows selected

Now, what happens when we commit the pending changes of the current transaction:

SQL> commit;
Commit complete.
SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
no rows selected
SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;

1.6282115     trying out Elastic Search 76-mod
1.6282115     trying out Elastic Search 77-mod
1.6282115     trying out Elastic Search 78-mod
1.6282115     trying out Elastic Search 79-mod
1.6282115     trying out Elastic Search 80-mod
1.6282115     trying out Elastic Search 81-mod
1.6282115     trying out Elastic Search 82-mod
1.6282115     trying out Elastic Search 83-mod
1.6282115     trying out Elastic Search 84-mod
1.6282115     trying out Elastic Search 85-mod
10 rows selected.

after a few milliseconds the ES layer receives the update and rows are returned as positive hits.

Now we can test RT deletes, for example:

SQL> delete from es_tutorial where rowid in (select rowid from ES_TUTORIAL where econtains(message,'message:mod',1)>0);
10 rows deleted.
SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
no rows selected

As you can see, after the 10 rows were deleted they were immediately no longer returned as positive hits (Real Time); for the current transaction these 10 rows are not available unless we do a rollback.

To check what happens on the ES side we can open another concurrent transaction in parallel and check whether these rows are visible to other users (remember that the transaction is not committed yet); let's see a second connection:

SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
1.6282115     trying out Elastic Search 76-mod
…..
1.6282115     trying out Elastic Search 85-mod
10 rows selected.

Now we commit the deleted rows in the first connection:

SQL> commit;
Commit complete.
SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
no rows selected

and see what happens at the second connection:

SQL> select escore(1),message from ES_TUTORIAL where econtains(message,'message:mod',1)>0;
no rows selected

The second connection gets no hits because the rows were definitively deleted and the ES server received the changes.

Finally, by dropping the index, the ES server drops the index structure too; here is the SQL command:

SQL> drop index TWEET_ESIDX;
Index dropped.

and the ES server console:

[2012-06-22 14:52:59,460][INFO ][cluster.metadata ] [Grim Hunter] [scott.tweet_esidx] deleting index

And that's all. As you can see, the ElasticSearch Push Connector works like all Scotas searching products; functionality not covered in this blog post includes faceting, sorting by score or other fields, detecting changes on multiple columns, etc., but all of it is supported by the connector.

 

5 minutes tutorial with Scotas OLS

If you already have Scotas OLS installed, this is a 5 minute guide to using the Oracle-Solr integration.

The Scotas OLS binary distribution has sample configuration files located in the db/tutorial directory, used to reproduce all the operations described in the Solr Tutorial but using Scotas OLS.

Here is the complete list of steps:

1) Grant the LUCENEUSER role to SCOTT (this step can be skipped if already done):

-bash-4.2$ sqlplus "/ as sysdba"
SQL*Plus: Release 11.2.0.3.0 Production on Thu May 10 08:07:51 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> grant LUCENEUSER to SCOTT;
Grant succeeded.

2) Upload the Solr-related files to the XMLDB repository (assuming that we will create an index named SCOTT.TUTORIAL_SIDX):

# cd db/tutorial/LUCENE.TUTORIAL_SIDX/conf
# ftp localhost 2100
Connected to localhost.localdomain.
220- localhost.localdomain
Unauthorised use of this FTP server is prohibited and may be subject to civil and criminal prosecution.
220 localhost.localdomain FTP Server (Oracle XML DB/Oracle Database) ready.
Name (localhost:mochoa): scott
331 pass required for SCOTT
Password: *****
230 SCOTT logged in
Remote system type is Unix.
ftp> cd public/solr
250 CWD Command successful
ftp> mkdir SCOTT.TUTORIAL_SIDX
250 CWD Command successful
ftp> cd SCOTT.TUTORIAL_SIDX
250 CWD Command successful
ftp> mkdir conf
250 CWD Command successful
ftp> cd conf
250 CWD Command successful
ftp> mput *
mput elevate.xml [anpqy?]? a
………
ftp> dir
227 Entering Passive Mode (127,0,0,1,249,27)
150 ASCII Data Connection
-rw-r--r-- 1 SCOTT oracle 1274 MAY 10 11:15 elevate.xml
-rw-r--r-- 1 SCOTT oracle 78514 MAY 10 11:15 mapping-FoldToASCII.txt
-rw-r--r-- 1 SCOTT oracle 2868 MAY 10 11:15 mapping-ISOLatin1Accent.txt
-rw-r--r-- 1 SCOTT oracle 367 MAY 10 11:15 mb_attributes.txt
-rw-r--r-- 1 SCOTT oracle 121 MAY 10 11:15 mb_letterBuckets.txt
-rw-r--r-- 1 SCOTT oracle 873 MAY 10 11:15 protwords.txt
-rw-r--r-- 1 SCOTT oracle 30717 MAY 10 11:15 schema.xml
-rw-r--r-- 1 SCOTT oracle 921 APR 13 13:46 scripts.conf
-rw-r--r-- 1 SCOTT oracle 13267 MAY 10 11:15 solrconfig.xml
-rw-r--r-- 1 SCOTT oracle 13 MAY 10 11:15 spellings.txt
-rw-r--r-- 1 SCOTT oracle 1171 MAY 10 11:15 stopwords.txt
-rw-r--r-- 1 SCOTT oracle 1132 MAY 10 11:15 synonyms.txt
226 ASCII Transfer Complete
ftp> pwd
257 "/public/solr/SCOTT.TUTORIAL_SIDX/conf" is current directory.
ftp> quit

3) Create and populate a sample table. This table holds a relational version of the Solr tutorial documents (these SQL statements are in the db/tutorial/schemaTutorial.sql file).

[mochoa@localhost tutorial]$ sqlplus scott/tiger@orcl
SQL*Plus: Release 11.2.0.3.0 Production on Thu May 10 08:22:44 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> set define off
SQL> CREATE TABLE OLS_TUTORIAL ( ID VARCHAR2(30) PRIMARY KEY,
NAME VARCHAR2(400),
MANU VARCHAR2(4000),
CAT VARCHAR2(400),
FEATURES CLOB,
INCLUDES VARCHAR2(4000),
WEIGHT NUMBER,
PRICE NUMBER,
POPULARITY NUMBER,
INSTOCK CHAR(5), -- true or false
MANUFACTUREDATE_DT TIMESTAMP,
PAYLOADS VARCHAR2(4000),
STORE VARCHAR2(200)) ;
Table created.
SQL> insert into ols_tutorial values ( 'SP2514N', 'Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133', 'Samsung Electronics Co. Ltd.', 'electronics hard-drive', '7200RPM, 8MB cache, IDE Ultra ATA-133 NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor', NULL, NULL, 92, 6, 'true', TO_TIMESTAMP('2006-02-13T15:26:37Z','YYYY-MM-DD"T"HH24:MI:SS"Z"'), NULL, '35.0752,-97.032') ;
…….
1 row created.
SQL> commit;
Commit complete.
SQL> select count(*) from OLS_TUTORIAL;
COUNT(*)
———-
16

4) Indexing using Scotas OLS.

SQL> CREATE INDEX TUTORIAL_SIDX ON OLS_TUTORIAL(ID) INDEXTYPE IS LUCENE.SOLRINDEX
parameters('Updater:localhost@9099;Searcher:localhost@9099;SyncMode:OnLine;LockMasterTable:false;IncludeMasterColumn:false;LogLevel:ALL;MltColumn:title;HighlightColumn:name,features;DefaultColumn:text;ExtraCols:id "id",cat "cat",name "name",features "features",manu "manu",includes "includes",price "price",popularity "popularity",inStock "inStock",to_char(manufacturedate_dt,"YYYY-MM-DD"T"HH24:MI:SS"Z"") "manufacturedate_dt"');
Index created.

5) Doing some queries (equivalent to the "…and now you can search for all sorts of things" section of the Solr Tutorial; there are more query examples in the db/tutorial/testTutorial.xml file in the form of SQLUnit syntax):

SQL> SELECT /*+ DOMAIN_INDEX_SORT */ id FROM OLS_TUTORIAL T where scontains(id,'video')>0;
ID
——————————
MA147LL/A
100-435805
EN7800GTX/2DHTV/256M

SQL> SELECT id FROM OLS_TUTORIAL T WHERE SCONTAINS(ID,'name:video')>0;
ID
——————————
MA147LL/A
100-435805

SQL> SELECT id,price FROM OLS_TUTORIAL WHERE SCONTAINS(ID,'video AND price:[* TO 400]')>0;
ID PRICE
—————————— ———-
MA147LL/A                           399

6) Other complex queries (Faceted Search example):

SQL> SELECT FIELD,SJOIN(T.FACETS) F FROM TABLE(SFACETS(USER||'.TUTORIAL_SIDX','*:*','facet.field=cat&facet.field=inStock')) T;
cat
electronics(14),memory(3),connector(2),graphics-card(2),hard-drive(2),monitor(2),search(2),software(2),camera(1),copier(1),multifunction-printer(1),music(1),printer(1),scanner(1)
inStock
true(12),false(4)

7) More queries (Text Analysis)

SQL> SELECT /*+ DOMAIN_INDEX_SORT */ SSCORE(1),name FROM OLS_TUTORIAL T WHERE SCONTAINS(ID,'power-shot',1)>0;
.683216989
Canon PowerShot SD500

SQL> SELECT /*+ DOMAIN_INDEX_SORT */ SSCORE(1),NAME,features FROM OLS_TUTORIAL T WHERE SCONTAINS(ID,'features:recharging',1)>0;
.336813927
Apple 60 GB iPod with Video Playback Black iTunes, Podcasts, Audiobooks Stores up to 15,000 songs, 25,000 photos, or 150 hours of video 2.5-inch, 320x240 color TFT LCD display with LED backlight Up to 20 hours of battery life Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video Notes, Calendar, Phone book, Hold button, Date display, Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB 2.0 compatibility, Playback speed control, Rechargeable capability, Battery level indication

And that's all. Here are some notes on the index parameters defined above:

  • IncludeMasterColumn:false, specifies that the column ID, which is the master column of the index defined in the SQL DDL CREATE INDEX TUTORIAL_SIDX ON OLS_TUTORIAL(ID)…, is not included as a Solr field during index construction.
  • MltColumn:title, defines which column is used with the More Like This ancillary operator mlt(), not shown in the examples.
  • HighlightColumn:name,features, the columns used for highlighting with the shighlight() ancillary operator, not shown in the examples.
  • DefaultColumn:text, which column is used as the default for searching when using Solr query syntax in the scontains() operator; note that text is not a table column, it is a virtual field defined in schema.xml with the Solr copyField functionality and works as a bag including all the Solr fields (cat, name, manu, features, includes); see the schema.xml sketch after this list.
  • ExtraCols:id "id",cat "cat",name "name",features "features",manu "manu",includes "includes",price "price",popularity "popularity",inStock "inStock",to_char(manufacturedate_dt,"YYYY-MM-DD"T"HH24:MI:SS"Z"") "manufacturedate_dt", these are the table columns defined as Solr fields; note that we are using double quotes (") to define case-sensitive Solr field names and a special to_char mask to format the dates in the manufacturedate_dt column in Solr date format.
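
A sketch of how such a virtual text field is typically wired in schema.xml (the field names follow the tutorial; the attribute values are illustrative):

<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="cat" dest="text"/>
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
<copyField source="includes" dest="text"/>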