Push Connector 2.0 for Oracle XE 18c

At the end of 2018 Oracle finally released a long awaited 18c XE release for Linux and a few days ago Windows version.

This new release included a native support for Json inside the Oracle Database this is a good new because:

  •  standardize the API (no more dependency over PLJson Library or Apex JSON)
  • performance, this support is complete implemented in C

using this new feature We complete rewrite Scotas Push Connector for Solr using the new API and due XE is free, Scotas Push connector is delivered also as Open Source version with the source at GitHub!!!

To deploy this new version is simple as deploy a new Docker stack on your Swarm cluster or single server, a sample stack is at docker-compose.yml file. Here main components:

Screenshot from 2019-02-26 10-53-15

 

 

 

 

 

 

 

 

first part of the stack is to start a new Solr instance to assist Solr Push connector searching facilities, remember that Push Connector implements Searching functionality using outside instance of Apache Solr.

Screenshot from 2019-02-26 10-56-03

 

 

 

 

second service is an Oracle RDBMS instance with two important parts, an external storage for the database data as the Oracle Docker image suggest, you can do the same with the persistent storage of Apache Solr image, also We define as environment variables the initial SYS password for the XE instance (this is mandatory during PC installation) and the hostname and port for Solr instance.

Finally We provide into the stack the startup scripts for installing Scotas Push Connector 2.0.0, this scripts are automatically called just after the Oracle XE instance is created.

Screenshot from 2019-02-26 11-04-23

 

 

 

to deploy this stack either in Swarm cluster or a simple node with Docker installed just proceed with these steps:

  • Checkout GitHub scripts

$ git clone https://github.com/scotas/docker-images.git
$ cd docker-images/pc-18cXE

  • Create configs for Oracle XE startup scripts

$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/00-unzip-pc.sh|docker config create 00-unzip-pc –
$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/01-pc-ins.sh|docker config create 01-pc-ins –
$ curl https://raw.githubusercontent.com/scotas/docker-images/master/pc-18cXE/02-clean-up-files.sh|docker config create 02-clean-up-files -

  • Deploy sample stack

$ cd ../sample-stacks
$ docker stack deploy pc -c docker-compose-pc.yml
$ docker service logs -f pc_db
pc_db.1.y7826xns4ig6@pocho | ORACLE PASSWORD FOR SYS AND SYSTEM: Oracle_2018
….
pc_db.1.y7826xns4ig6@pocho | old 33: insert into PC.bg_process (bg_process_name,host_name,port) values (‘Default &connector Server’,’&newhost’,&newport);
pc_db.1.y7826xns4ig6@pocho | new 33: insert into PC.bg_process (bg_process_name,host_name,port) values (‘Default solr Server’,’solr’,8983);
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | PL/SQL procedure successfully completed.
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | Elapsed: 00:00:10.63
pc_db.1.y7826xns4ig6@pocho | Disconnected from Oracle Database 18c Express Edition Release 18.0.0.0.0 – Production
pc_db.1.y7826xns4ig6@pocho | Version 18.4.0.0.0
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | /opt/oracle/runOracle.sh: running /opt/oracle/scripts/setup/02-clean-up-files.sh
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | DONE: Executing user defined scripts
pc_db.1.y7826xns4ig6@pocho |
pc_db.1.y7826xns4ig6@pocho | The Oracle base remains unchanged with value /opt/oracle
pc_db.1.y7826xns4ig6@pocho | #########################
pc_db.1.y7826xns4ig6@pocho | DATABASE IS READY TO USE!
pc_db.1.y7826xns4ig6@pocho | #########################


Testing your Solr Push connector

Once you have the stack deployed and your XE ready to use a simple test could be executed connecting to Oracle and creating an scott account, for example:

$ /opt/sqlcl/bin/sql
Username? (”?) sys/Oracle_2018@localhost:1521/XEPDB1 as sysdba
Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 – Production
SQL> create user scott identified by tiger
2 default tablespace users
3 temporary tablespace temp
4* quota unlimited on users;
User SCOTT created.

SQL> grant connect,resource,pcuser to scott;

Grant succeeded.


Preparing Solr to index a sample table

Unlike Scotas OLS which use Apache Solr inside the RDBMS, Push Connector technology relies on an external Solr instance to prepare Solr mappings for storing a sample table indexing just do these steps in shell inside the Solr instance:

$ docker exec -ti pc_solr.1.tn78o19x9qov3635yhkyyea6r /bin/bash
solr@solr:/opt/solr$ bin/solr create_core -c source_big_pidx
….
Created new core ‘source_big_pidx’
solr@solr:/opt/solr$ curl -X POST -H ‘Content-type:application/json’ –data-binary ‘{
> “add-field”:{
> “name”:”rowid”,
> “type”:”string”,
> “indexed”:true,
> “stored”:true,
> “required”:true,
> “multiValued”:false
> },
> “add-field”:{
> “name”:”solridx”,
> “type”:”string”,
> “indexed”:true,
> “stored”:true,
> “required”:true,
> “multiValued”:false
> }
> }’ http://localhost:8983/solr/source_big_pidx/schema
{
“responseHeader”:{
“status”:0,
“QTime”:336}}
solr@solr:/opt/solr$ sed -i ‘s/<uniqueKey>id/<uniqueKey>rowid/g’ /opt/solr/server/solr/source_big_pidx/conf/managed-schema
solr@solr:/opt/solr$ curl “http://localhost:8983/solr/admin/cores?action=RELOAD&core=source_big_pidx”
{
“responseHeader”:{
“status”:0,
“QTime”:1253}}
solr@solr:/opt/solr$ curl -X POST -H ‘Content-type:application/json’ –data-binary ‘{
> “add-field”:{
> “name”:”title”,
> “type”:”text_general”,
> “indexed”:true,
> “stored”:true,
> “multiValued”:true
> },
> “add-copy-field” :{
> “source”:”*”,
> “dest”:”_text_”
> }
> }’ http://localhost:8983/solr/source_big_pidx/schema
{
“responseHeader”:{
“status”:0,
“QTime”:188}}
solr@solr:/opt/solr$ curl http://localhost:8983/solr/source_big_pidx/schema/uniquekey?wt=json
{
“responseHeader”:{
“status”:0,
“QTime”:0},
“uniqueKey”:”rowid”}


Index your first table

Once Apache Solr is ready to receive Push Connector indexing information We can do these:

SQL> conn scott/tiger@localhost:1521/XEPDB1
Connected.
SQL> set long 10000 lines 140 pages 50 timing on echo on
SQL> set serveroutput on size 1000000
SQL> create table test_source_big as (
2 select owner,name,type,line,text from (select rownum as ntop_pos,q.* from
3 (select * from all_source) q)
4 where ntop_pos>=0 and ntop_pos<=5000
5 );

Table TEST_SOURCE_BIG created.

Elapsed: 00:00:05.460
SQL> CREATE INDEX SOURCE_BIG_PIDX ON TEST_SOURCE_BIG(TEXT)
2 INDEXTYPE IS PC.SOLR
3 parameters(‘{LogLevel:”INFO”,
4 Updater:”solr@8983″,
5 Searcher:”solr@8983″,
6 SolrBase:”solr/source_big_pidx“,
7 SyncMode:”OnLine”,
8 BatchCount:5000,
9 CommitOnSync:true,
10 LockMasterTable:false,
11 IncludeMasterColumn:false,
12 DefaultColumn:”text”,
13 HighlightColumn:”title”,
14 ExtraCols:””text” value text,”title” value substr(text,1,256),”line_i” value line,”type_s” value type”}’);

Index SOURCE_BIG_PIDX created.

Elapsed: 00:00:05.923

Note that ExtraCols parameter is using latest json_object column syntax.


Doing some queries

If you look at the logs of Solr instance during the indexing time you will see something like:

$ docker service logs -f pc_solr

pc_solr.1.tn78o19x9qov@pocho | 2019-01-28 13:00:50.281 INFO (qtp735937428-21) [ x:source_big_pidx] o.a.s.u.DirectUpdateHandler2 end_commit_flush
pc_solr.1.tn78o19x9qov@pocho | 2019-01-28 13:00:50.285 INFO (qtp735937428-21) [ x:source_big_pidx] o.a.s.u.p.LogUpdateProcessorFactory [source_big_pidx] webapp=/solr path=/update/json params={waitSearcher=true&ident=on&commit=true&softCommit=false&wt=json&expungeDeletes=false}{add=[AAAR79AAMAAAADjAAA (1623909145638862848), AAAR79AAMAAAADjAAB (1623909145757351936), AAAR79AAMAAAADjAAC (1623909145761546240), AAAR79AAMAAAADjAAD (1623909145765740544), AAAR79AAMAAAADjAAE (1623909145769934848), AAAR79AAMAAAADjAAF (1623909145772032000), AAAR79AAMAAAADjAAG (1623909145775177728), AAAR79AAMAAAADjAAH (1623909145778323456), AAAR79AAMAAAADjAAI (1623909145781469184), AAAR79AAMAAAADjAAJ (1623909145785663488), … (5000 adds)],commit=} 0 6174

it tell you that Apache Solr is receiving from XE instance batchs of 5000 rows for indexing.

After that here a sample outputs of querying Oracle with Scotas Push connector domain index functionality, count hits:

SQL> select SolrPushConnector.countHits(‘SOURCE_BIG_PIDX’,’*:*’) from dual;

SOLRPUSHCONNECTOR.COUNTHITS(‘SOURCE_BIG_PIDX’,’*:*’)
—————————————————-
5000

Elapsed: 00:00:00.030

query with highlighting:

SQL> select sscore(1),shighlight(1) from test_source_big where scontains(text,'”procedure java”~10′,1)>0 order by sscore(1) desc;

SSCORE(1)
———-
SHIGHLIGHT(1)
——————————————————————————————————————————————–
1.3910314
{“title”:[” — <em>procedure</em> executing the recursive statement via DBMS_SQL, a <em>Java</em> stored\n”]}
Elapsed: 00:00:00.347

domain index pagination:

SQL> select /*+ DOMAIN_INDEX_SORT */ sscore(1) sc,text from test_source_big where scontains(text,’rownum:[1 TO 10] AND function’,1)>0 order by sscore(1) asc;
1.3612658
member function getBlobVal(csid IN number, pflag IN number, indent IN number) return BLOB deterministic parallel_enable
…………
1.5529456
If it is based on a user defined type, this function will return
10 rows selected.

Elapsed: 00:00:00.112

faceting and pivot:

SQL> declare
2 obj f_info;
3 BEGIN
4 obj := SolrPushConnector.facet(‘SOURCE_BIG_PIDX’,null,’facet.field=type_s&facet.limit=5&facet.pivot=type_s,line_i’);
5 dbms_output.put_line(obj.fields.to_string);
6 dbms_output.put_line(obj.pivots.to_string);
7 END;
8 /
{“type_s”:[“PACKAGE”,13493,”TYPE”,6438,”FUNCTION”,64,”PROCEDURE”,5]}
{“type_s,line_i”:[{“field”:”type_s”,”value”:”PACKAGE”,”count”:13493,”pivot”:[{“field”:”line_i”,”value”:1,”count”:20},{“field”:”line_i”,”valu
e”:2,”count”:19},{“field”:”line_i”,”value”:3,”count”:19},{“field”:”line_i”,”value”:4,”count”:19},{“field”:”line_i”,”value”:5,”count”:19}]},{
“field”:”type_s”,”value”:”TYPE”,”count”:6438,”pivot”:[{“field”:”line_i”,”value”:1,”count”:456},{“field”:”line_i”,”value”:2,”count”:429},{“fi
eld”:”line_i”,”value”:3,”count”:346},{“field”:”line_i”,”value”:4,”count”:295},{“field”:”line_i”,”value”:5,”count”:283}]},{“field”:”type_s”,”
value”:”FUNCTION”,”count”:64,”pivot”:[{“field”:”line_i”,”value”:1,”count”:14},{“field”:”line_i”,”value”:2,”count”:14},{“field”:”line_i”,”val
ue”:3,”count”:10},{“field”:”line_i”,”value”:4,”count”:6},{“field”:”line_i”,”value”:5,”count”:5}]},{“field”:”type_s”,”value”:”PROCEDURE”,”cou
nt”:5,”pivot”:[{“field”:”line_i”,”value”:1,”count”:1},{“field”:”line_i”,”value”:2,”count”:1},{“field”:”line_i”,”value”:3,”count”:1},{“field”
:”line_i”,”value”:4,”count”:1},{“field”:”line_i”,”value”:5,”count”:1}]}]}
PL/SQL procedure successfully completed.

Elapsed: 00:00:00.080

more samples at the source at the script testSourceBig.sql and tutorial directory.

Scotas OLS official Docker image build scripts

We uploaded to GitHub two official Docker build image scripts for Scotas OLS 2.0.1 (RDBMS 12c R1) and OLS 2.0.2 (RDBMS 12c R2).

Both scripts are based on Oracle official RDBMS 12c Docker build images scripts, and they make the installation of Oracle/Scotas OLS too easy.

It basically start by using your own Oracle 12c Docker image and install OLS’s binaries and license. Follow these steps to get your OLS up and running on Linux/Mac/Windows:

First check that you have the 12c image ready, for example:

[mochoa@localhost ols-12cr2]$ docker image ls|grep oracle
….
oracle/database                12.2.0.1-ee                 1a870e15682d                6 weeks ago            14.8 GB
….

then checkout our scripts from GitHub using:

[mochoa@localhost tmp]$ git clone https://github.com/scotas/docker-images.git
Cloning into ‘docker-images’…
remote: Counting objects: 39, done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 39 (delta 14), reused 32 (delta 10), pack-reused 0
Unpacking objects: 100% (39/39), done.
Checking connectivity… done.
[mochoa@localhost tmp]$ cd docker-images/ols-12cr2/
[mochoa@localhost ols-12cr2]$ ./buildDockerImage.sh

.. Docker build steps here (requires internet connection)..

once you have Scotas OLS image ready at your local Docker images repository you can install OLS at first startup of your Docker container using:

docker run –name <container name> \
-p <host port>:1521 -p <host port>:5500 \
-e ORACLE_SID=<your SID> \
-e ORACLE_PDB=<your PDB name> \
-e ORACLE_PWD=<your database passwords> \
-e ORACLE_CHARACTERSET=<your character set> \
-v [<host mount point>:]/opt/oracle/oradata \
ols-official:2.0.2

Important: unlike Oracle official Docker image script first run requires ORACLE_PWD argument, it doesn’t work with random generated password. During this first startup your RDBMS will create all the datafiles at <host mount point> directory and unpack and install Scotas OLS, you will see some like this:

ODCI patched
Installing OLS…
OLS installed see /home/oracle/install-OLS.log and /home/oracle/install-OLS.err files for details
OLS Installed OK….
#########################
DATABASE IS READY TO USE!
#########################

after a successful Database creation you can start/stop the container, rm and run again without the ORACLE_PWD argument.

Note: that the CDB and PDB$SEED are patched against an update version of ODCI.jar library, the included library for RDBMS 12c R1 and R2 are outdated and doesn’t have a correct wrapper files for latest ODCI library, see this post for more details.

Note: During first startup you will see an ora_m00n_pid process consuming a lot of CPU, this is because once you start using Java inside the Database a JIT compilation phase start converting Java code to assembler, it includes a complete Java/Oracle stack and Scotas OLS code, after a week of OLS usage all Java code must be compiled to assembler and this process only takes some CPU usage, obviously when all the code is compiled your indexing and query time will be faster.