Scaling OLS (Oracle/Solr Integration)

OLS offers several ways to deal with large data sets, leveraging Oracle built-in features available in RAC and multi-core environments.

A difference from Solr distributed search/sharding is that OLS uses a transactional, shared Lucene storage: when one process commits a change, it is immediately visible to all other processes, with no post-commit copy step required. Think of it as Solr over a SAN, but with SQL transactional behavior added.

To illustrate this functionality we can use the index optimize task. This task takes a long time to execute and can hit Out Of Memory exceptions if some Solr parameters are not fine-tuned. Let's look at an example.

In console 1, a running test suite simulates multiple users issuing searches against a table with around 600K rows:

[mochoa@localhost ols]$ ant test-queryhits-solr
Buildfile: build.xml
[echo] Building ols…
test-queryhits-solr:
[junit] Running com.scotas.solr.indexer.TestQueryHits
[junit] Hits: 36707

[junit] Elapsed time: 335
….
[junit] Query time: 59
[junit] Elapsed time: 12983
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 13.323 sec
BUILD SUCCESSFUL
Total time: 14 seconds

In a parallel console 2, an optimize task is started:

[mochoa@localhost Download]$ sqlplus scott/tiger@orcl
SQL*Plus: Release 11.2.0.3.0 Production on Tue Apr 24 08:46:26 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> exec SolrDomainIndex.optimize('SOURCE_BIG_SIDX');

We can see that CPU 4 is fully busy with the optimize task (PID 10567), while the other Oracle processes at the top of the list are handling search requests from the client side (the Java test suite, oracleorcl[LOCAL=NO]) and the Shared Searcher process (ora_j010_orcl).

Note that the other processes are not affected by the optimize task running as PID 10567. To show what happens in the worst case, we will kill that process, simulating a catastrophic problem such as an OOM exception or a RAC node failure:

[mochoa@localhost ~]$ su
Password:
[root@localhost mochoa]# kill -9 10567
———————————————————————————
SQL> exec SolrDomainIndex.optimize('SOURCE_BIG_SIDX');
BEGIN SolrDomainIndex.optimize('SOURCE_BIG_SIDX'); END;
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 10567
Session ID: 77 Serial number: 109
Elapsed: 00:01:05.96

When the test suite is started again, we can see that the Oracle RDBMS automatically rolled back any uncommitted changes to the Lucene storage, so the killed optimization task did not corrupt the storage and can be started again without any problem. Let's see:

[mochoa@localhost ols]$ ant test-queryhits-solr
Buildfile: build.xml
[echo] Building ols…
test-queryhits-solr:

[junit] Running com.scotas.solr.indexer.TestQueryHits
[junit] Hits: 36707
[junit] Elapsed time: 275
….
[junit] Query time: 20
[junit] Elapsed time: 11293
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 11.571 sec
BUILD SUCCESSFUL
Total time: 12 seconds

SQL> set timing on
SQL> exec SolrDomainIndex.optimize('SOURCE_BIG_SIDX');
PL/SQL procedure successfully completed.
Elapsed: 00:04:34.63
SQL> commit;
Commit complete.

Another OLS feature is parallel execution, which can be used to speed up heavy indexes by configuring multiple processes listening on different ports and altering the index to use them. For example, here we have three OLS parallel/shared server processes running:

-bash-4.2$ netstat -anp|grep ora_j0
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 :::9099 :::* LISTEN 4479/ora_j010_orcl
tcp 0 0 :::9199 :::* LISTEN 4483/ora_j012_orcl
tcp 0 0 :::9299 :::* LISTEN 4485/ora_j013_orcl

Now, by executing an alter command, the above index is configured to use two processes for reading and one process for updating the index:

SQL> alter index SOURCE_BIG_SIDX parameters('Searcher:localhost@9099,localhost@9199;
Updater:localhost@9299');
Index altered.
Elapsed: 00:00:00.28

Any write operation committed through the updater process (PID 4485) causes the other two readers (PID 4479/4483) to hot-reload their index structures in RAM, so they see the changes transparently.
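As a quick illustration, here is a minimal sketch of that behavior. It assumes the SOURCE_BIG table has ID and TEXT columns (hypothetical names) and that its index was created with SyncMode:OnLine:

-- hypothetical column names (ID, TEXT); SyncMode:OnLine is assumed
UPDATE source_big SET text = 'shared storage test' WHERE id = 1;
COMMIT;
-- immediately served by one of the Searcher processes (9099/9199), no manual reload needed
SELECT COUNT(*) FROM source_big WHERE SCONTAINS(text, 'shared', 1) > 0;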

The examples above show how easily OLS can scale in multi-core environments.

Another important piece of functionality is partitioning/sharding. OLS automatically inherits partitioning from tables on Oracle Enterprise Edition. For example, here we have a table with 24 partitions created as:

CREATE TABLE "SCOTT"."MB_SALES"
( "ID" VARCHAR2(20 BYTE),
"S_DATE" DATE,
…..
"T_TRM_LOOKUPS" NUMBER(10,0)
) COMPRESS
PARTITION BY RANGE (s_date)
(
PARTITION SALES_2010_01 VALUES LESS THAN (TO_DATE('01-FEB-2010','DD-MON-YYYY')),
….
PARTITION SALES_2011_12 VALUES LESS THAN (TO_DATE('01-JAN-2012','DD-MON-YYYY')),
PARTITION SALES_2012 VALUES LESS THAN (MAXVALUE)
) NOLOGGING;

We create an OLS index using:

CREATE INDEX SIDX ON MB_SALES (T_NAME)
INDEXTYPE IS LUCENE.SOLRINDEX
PARAMETERS ('PopulateIndex:false;DefaultColumn:t_name;IncludeMasterColumn:false;
BatchCount:4000;LockMasterTable:false;LogLevel:ALL;SyncMode:OnLine;
HighlightColumn:t_name;ExtraCols:t_name "t_name",t_duration "t_duration",
t_a_name "t_a_name",t_r_name "t_r_name",t_r_attributes "r_type",t_r_tracks "t_r_tracks",
t_trm_lookups "t_trm_lookups"')
LOCAL parallel 2;

By setting PopulateIndex:false, we can later define specific parameters for each partition:

alter index SIDX REBUILD PARTITION SALES_2010_01 parameters ('Searcher:localhost@9099;Updater:localhost@9099');
alter index SIDX REBUILD PARTITION SALES_2010_02 parameters ('Searcher:localhost@9199;Updater:localhost@9199');

When the index is populated, port 9099 serves requests for partition SALES_2010_01, port 9199 for SALES_2010_02, and so on.
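Configuring the remaining partitions one by one is tedious, so a small PL/SQL block can spread them across the available processes. This is only a sketch, assuming a simple round-robin assignment over the three ports shown above is acceptable:

DECLARE
  TYPE t_ports IS TABLE OF VARCHAR2(30);
  -- the three shared server processes shown in the netstat output above
  l_ports t_ports := t_ports('localhost@9099', 'localhost@9199', 'localhost@9299');
  l_port  VARCHAR2(30);
  i       PLS_INTEGER := 0;
BEGIN
  FOR p IN (SELECT partition_name
              FROM user_ind_partitions
             WHERE index_name = 'SIDX'
             ORDER BY partition_position) LOOP
    l_port := l_ports(MOD(i, l_ports.COUNT) + 1);
    -- same ALTER INDEX ... REBUILD PARTITION ... PARAMETERS syntax as above
    EXECUTE IMMEDIATE
      'ALTER INDEX SIDX REBUILD PARTITION ' || p.partition_name ||
      ' PARAMETERS (''Searcher:' || l_port || ';Updater:' || l_port || ''')';
    i := i + 1;
  END LOOP;
END;
/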

Finally, Oracle automatically controls which partitions of the shard are visited by a query. For example:

SQL> explain plan for
SELECT SHIGHLIGHT(1),SSCORE(1) FROM MB_SALES WHERE S_DATE BETWEEN TO_DATE('01-JAN-2010','DD-MON-YYYY') AND TO_DATE('31-MAY-2010','DD-MON-YYYY') AND SCONTAINS(T_NAME,'rownum:[1 TO 10] AND love',1)>0
ORDER BY SSCORE(1) DESC;

……..

——————————————————–
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
——————————————————–
| 0 | SELECT STATEMENT | | 26026 | 1092K| | 2841 (1)| 00:00:35 | | |
| 1 | SORT ORDER BY | | 26026 | 1092K| 1448K| 2841 (1)| 00:00:35 | | |
| 2 | PARTITION RANGE ITERATOR | | 26026 | 1092K| | 2553 (1)| 00:00:31 | 1 | 5 |
|* 3 | TABLE ACCESS BY LOCAL INDEX ROWID| MB_SALES | 26026 | 1092K| | 2553 (1)| 00:00:31 | 1 | 5 |
|* 4 | DOMAIN INDEX | SIDX | | | | | | | |
——————————————————–

Predicate Information (identified by operation id):
——————————————————–

3 - filter("S_DATE"<=TO_DATE(' 2010-05-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "S_DATE">=TO_DATE('
2010-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
4 - access("LUCENE"."SCONTAINS"("T_NAME",'rownum:[1 TO 10] AND love',1)>0)

18 rows selected.

Elapsed: 00:00:01.80

Note that the Pstart and Pstop columns show that the RDBMS optimizer knows that only partitions 1 to 5 of the index need to be visited to satisfy the query with the condition between 01-JAN-2010 and 31-MAY-2010. This functionality is known as partition pruning and is automatically available to OLS.

The advantages of partitioning/sharding are not only the ability to transparently distribute searches across multiple processes: each index structure is also smaller, so less RAM is used during index operations and less time is spent on maintenance operations such as optimization or export/import.

For example, the MB_SALES table has 167M rows, but each partition has only 6.9M rows, and the index structure uses only 1.5 GB of space per partition.

Note: using inline OLS pagination with partitioned tables causes the query to return more rows than expected. For example, if we define rownum:[1 TO 10], instead of the 10 rows expected the query above returns 50 rows (10 for each of the 5 partitions visited), and the SSCORE(1) sort has to be re-executed at the SQL level. If we need only 10 rows, the solution is to use Top-N syntax or to limit the cursor result at the client side.
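As an illustration, here is a minimal sketch of the Top-N approach, simply wrapping the query above in standard Oracle Top-N syntax (the column aliases are mine):

-- classic Top-N: each partition returns its page, then keep the global top 10
SELECT * FROM (
  SELECT SHIGHLIGHT(1) snippet, SSCORE(1) score
    FROM MB_SALES
   WHERE S_DATE BETWEEN TO_DATE('01-JAN-2010','DD-MON-YYYY')
                    AND TO_DATE('31-MAY-2010','DD-MON-YYYY')
     AND SCONTAINS(T_NAME,'rownum:[1 TO 10] AND love',1) > 0
   ORDER BY SSCORE(1) DESC
) WHERE ROWNUM <= 10;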

NoSQL Databases: What You Should Know

The relational database model has been with us for around a quarter of a century (a long time, right?), but a new class of database has emerged in the enterprise. I'm talking about NoSQL.

 

What is NoSQL?

NoSQL, also known as “non-relational” or “cloud”, is a broad class of database management systems with significant differences from the classic relational database management system (RDBMS). The stored data does not require fixed table schemas, join operations are usually avoided, and the systems typically scale horizontally.

 

Architecture with NoSQL

NoSQL databases are characterized by a move away from the complexity of SQL-based servers. The logic of validation, access control, mapping queryable indexed data, correlating related data, conflict resolution, maintaining integrity constraints, and triggered procedures is moved out of the database layer. This lets NoSQL database engines focus on exceptional performance and scalability.
A key concept of NoSQL systems is to have the database focus on the task of high-performance, scalable data storage, and to provide low-level access to a data management layer.
Pros & Cons
Pros
  • Improved performance – Performance metrics have shown significant improvements over relational access. For example, this metric compares MySQL vs Cassandra:
    Facebook Search
    MySQL > 50 GB Data
    – Writes Average: ~300 ms
    – Reads Average: ~350 ms
    Rewritten with Cassandra > 50 GB Data
    – Writes Average: 0.12 ms
    – Reads Average: 15 ms
  • Scaling – NoSQL databases are designed to expand transparently, usually with low-cost commodity hardware in mind.
  • Big data handling – Over the last decade, the volume of data has increased massively. NoSQL systems can handle big data in the same way as the biggest RDBMSs.
  • Less DBA time – NoSQL databases are generally designed to require less management: automatic repair, automatic data distribution, and simpler data models.
  • Reduced costs – RDBMSs rely on expensive proprietary servers and storage systems, while NoSQL databases use clusters of cheap commodity servers. The cost per gigabyte or per transaction/second for NoSQL can therefore be many times lower.
  • Flexible data models – NoSQL key-value stores and document databases allow the application to store virtually any structure it wants in a data element.

Cons

  • Maturity – RDBMS systems are stable and richly functional. But most NoSQL alternatives are in pre-production versions with many key features yet to be implemented.
  • Support – Most NoSQL systems are open source projects, and there are only a couple of small companies offering support for each NoSQL database.
  • Analytics & BI – NoSQL databases do not offer facilities for ad-hoc query and analysis. Commonly used business intelligence tools do not provide connectivity to NoSQL systems.
  • Administration – Although the design goal of NoSQL systems is to provide a zero-admin solution, today they still require a lot of skill to install and a lot of effort to maintain.
  • Expertise – As NoSQL is a new paradigm, virtually all developers are still in learning mode.

Use a RDBMS or a NoSQL Database?

It depends mainly on what you're trying to achieve. NoSQL is certainly mature enough to use, but few applications really need to scale that massively; for most, a traditional RDBMS is sufficient. However, with internet usage becoming more ubiquitous all the time, it's quite likely that applications that do need such scale will become more common (though probably not dominant).

NoSQL Implementations

There are currently more than 122 NoSQL databases. They can be categorized by their manner of implementation:
  • Wide column store / column families – Cassandra, Hadoop, Hypertable, Cloudata, Cloudera, Amazon SimpleDB, SciDB
  • Document store – MongoDB, CouchDB, Terrastore, ThruDB, OrientDB, RavenDB, Citrusleaf, SisoDB
  • Key Value / Tuple store – Azure Table Storage, MEMBASE, Riak, Redis, Chordless, GenieDB, Scalaris, BerkeleyDB, MemcacheDB.
  • Eventually consistent Key Value store – Amazon Dynamo, Voldemort, Dynomite, KAI
  • Graph databases – Neo4J, Infinite Graph, Sones, InfoGrid, HyperGraphDB, Trinity, etc
  • And others, and others…

Early Adopters of NoSQL

Social media corporations have been the primary trailblazers of NoSQL implementations. The list includes:
  • Facebook
  • Twitter
  • MySpace
  • Google (Hadoop, Google App Engine)
  • Amazon (Dynamo)

Recommended Books & Papers

  • Professional NoSQL (Wiley/Wrox. 2011)
  • NoSQL Database Technology (CouchBase. 2011)
  • NoSQL Handbook (Mathias Meyer)
  • No Relation: The Mixed Blessings of Non-Relational Databases (Ian Thomas Varley. 2009)
  • Cassandra: The Definitive Guide (Eben Hewitt. 2010)
  • CouchDB: The Definitive Guide: Time to Relax (J. Chris Anderson, Jan Lehnardt, Noah Slater. 2010)
  • Hadoop in Action (Chuck Lam. 2010)
  • HBase: The Definitive Guide (Lars George. 2011)
  • MongoDB in Action (Kyle Banker. 2010)
  • Beginning SimpleDB (Kevin Marshall, Tyler Freeling. 2009)

Conclusions

NoSQL databases solve problems that were born with the global growth of digital data, problems that DBAs have so far been tackling with the well-known RDBMS.
Outside of scalability, it really seems that NoSQL databases do not have a killer feature.
Still, I think this is a new opportunity to become a professional in this paradigm.


New Oracle Lucene Domain Index release based on Lucene 3.0.2

Just a few words to announce a new release of Oracle Lucene Domain Index. This zip is valid for both 10g and 11g database versions (10g uses classes back-ported from JDK 1.5 to 1.4).
This release is compiled against Lucene 3.0.2 and incorporates a set of new features; here is the list (a short usage sketch of the new operators follows):

  • Added long-awaited functionality: a parallel/shared/slave search process used during start-fetch-close and the CountHits function
  • Added the lfreqterms ancillary operator, returning the freq terms array of the rows visited
  • Added the lsimilarity ancillary operator, returning a computed Levenshtein distance for the row visited
  • Added an ldidyoumean pipelined table function using the DidYouMean.indexDictionary storage
  • Added tests using SQLUnit

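As a rough, hypothetical sketch of how the new operators might be combined (the table and column names are made up, and I'm assuming lfreqterms and lsimilarity follow the same label-binding convention as the existing lscore ancillary operator):

-- hypothetical table/columns; ancillary signatures assumed to mirror lscore(n)
SELECT lscore(1) score, lsimilarity(1) distance, lfreqterms(1) freq_terms
  FROM songs
 WHERE lcontains(title, 'love', 1) > 0
 ORDER BY lscore(1) DESC;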
The biggest addition is the parallel/shared/slave search process; this architectural change had been on my to-do list for a long time, and I finally added it in this release :)
The idea behind this is to have a new Oracle process started by the DBMS_SCHEDULER sub-system during the database startup process and stopped immediately before shutdown.
This process is now responsible for implementing the ODCI methods start-fetch-close/count-hit on behalf of the client process (the process associated with a specific user session), which connects to the shared-slave process using RMI.
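For reference, here is a hypothetical sketch of how such a slave process could be registered so it starts with the database; the job and procedure names are illustrative only, not the actual ones shipped with Lucene Domain Index:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name   => 'LUCENE_SHARED_SLAVE',        -- hypothetical job name
    job_type   => 'STORED_PROCEDURE',
    job_action => 'LUCENE.SHARED_SLAVE_MAIN',   -- hypothetical procedure exposing the RMI endpoint
    enabled    => TRUE);
END;
/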
With this new architecture we have two principal benefits:

  • Reduce memory consumption
  • Increase Lucene Cache Hits

Memory consumption is reduced because the internal OJVM implementation is attached to a client session, so the Java space used by Lucene structures used to be isolated in, and replicated across, each concurrent session; now all Lucene memory structures used during the index scan process are created in one shared process and are not replicated.
Also, when one session submits a Lucene search, the search is cached for subsequent queries: any later query coming from the same client session, or from any other session, that targets the same index with the same query string results in a cache hit.
I'll explain this new architecture in more detail in another post, also showing how many parallel processes can work together when using parallel indexing and searching.
In other news, next week I'll be at Oracle OpenWorld 2010 in San Francisco presenting the session:

Schedule: Tuesday: 09:30AM
Session ID: S315660
Title: Database Applications Lifecycle Management
Event: JavaOne and Oracle Develop
Stream(s): ORACLE DEVELOP
Track(s): Database Development
Abstract: Complex applications, such as Java running inside the database, require application lifecycle management to develop and deliver good code. This session will cover best practices, tools, and experience managing and delivering code that runs inside the database, including tools for debugging, automated testing, packaging, deployment, and release management. Some of the tools presented will include Apache Maven, JUnit, log4j, Oracle JDeveloper, and others integrated into the Oracle Java Virtual Machine (JVM) development environment.

See you there or at any of the planned networking events :)

Dealing with JDK1.5 libraries on Oracle 10g

Modern libraries are compiled with JDK 1.5, and the question is how to deal with these libraries on an Oracle 10g OJVM.
Some examples are the Lucene 3.x branch and Hadoop. The solution I tested uses the Java Retrotranslator and some complementary libraries.
I have tested this solution successfully on the Lucene Domain Index 3.x branch.
As you can see in CVS, there is a build.xml file which performs all the Retrotranslator steps. Here is a step-by-step explanation of the process:

  1. Load all the required libraries provided by the Retrotranslator project, which implement features not available in the JDK 1.4/1.3 runtime; this is done by the target load-retrotranslator-sys-code. This target loads the libraries into the SYS schema because they are immutable, or at least have a low probability of change (they only change when the Retrotranslator version is upgraded). All libraries are then compiled to native code using the NCOMP utility, target ncomp-runtime-retrotranslator-sys-code.
  2. Then we can convert the libraries compiled with JDK 1.5 (in this build.xml, the Lucene and Lucene Domain Index implementations) to a JDK 1.4 target runtime. This is done by the targets backport-code-lucene and backport-code-odi. The first target converts all the Lucene libraries, excluding JUnit and test code; these libraries require the JUnit and Retrotranslator jars as dependencies. The second target converts the Lucene Domain Index jar, which depends on Lucene core and Oracle's libs. The back-port operation generates a file named lucene-odi-all-${version}.jar with the Lucene and Lucene Domain Index code ready to run on a JDK 1.4 runtime.
  3. Once we have the code back-ported to a JDK 1.4 runtime, we can upload it and NCOMP it into Oracle 10g; this is done by the targets load-lucene-odi-backported-code and ncomp-lucene-all.

And that's all! The code works fine on my Oracle 10.2 database on Linux :) Finally, users of both 11g and 10g databases can deploy the Lucene Domain Index implementation using a single distribution file.