MarkLogic Server: Transactions in MarkLogic

Hello friends, I am back with another MarkLogic topic. This time we will cover the concepts and theory, because they are important and deserve a discussion before we jump into any practical implementation.

Transactions are a very important feature of any database: they ensure data integrity and are responsible for maintaining the accuracy and consistency of data.

MarkLogic supports transactions like other database systems, but the terms it uses may be a bit different, since each database system has its own vocabulary. Below is the terminology MarkLogic uses for transactions.

1. Statements: A statement is a piece of XQuery that works with the data stored in a MarkLogic database. A statement is either a read (query) statement or a modify (update) statement.
2. Query Statements: Query statements only fetch data from the MarkLogic database and contain no update calls, so executing a query statement never changes the state of the data. Query statements have a read-only view of the database, and they acquire no locks during execution, which improves performance.
3. Update Statements: Update statements are statements that modify, or may modify, the data or its state. A statement is categorised as an update statement if it contains an update call, whether or not that update is actually performed at run time. Update statements acquire reader/writer locks as needed during execution.
4. Transaction: A set of one or more statements that either all succeed or all fail. A transaction is an update or a query transaction depending on the statements it contains and/or the transaction mode in use. Independently of that, a transaction is a single-statement or a multi-statement transaction depending on the number of statements involved.
5. Query Transaction: Query transactions make no modification to the data and acquire no locks. A query transaction may be single-statement or multi-statement, depending on the transaction mode applied.
6. Update Transaction: Update transactions include update statements and acquire reader/writer locks on the data. Just like a query transaction, an update transaction may contain a single statement or multiple statements.
7. Single-Statement Transaction: A single-statement transaction is committed automatically after its statement executes successfully, or rolled back on error. A transaction created with the default transaction mode (i.e. auto) is always a single-statement transaction.
8. Multi-Statement Transaction: A transaction is a multi-statement transaction if it consists of one or more statements that commit or roll back together. Multi-statement transactions are created with the "query" or "update" transaction mode and must be committed explicitly using xdmp:commit. Within a multi-statement transaction, changes made by one statement are visible to later statements of the same transaction, but they are not visible outside the transaction until xdmp:commit is executed.
9. Transaction Mode: The transaction mode tells MarkLogic whether a transaction should be treated as a query or an update transaction. It also determines the commit strategy. MarkLogic supports three values for the transaction mode: auto, query, and update. Transactions created in "auto" mode are single-statement transactions (as explained above) and are automatically committed on success or rolled back on error.
10. Commit: A commit ends the transaction and makes all changes made by its statements visible in the database. xdmp:commit is used to commit a transaction explicitly.
11. Rollback: A rollback terminates the transaction and discards all changes made by its statements. On error, a transaction is rolled back automatically. A transaction is also rolled back if it times out before reaching xdmp:commit.
12. Reader/writer locks: A set of locks applied to documents for reading or writing while a transaction accesses them. For example, an update transaction always sees the latest version of a document and locks the document against other updates while it is being accessed. Once a document is locked, any other update statement waits for the lock to be released before it can update the document.
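
To make the commit and rollback terms above a bit more concrete, here is a minimal sketch in Python. It is only a toy in-memory model and not how MarkLogic is implemented: the names ToyDatabase, ToyTransaction, and run_statements are hypothetical and exist only for this illustration. It shows that changes stay private to a transaction until commit, and that an error discards all of them.

```python
class ToyDatabase:
    """In-memory stand-in for a document store (hypothetical, not MarkLogic)."""

    def __init__(self):
        self.committed = {}  # uri -> content visible to every transaction


class ToyTransaction:
    """Buffers writes until commit(); rollback() throws them away."""

    def __init__(self, db):
        self.db = db
        self.pending = {}  # uncommitted writes made by this transaction
        self.open = True

    def write(self, uri, content):
        self.pending[uri] = content

    def read(self, uri):
        # A transaction sees its own uncommitted writes first.
        if uri in self.pending:
            return self.pending[uri]
        return self.db.committed.get(uri)

    def commit(self):
        # Publish every buffered change at once.
        self.db.committed.update(self.pending)
        self.pending = {}
        self.open = False

    def rollback(self):
        # Discard every buffered change.
        self.pending = {}
        self.open = False


def run_statements(db, statements):
    """Run all statements in one transaction; roll back if any of them fails."""
    txn = ToyTransaction(db)
    try:
        for statement in statements:
            statement(txn)
        txn.commit()
        return True
    except Exception:
        txn.rollback()
        return False
```

In this model, a list of statements passed to run_statements behaves like a multi-statement transaction: either every write reaches ToyDatabase.committed or none does.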

Those are the terms used around transactions and their definitions. Another important point is how to control transactions according to your own requirements, so I will quickly explain that next.

When no transaction mode is specified explicitly, the transaction type of a statement is detected automatically by static analysis of the statement. As a result, it often happens that a document is locked for no good reason: the statement was classified as an update because it contains an update call, while at run time it only reads. Because the lock was acquired by an update transaction, it is held until the transaction completes. Now assume another transaction is trying to update that document, which is unintentionally held under a read lock. The second transaction cannot update the document and has to wait, unnecessarily, until the first transaction completes, and this can cause performance problems. In such a case the better option is to read the document in a query-mode transaction, which acquires no locks and leaves the document available for updates by other transactions.

Now let's consider a real-world scenario. Suppose there is a document D1 that holds frequently used information for a long-running process. D1 also contains the progress of the process as a percentage, which must be updated every time a piece of the task is completed. Transaction T1 reads the required information from D1, and at the same time transaction T2 updates the progress statistic in D1. If T1 is started with the wrong transaction mode, "update", then D1 is locked with a read lock. If an update statement then runs in T1, the read lock on D1 is converted to a write lock, and this lock is held until T1 completes or is committed explicitly. Meanwhile T2 is trying to update the statistic in D1 and has to wait until T1 completes. This slows the process down and also stops T2 from moving on to start further work.
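
The D1/T1/T2 scenario can be sketched with a toy lock table. This is a simplified Python model under my own assumptions, not MarkLogic's actual lock manager: ToyLockTable and its methods are hypothetical names, and instead of blocking, a call simply returns False when the caller would have to wait.

```python
class ToyLockTable:
    """Toy reader/writer lock bookkeeping (hypothetical, not MarkLogic)."""

    def __init__(self):
        self.readers = {}  # uri -> set of transaction names holding a read lock
        self.writer = {}   # uri -> transaction name holding the write lock

    def read(self, txn, mode, uri):
        """Return True if the read can proceed now, False if it must wait."""
        if mode == "query":
            return True  # query mode reads without taking any lock
        holder = self.writer.get(uri)
        if holder is not None and holder != txn:
            return False  # another transaction is writing this document
        self.readers.setdefault(uri, set()).add(txn)
        return True

    def write(self, txn, uri):
        """Return True if the write lock is granted, False if txn must wait."""
        holder = self.writer.get(uri)
        if holder is not None and holder != txn:
            return False
        other_readers = self.readers.get(uri, set()) - {txn}
        if other_readers:
            return False  # read locks held by other transactions block the writer
        self.writer[uri] = txn
        return True

    def release_all(self, txn):
        """Drop every lock held by txn, as happens at commit or rollback."""
        for holders in self.readers.values():
            holders.discard(txn)
        for uri in [u for u, h in self.writer.items() if h == txn]:
            del self.writer[uri]
```

With this model, if T1 calls read("T1", "update", "/D1"), a later write("T2", "/D1") returns False until release_all("T1") is called. If T1 reads in "query" mode instead, T2's write is granted immediately, which is the point of the example above.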

In most cases an application uses the default transaction mode (i.e. auto), so all of its transactions are single-statement transactions that are committed automatically when each statement finishes executing. If you need a multi-statement transaction, you must specify the transaction mode explicitly as either query or update. Below are the ways to set the transaction mode explicitly.

1. Declare the xdmp:transaction-mode option in the prolog (at the top) of your program.
2. Call xdmp:set-transaction-mode before creating the transaction that should run in that mode.
3. Set the transaction-mode option in the options node passed to xdmp:eval, xdmp:invoke, or xdmp:spawn.

Changing the transaction mode in the middle of a transaction does not affect the current transaction.
Below are some options for executing statements in a transaction mode other than the default or currently selected one.
1. xdmp:eval (see https://docs.marklogic.com/xdmp:eval)
2. xdmp:invoke
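
The rule that a mode change only applies to transactions created afterwards can be pictured with a small Python sketch. This is a toy model with hypothetical names (ToySession, ToyTxn), not the MarkLogic API; it assumes, as described earlier, that "auto" gives a single-statement transaction and that "query" or "update" gives a multi-statement one.

```python
from dataclasses import dataclass

VALID_MODES = ("auto", "query", "update")


@dataclass(frozen=True)
class ToyTxn:
    """A transaction whose mode is fixed at the moment it is created."""
    mode: str

    @property
    def multi_statement(self):
        # Assumption from the text: only "auto" means single-statement.
        return self.mode != "auto"


class ToySession:
    """Holds the mode that the NEXT transaction will be created with."""

    def __init__(self):
        self.mode = "auto"  # default transaction mode

    def set_transaction_mode(self, mode):
        if mode not in VALID_MODES:
            raise ValueError("unknown transaction mode: " + mode)
        self.mode = mode  # transactions that already exist are not touched

    def begin(self):
        return ToyTxn(self.mode)
```

A transaction obtained from begin() before set_transaction_mode("update") keeps its "auto" mode, while the next begin() returns an update transaction.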

These functions can run code in a separate transaction, which makes it possible to execute statements in a transaction mode different from the default or from that of the calling statement. The script or statements executed by eval/invoke can run either in the same transaction as the calling statement or in a separate one. This is controlled by the isolation option passed to xdmp:eval/xdmp:invoke. Below are the allowed isolation values.

1. same-statement: The code executed by xdmp:eval/xdmp:invoke runs in the same transaction, and with the same transaction mode, as the calling statement. In a single-statement transaction, an update made this way is not visible to the rest of the calling statement. In a multi-statement transaction, the update is visible to the subsequent statements of the same transaction.
2. different-transaction: The code executed by xdmp:eval/xdmp:invoke runs in a separate transaction, which allows it to execute in a transaction mode different from that of the calling statement.

Okay friends, now keep exploring, move on to the technical implementation of transactions, and share your views and findings.

References
https://docs.marklogic.com/guide/app-dev/transactions

Monday, 28 November 2016
