MarkLogic Server: Parallel Process Execution

Hey Guys,

I am back with something to share, in the hope that it will be helpful to someone.

Recently I was struggling with a task: I needed to stop a process running on MarkLogic from a web application, where the process had been started from that same web application.

That was when I realized that, as far as I could observe, requests made to MarkLogic through a specific port (the one the web application runs on) were being queued and executed synchronously. So if a process is started through a port and a second request to stop that process is then sent through the same port, the second (stop) request executes only after the earlier requests on that port have completed. As a result, my stop command always ran after the process had already finished, which made it useless.

So I started to explore parallel process execution in MarkLogic and learned that a process can be run asynchronously in a separate thread, so that other requests do not have to wait while a long-running process is in progress.

xdmp:spawn is a built-in function that starts execution of a process/module asynchronously in a separate thread. The task/process/module started through xdmp:spawn executes separately, while other processes can be executed at the same time.

It can execute the task separately because it creates a separate task (a MarkLogic task) and adds it to the Task Server queue for execution.

Below is the syntax describing how to use this function.

xdmp:spawn($task-path, (),
<options xmlns="xdmp:eval">
<result>{fn:false()}</result>
</options>)


xdmp:spawn takes three arguments, in the order below:
Task-path :- The path of the module (.xqy) file that contains the code to execute.
Variables :- The external variables and values to pass to the module. Pass the empty sequence () if there are none.
Options :- Settings that control how xdmp:spawn executes the module. For example, we can specify which modules database the module is deployed in, or whether the result of the task should be returned.

For example:

xdmp:spawn("module.xqy", (),
<options xmlns="xdmp:eval">
<modules>{xdmp:modules-database()}</modules>
<root>http://example.com/application/</root>
</options>)
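The example above passes an empty variables sequence. As a rough sketch of the second argument in use, the call below passes one value to the spawned module as alternating QName/value pairs. The module path /tasks/long-running.xqy and the variable name batch-id are made up for illustration; the spawned module is assumed to declare a matching external variable.

```xquery
xquery version "1.0-ml";

(: Hypothetical module path and variable name, for illustration only.
   The spawned module is assumed to contain:
     declare variable $batch-id as xs:string external;
:)
xdmp:spawn("/tasks/long-running.xqy",
  (xs:QName("batch-id"), "batch-42"),
  <options xmlns="xdmp:eval">
    <modules>{xdmp:modules-database()}</modules>
  </options>)
```

The call returns as soon as the task has been queued, so the calling request does not wait for the module to finish.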

One important point about the options for xdmp:spawn: if the options are configured to return the result from xdmp:spawn (the result option set to true), then the call behaves much like synchronous execution, because the caller has to wait for the spawned task to finish before it can use the response, so further requests end up waiting as well.
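To make the waiting behaviour concrete, here is a minimal sketch with the result option switched on. The module path is hypothetical, and the comments describe the behaviour as I understand it from the documentation rather than a guarantee.

```xquery
xquery version "1.0-ml";

(: Hypothetical module path, for illustration only. :)
let $result :=
  xdmp:spawn("/tasks/long-running.xqy", (),
    <options xmlns="xdmp:eval">
      <result>{fn:true()}</result>
    </options>)
(: Using $result here makes the caller wait for the spawned task
   to finish, which is what defeats the asynchronous behaviour. :)
return $result
```

If the caller does not need the output of the task, leaving the result option off (or false) keeps the call fire-and-forget.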

Benefits : -
The benefit of xdmp:spawn is that the process executes separately without blocking other requests, which makes it especially suitable for long-running tasks.

Limitations : -
The limitation of xdmp:spawn is that we have no direct control over the process, such as knowing when its execution started and when it completed. We need to keep watching for the specific task on the Task Server to know whether it is still running or has completed.
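One way to do that watching from code, sketched under assumptions: fetch the status of the Task Server and look at the requests it reports. The app server name "TaskServer" and the request-statuses/request-status element names are my assumptions about a default installation and should be checked against your own server's status output.

```xquery
xquery version "1.0-ml";

(: Look up the Task Server on the current host and fetch its status.
   "TaskServer" is assumed to be the Task Server's name. :)
let $status := xdmp:server-status(xdmp:host(), xdmp:server("TaskServer"))
(: The element names below are assumptions about the status layout;
   the *: wildcard avoids depending on the exact namespace. :)
return $status/*:request-statuses/*:request-status
```

Polling this from a separate request gives a rough picture of which spawned tasks are still running.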

Calling xdmp:spawn requires an extra privilege:
http://marklogic.com/xdmp/privileges/xdmp-spawn

A very important point to keep in mind: once xdmp:spawn is called, it cannot be rolled back, even if the calling transaction fails or does not complete. Therefore be careful with this function, and where possible avoid calling it from modules/code that perform an update transaction.

So friends, enjoy parallel execution of long-running processes in a separate thread, but please be very careful.

Please refer to the URL below for in-depth detail on the topic.

https://docs.marklogic.com/xdmp:spawn

Wednesday, 26 August 2015