Mark Logic Server: 2015

Monday, 30 November 2015

Information Studio Flows

Hey Friends,

Here I am, back to discuss another MarkLogic topic: Information Studio Flows.

Information Studio

Information Studio is an XQuery API with a browser-based interface, and it is part of the MarkLogic Application Services suite. The browser-based interface makes it easy to understand and use. The API lets you create databases and load content into them; in short, it provides the tools you need to load data into MarkLogic.

Flow

A flow is one of its tools. It is easy to understand and handle, and it helps you load data into MarkLogic while applying some processing or transformation to the documents.

A flow is a content-load configuration that describes the documents to be loaded into a database and specifies how to load them.

In my words, a flow is a mechanism that opens a door for transferring your content/XML directly into MarkLogic after the required transformations have been applied. The good part is that you don’t need to be a MarkLogic programmer to use it.

Suppose you are new to MarkLogic and want to start by building an application that displays a meaningful and sizeable amount of data, but you are not ready to use MLCP or another content-loading mechanism. In that case a flow could be a great help.

You just need to create a database and configure a flow to upload your content directly into it.

Let’s discuss how to create a flow and what its configuration consists of.

Flows can be accessed on port 8000 of MarkLogic at the URL below.

http://localhost:8000/appservices/

(Replace “localhost” with the IP address of the machine on which MarkLogic Server is installed.)

Here you can see the existing flows and create a new one. Clicking the “New Flow” button takes you to the new flow screen, where the flow is created with the name “Untitled-[Number]” along with an option to edit that name.

A flow consists of three configuration parts.

Collector

The collector configuration specifies where to get the content to load into the database and how to ingest it. It appears on screen under the name “Collect”. The default collector is a file system directory, which can be changed to other options such as a drop box for uploading content through the browser. The Collect section offers the following configuration options.

1. Configure:- This sets the location from which content is collected. Here we can give the path of a directory on the server, which will work as the door for sending content to MarkLogic.

2. Ingestion:- The ingestion settings decide which kinds of documents are loaded, how many at a time, and so on. They also let you filter documents with a regular expression to avoid loading useless content, repair XML while it is loaded into MarkLogic, and set a default namespace for the documents. You can leave this section unchanged if you don’t need any of these modifications.

Transform

This section lets you create transformation steps for the content being loaded into MarkLogic. You can use the following types of transformation on your documents.

1. Delete:- Removes unwanted elements, attributes or other information from the content documents.

2. JSON Transform:- Converts your XML document to JSON format.

3. Rename:- Renames elements or attributes in the content documents.

4. XQuery transform:- A custom transformation in which you write XQuery-based logic, using CPF, to apply your own rules for deciding which documents to transform and what the transformation should be.

5. Filter Document:- Extracts metadata from binary documents and can save it in the document properties for easy use and filtering.

6. Normalize Dates:- Normalizes the date formats in the content documents, to avoid problems caused by different date formats in different documents.

7. Schema Validation:- Validates the content documents against a specific predefined XML schema.

8. XSLT:- Applies a custom XSLT stylesheet to the content of XML documents.

Load

This section is used to choose the database to load the content into and to define the properties of the documents in that database.

The database is selected in the “Destination database” dropdown, and the Document settings define the URI structure of the loaded documents. Permissions can also be defined for different users on the destination location, and a collection can be assigned to all documents loaded through the flow so that they can be identified separately.

Finally, there is a Start Loading button. It makes the flow look in the configured directory for documents to load; if any are found, it processes them according to the configuration and, after transformation, moves them into the database.

When you start a load with this button, its status is displayed in this section for each load, along with the loaded documents and the processing status. An “Unload” button is also provided to unload the last set of uploaded documents from the database.
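Once a load finishes, a quick way to check what actually landed is to query the collection assigned in the Load section. Below is a minimal sketch, assuming a hypothetical collection name “flow-loaded”; it would need to be run against the destination database (for example from Query Console).

```xquery
xquery version "1.0-ml";

(: Hypothetical collection name; use the one configured in your flow's Load section :)
let $collection := "flow-loaded"
return (
  (: Estimated number of documents in the collection :)
  xdmp:estimate(fn:collection($collection)),
  (: URIs of the first ten documents, to check the URI structure :)
  for $doc in fn:subsequence(fn:collection($collection), 1, 10)
  return xdmp:node-uri($doc)
)
```

The first item returned is the estimated count, followed by up to ten URIs, which is usually enough to see whether the URI structure and collection settings came out as intended.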

So, we have discussed how to create a flow to load content, but as you can see, the process of triggering the load is manual. If we need to automate it, we need one small additional thing.

Just create an XQuery module file with the following code in it and set it up as a scheduled task in MarkLogic Server to run daily, every minute, or at whatever interval you choose.

xquery version "1.0-ml";
import module namespace info = "http://marklogic.com/appservices/infostudio"
  at "/MarkLogic/appservices/infostudio/info.xqy";

let $flow-id := info:flow-id("[NAME OF YOUR FLOW]")
return
  info:flow-start($flow-id)

The code above triggers your flow to look in the configured directory and load its contents into the database according to the configuration defined in your flow.
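Because a scheduled task runs unattended, it may help to log what happened on each run. Below is a hedged variation of the same module: the flow name “my-flow” is hypothetical, and the only additions are a try/catch block and calls to xdmp:log, so treat it as a sketch rather than the required form.

```xquery
xquery version "1.0-ml";
import module namespace info = "http://marklogic.com/appservices/infostudio"
  at "/MarkLogic/appservices/infostudio/info.xqy";

(: Hypothetical flow name; replace it with the name of your own flow :)
declare variable $flow-name as xs:string := "my-flow";

try {
  let $flow-id := info:flow-id($flow-name)
  let $_ := info:flow-start($flow-id)
  return xdmp:log(fn:concat("Started flow: ", $flow-name))
} catch ($e) {
  (: Write the full error to the server log so a failed run is visible :)
  xdmp:log(fn:concat("Could not start flow ", $flow-name, ": ", xdmp:quote($e)), "error")
}
```

With this in place, each run leaves a line in the MarkLogic error log, so you can tell whether the schedule is firing even when no new documents were found.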

I believe this is enough to give you a good start on loading content with Information Studio Flows. Please keep me posted with your suggestions and queries.

Tuesday, 8 September 2015

Create and manage CRON jobs

Hello Friends,

Many things in the technical world are done in a more complex way than necessary, while it is simple things that attract me. I always look for the simplest way to implement anything, unless it really needs to be complex for specific and justifiable reasons.

Well, here is my new article on something simple that was nevertheless not easy to find in the form I needed. So I am sharing it with you in case it helps someone.

Introduction
“The software utility Cron is a time-based job scheduler in Unix-like computer operating systems. People who set up and maintain software environments use cron to schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals.”

So, in simple words, cron is a utility in Unix-based operating systems, much like Task Scheduler in Windows. Its purpose is to run tasks automatically at specific times or intervals. Cron jobs are widely used to run shell scripts periodically, which in turn trigger other applications or processes to carry out an automated process flow.

So that is what cron jobs are. The next question is how to access and manage them, and the answer is crontab. The crontab is essentially a file that contains the configuration of all scheduled cron jobs.

The crontab is a list of commands that you want to run on a regular schedule, and also the name of the command used to manage that list. crontab stands for "cron table," because it uses the job scheduler cron to execute tasks.

Let’s come to the practical part. As we know, Unix is a command-based operating system, and most work on it is done through commands. In the same way, there are commands for viewing and managing cron jobs.

To connect to the server and manage cron jobs we need an application that lets us execute commands on it. The PuTTY console is one such option, and it is the one I am using.

PuTTY is an SSH and telnet client, developed originally by Simon Tatham for the Windows platform. PuTTY is open source software that is available with source code and is developed and supported by a group of volunteers.

You can download PuTTY here

Below are the details needed to connect through the PuTTY console.

Server IP - 10.xx.xx.xx
Username - admin
Password - admin

Once connected to the server through the PuTTY console, we can use the crontab command to view, add and edit cron jobs.

Below are the commands used to view and manage cron jobs in the crontab. Note that the command name is all lowercase.

crontab -l :- Lists all the cron jobs that have been created and scheduled.

crontab -e :- Opens the crontab in an editor so that you can add a new cron job or edit an existing one.

crontab -r :- Removes the crontab, deleting the configuration of all scheduled cron jobs.

There are a few more options available, but these three are sufficient for now.

The next step is to learn how to schedule a cron job to run at a specific interval.

When you run the “crontab -e” command, the crontab opens in an editor (vi, in the steps that follow), where you need to press the “Insert” key (or “i”) to enter insert mode before you can make any changes.

Below are the key points to know when scheduling a cron job.

1. Cron jobs are scheduled in the following fixed pattern
[MOH] [HOD] [DOM] [MON] [DOW] [COMMAND]
MOH - Minute of hour - possible values : 0 - 59
HOD - Hour of day - possible values : 0 - 23
DOM - Day of month - possible values : 1 - 31
MON - Month - possible values : 1 - 12
DOW - Day of week - possible values : 0 - 6 (from Sun to Sat)
COMMAND - The command to run on the specified schedule, typically the path of a shell script (.sh file)
For ex.

0 16 * * 0 /home/task.sh

The cron job above runs the shell script (task.sh) at minute 0 of hour 16, on every day of every month that falls on a Sunday; in other words, at 16:00 every Sunday.

Please note that -
A number - matches only that exact value (here 0 in the minute field means minute 0, and 0 in the day-of-week field means Sunday)
* - matches every possible value of the field

2. Press the “Esc” key to leave insert mode.
3. Then type “:wq” and press Enter to save the changes to the crontab.

A message will indicate that the new crontab is being installed. You can verify your changes later with the crontab -l command.
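A few more schedule lines may make the pattern clearer. These are illustrative entries only: /home/task.sh is reused from the example above, the log path is hypothetical, and the */15 step and 1-5 range syntax are assumed to be supported by your cron (they are in the common Vixie/cronie implementations). Comments are kept on their own lines, starting with #.

```
# minute hour day-of-month month day-of-week command

# 16:00 every Sunday
0 16 * * 0 /home/task.sh

# every 15 minutes
*/15 * * * * /home/task.sh

# 02:30 every day
30 2 * * * /home/task.sh

# 09:00 on the 1st of every month
0 9 1 * * /home/task.sh

# 08:00 Monday to Friday, appending output to a (hypothetical) log file
0 8 * * 1-5 /home/task.sh >> /tmp/task.log 2>&1
```

Redirecting the output to a file, as in the last entry, is a simple way to see later whether the job ran and what it printed.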

So that is crontab in short: how to create and manage cron jobs. You can explore the topic further for in-depth details.

Wednesday, 26 August 2015

Parallel Process Execution

Hey Guys,

I am back with something to share, in the hope that it might be helpful to someone.

Recently I was struggling with a task in which I needed to stop a process running on MarkLogic from a web application, where the process had been started from that same web application.

At that point I realized that the requests my application made to MarkLogic through its port (the one the web application runs on) were being queued and executed synchronously, one after another. So if a process was started through a port and a second request was then sent through the same port to stop it, the second (stop) request would execute only after the earlier requests on that port had completed. As a result, my stop command always ran after the process had already finished, which was of no use.

I then started to explore parallel process execution in MarkLogic and learned that a process can be run asynchronously in a separate thread, so that no other request has to wait while a long-running process is in progress.

xdmp:spawn is a built-in function that starts the execution of a module asynchronously in a separate thread. A module started through xdmp:spawn executes on its own, while other requests can be processed at the same time.

The reason it can run the work separately is that it creates a separate task (a MarkLogic task) and adds it to the task queue for execution.

Below is the syntax describing how to use this function.

xdmp:spawn($task-path, (),
  <options xmlns="xdmp:eval">
    <result>{fn:false()}</result>
  </options>)


xdmp:spawn takes three arguments, in the following order
Task-path :- The path of the module (.xqy) file that contains the code to execute.
Variables :- The external variables to pass to the module, given as a sequence of alternating variable names (QNames) and values.
Options :- The options that control how xdmp:spawn executes the module. For example, here we can specify which modules database the module is deployed in, or whether the task should return a result.

For ex -

xdmp:spawn("module.xqy", (),
  <options xmlns="xdmp:eval">
    <modules>{xdmp:modules-database()}</modules>
    <root>http://example.com/application/</root>
  </options>)
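The second argument was left empty in the examples so far, so here is a small sketch of how a value could be passed to the spawned module. The module path “/tasks/long-task.xqy”, the variable name “job-id” and its value are all hypothetical, chosen only for illustration.

```xquery
xquery version "1.0-ml";

(: Caller: the module path and the variable name "job-id" are hypothetical. :)
xdmp:spawn(
  "/tasks/long-task.xqy",
  (: Variables are passed as alternating QName / value pairs :)
  (xs:QName("job-id"), "job-42"),
  <options xmlns="xdmp:eval">
    <modules>{xdmp:modules-database()}</modules>
  </options>
)

(: The spawned module would then declare the variable as external, for example
     declare variable $job-id as xs:string external;
     xdmp:log(fn:concat("Running job ", $job-id))
:)
```

Passing an identifier like this is one way to let the spawned task record its own progress somewhere the web application can look it up later.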

One important point about the options for xdmp:spawn: if we configure it to send a result back, it behaves just like a synchronous request, because the caller waits for the response from the function and further requests keep waiting behind it.

Benefits : -
The benefit of xdmp:spawn is that the process executes separately without blocking other requests, which makes it especially suitable for long-running tasks.

Limitations : -
The limitation of xdmp:spawn is that we have no control over the process, such as knowing when its execution started and when it completed. We need to keep watching for the specific task in the Task Server to know whether it is still running or has completed.
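Watching the Task Server can also be done from XQuery rather than from the Admin UI. The sketch below lists the requests currently running there. It assumes that the Task Server can be looked up with xdmp:server("TaskServer"), that the status elements carry the names shown (request-id, request-text, start-time), and that the calling user has the privileges needed for xdmp:server-status, so please check it against your own server's status output before relying on it.

```xquery
xquery version "1.0-ml";
declare namespace ss = "http://marklogic.com/xdmp/status/server";

let $host := xdmp:host()
(: Assumption: the Task Server is addressable by the name "TaskServer" :)
let $server := xdmp:server("TaskServer")
for $req in xdmp:server-status($host, $server)//ss:request-status
return
  <task>
    <id>{fn:string($req/ss:request-id)}</id>
    <module>{fn:string($req/ss:request-text)}</module>
    <started>{fn:string($req/ss:start-time)}</started>
  </task>
```

Once the request id of a running task is known, xdmp:request-cancel($host, $server, $request-id) can be used to ask MarkLogic to cancel it, which is the kind of stop operation this post started from.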

An extra privilege is needed to execute xdmp:spawn, as below
http://marklogic.com/xdmp/privileges/xdmp-spawn

A very important point to keep in mind is that once xdmp:spawn has been called it cannot be rolled back, even if the calling transaction fails or does not complete. So be careful with this function, and avoid using it in modules that perform an update transaction.

So friends, enjoy the parallel execution of long processes in a separate thread, but please be really careful.

Please refer to the URL below for in-depth detail on the topic.

https://docs.marklogic.com/xdmp:spawn

Thursday, 18 June 2015

MVC Application in XQuerrail Framework - Part - 2

Hey Friends,

I am back with part 2 of my post on building an MVC application in the XQuerrail framework.

As a quick recap, in my previous post, MVC Application in XQuerrail Framework - Part - 1, we discussed how to create an entity/screen with basic CRUD operations without writing any code for controllers or views.

Today we will extend the application developed in that post with simple customizations of the view, controller and model for a specific requirement, to demonstrate custom model, view and controller implementations in an XQuerrail-based MVC application.

As we already know, XQuerrail is a standalone, folder-based framework, and an XQuerrail-based MVC application must strictly follow its folder structure. So let’s start with the folder structure.

As we discussed previously in XQuerrail Framework Based Application With Mark Logic, the “main” folder contains all the code required to run XQuerrail-based applications, and the “app” folder inside “main” contains all the application-specific code.

Below are the folders that can be created in the “app” folder if a custom implementation of a model, view or controller is required.

“/app/controller/” - All custom controllers of the application must be placed in this folder, using the naming convention <controller_name_as_configured_in_application-domain.xml>-controller.xqy.
For ex. If we need a custom controller for “person”, we should create a file named “persons-controller.xqy” in the “/app/controller/” folder, where “persons” is the controller name we defined in application-domain.xml for the “person” model (refer to part 1 of this article).

“/app/models/” - All custom models of the application, which communicate with the database, must be kept in this folder. They are used to implement custom methods that the framework does not provide by default, or to override the framework’s definition of an existing method for a specific model. The name of the .xqy file in this folder should be exactly the same as the model name specified in application-domain.xml.

“/app/views/” - All custom views for a controller must be placed in this folder, inside a subfolder named after the controller, with the file name <controller_name_as_configured_in_application-domain.xml>.<view_name_to_customize>.html.xqy

For ex - If we want a custom view (UI) for adding a new “person” record, which belongs to the “persons” controller, then “persons.new.html.xqy” should be kept in the “/app/views/persons/” folder.
You can get the base implementation of the “new” view, which the framework uses to create a record of a model, from the framework’s views folder, i.e. “/main/_framework/base/views/”. Copy the “base.new.html.xqy” file from this folder to your custom view folder, rename it as discussed above, and make the required customizations in the code.

That is enough theory; now let’s create an example using a model, view and controller.

Consider the following scenario. We need a screen for creating a new record of a role model, but we also need additional information belonging to another model, Pages, which is saved in the database. On the create-new-role screen we need to display a list of all available pages so that pages can be selected and assigned to the role.

Now consider the definition below for role
<model name="role" persistence="document" key="uuid" keyLabel="name" extends="base" >

<document root="roles">/lookup/roles.xml</document>

</model>

By default this creates only the name field. Now we need an additional list of pages on the UI when adding a new role. If the list of all pages is held in another XML document, we can follow the steps below to achieve the example.

Step1 :- Create a model for role in the models folder and name it “role.xqy”
In this step we add a custom model for role, which can be used to get additional information or perform additional operations that the default model created by the framework does not. Here we can write a method that interacts with the database to get the list of pages.
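To make Step 1 a little more concrete, here is a minimal sketch of what such a custom model could look like. The module namespace and the document URI “/lookup/pages.xml” are assumptions made for illustration (the URI simply mirrors the “/lookup/roles.xml” used for roles); check how your XQuerrail version expects model modules to be declared.

```xquery
xquery version "1.0-ml";

(: Sketch of /app/models/role.xqy. The module namespace and the document
   URI below are assumptions, not taken from the framework. :)
module namespace model = "http://xquerrail.com/app/model/role";

(: Returns the root element of the pages document; its page children
   are what a controller can then select with get-pages()/page. :)
declare function model:get-pages() as element()?
{
  fn:doc("/lookup/pages.xml")/element()
};
```

Returning the root element rather than the individual pages keeps the function compatible with a caller that selects the page children itself.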

Step2 :- Create a controller for role in the controllers folder and name it “roles-controller.xqy”
Implement the code that handles the create-new-role request here, so that it provides the additional list of pages to render on the UI when a new role is created. Below is a code snippet that gives an idea of how to override the new() request to get the page list.

declare function controller:get-pages()
{
  (: Function created in the custom model to get the page list from the database :)
  model:get-pages()
};

declare function controller:new()
{
  let $body :=
    element role {
      attribute {"xsi:type"} {"role"},
      attribute {"xmlns"} {"http://xquerrail.com/app"},
      attribute {"xmlns:app"} {"http://xquerrail.com/app"},
      element name {""},
      element pages {
        controller:get-pages()/page
      }
    }
  return
  (
    response:set-template("main"),
    response:set-title(controller:model()/@label),
    response:set-view("new"),
    response:set-body($body),
    response:flush()
  )
};

Step3 :- Create a custom view for creating a new role in the views folder and name it “roles.new.html.xqy”
Create or modify the HTML of the new-role screen so that it displays the page list provided by the controller through the overridden new() request. Below is the relevant part of the modified new.html.xqy

<div class="span8">
  <b>Name</b><br/>
  <input type="text" name="name"></input>
  <br/>
  <br/>
  <b>Pages</b>
  <ul>
  {
    for $itm in response:body()//*:page
    return
      <li><input type="checkbox">{$itm}</input></li>
  }
  </ul>
</div>

So friends, hopefully you now have a good idea of the model, view and controller in the XQuerrail framework and how they link together. But I believe the best way to learn is to practice. So try to implement the example above yourself, and then move on to a deeper dive.

Till then, keep exploring.

Wednesday, 27 May 2015

MVC Application in XQuerrail Framework - Part - 1

My Dear Friends,

Continuing from my previous blog on the XQuerrail framework, today we are going to discuss MVC in XQuerrail: developing the model, view and controller of an application.

I hope you remember my previous post in this (XQuerrail) thread, but if you missed it then please go over it here

In the previous post we learned how to install the prerequisites and configure an application to run on the XQuerrail framework in MarkLogic. Now let’s extend the same application to implement a model, view and controller following the MVC pattern in XQuerrail.

MVC in XQuerrail is similar to the MVC pattern in any other language/framework
View - The implementation of the presentation layer, which appears as the UI for user interaction. XHTML is the markup most often used to design the layout of views in XQuerrail-based applications.
Controller - Responsible for handling requests from the view and responding to each request with the appropriate action. Controllers for the XQuerrail framework are written in XQuery.
Model - Responsible for defining the data structure, with a specific set of fields/properties, in which data is stored or shared. It is also responsible for any operation performed on the data saved in the database in that structure.

In the XQuerrail framework a model is defined in the domain, using domain.xml, and is backed by the framework’s base model and controller layers, which support common operations such as create, edit, delete, get and list.

So for normal operations we don’t need to write any code in the model layer; the basic operation requests are handled by the framework’s model. If we need a custom implementation of an existing or new operation/action, we can achieve it by writing our own model layer, which is responsible for getting, creating and manipulating data in the database.

Now let’s come directly to the practical part.

To create a model, navigate to “/app/domain/application-domain.xml”, where you will find some model definitions already available. Now create your own model for “Person” by copying the code below into application-domain.xml within the “domain” block.

<directory>/data-sources/persons/</directory>

</model>

Below is the purpose of each element and attribute in the model definition above

Name - The name of the model, which should be unique so that the model can be identified.

Persistence - Indicates how the model is persisted in the database. Below are some of the possible values.

Abstract - A model with “abstract” persistence is not saved itself; it is used like a class from which other models inherit common properties.

Document - A model with “document” persistence is saved in the database as a single XML document that contains multiple records of the model.

Directory - A model with “directory” persistence is saved as separate XML documents, each containing a single record of the model.

Key - Specifies the unique key for each record of the model.

KeyLabel - Defines the label to be displayed for the unique key field

Extends - Refers to another, abstract model from which common properties are inherited.

Directory - This element specifies the directory in the MarkLogic database in which the XML for each record of the model is saved

Navigation - This element enables or disables the basic operations the framework provides for the model by default, along with custom operations (if any). Below are some of the properties the framework provides.

Newable - Set this property to true/false to allow/disallow the creation of records for the model

Editable - Set this property to true/false to allow/disallow the editing of records for the model

Removable - Set this property to true/false to allow/disallow the deletion of records for the model

Listable - Set this property to true/false to allow/disallow the listing of records for the model

Element - This element defines a data point to be saved in the document. It also carries the type of the information and the label that appears on the UI.

So, for example, as shown in the code snippet, we have defined a person model with the associated information to save in MarkLogic.

Now let’s define a controller for this model to respond to requests. Add the code below at the end of application-domain.xml, within the domain tag.

Here we have given the framework the information it requires about the controller. The controller name is “persons”, the related model is “person”, the display label shown in the UI is “Persons”, and the class is base, which means the default controller operations are handled by the base framework whether or not we write an additional controller and model.

Now initialize the application by navigating to initialization url of your configured application.
For ex. http://localhost:port/initialize.xqy

Since we have already added the model and controller configuration for person, we can navigate to the URL below to view the framework’s default index page.

http://localhost:port/persons/index.html

Now let’s try some modifications to the navigation element of the model definition, to enable or disable the common operations that the framework handles by default, without our writing any code for the CRUD (Create, Read, Update, Delete) operations

As you can see, the index.html page currently has options to add a new person record and to edit and delete records. Now suppose we don’t want to offer editing and removal for the person model. In that case, just set the Editable and Removable properties described above to false, and you will see that editing and removing person records is no longer allowed from the UI.

Isn’t it quite simple and cool...

Well, so far we have looked at normal CRUD operations and learned how to configure and achieve them in the XQuerrail framework through model creation and configuration, without writing a single line of controller code. I hope you noticed that we have not written any view either to display the desired UI with data.

We have just created the model definition; the rest of the implementation, the controller with its default operations and the views, is created automatically by the XQuerrail framework.

But there are many situations where the default controller or view provided by the framework is not sufficient, and we need additional methods or UI modifications to achieve the desired features and functionality. That is the point at which a custom model, controller and/or view needs to be created according to the requirement.

That is enough for today; explore and try creating screens with the various possible options and configurations. We will meet soon in the next part of this article, where we will learn to create a custom model, controller and view for a specific requirement.

Till then, keep enjoying learning and exploring.