MarkLogic is a new-generation database for Big Data. It provides a trusted platform for Big Data applications and helps you search for crucial information in large volumes of disconnected, non-relational data and quickly convert it into useful data that can generate revenue. MarkLogic is based on NoSQL database technology.
MarkLogic stores its content as XML, so it helps to have some familiarity with XML and related technologies such as XPath, XQuery, and XSLT. MarkLogic is a very powerful tool for searching large volumes of content and processing them quickly to produce meaningful results for analysis and decision making.
The best way to understand MarkLogic is to download the developer version and start playing with it. Before that, though, it helps to know some of the basic theory behind the MarkLogic database.
So today we are going to discuss, at a very basic level, the following terms used in MarkLogic.
1. Hosts
2. Database
3. Forest
4. App Servers
5. Modules
Hosts:-
A host is an instance of MarkLogic Server running on a single machine. Sometimes the machine on which the instance is installed is also referred to as the host. A host is always part of a group, which means a host cannot be created and configured on its own. By default, a host is added to the default group.
Now you must be wondering what a group is. For now, just note that every instance of MarkLogic has a default group named “Default”. I will cover groups and clusters as advanced topics in later blogs. Initially we can work with the default group and cluster that are created with the MarkLogic instance.
Database:-
In MarkLogic, a database is a layer that does not store content directly. It serves as a layer of abstraction between forests and servers (HTTP, XDBC, WebDAV) for accessing content saved in MarkLogic forests. A database consists of one or more forests configured on hosts, and the forests actually contain the data saved in the database. A database provides a single point of access and a contiguous set of content for connecting to, querying, or operating on data saved across multiple forests.
MarkLogic is installed with the following supporting databases by default.
a) Documents - This is the default database for storing document content and properties.
b) Last Login - This database tracks last-login information for the server and other access to databases.
c) Schemas - This database contains schema information for every database. Each database is connected to the Schemas database by default. Schema information can be saved in the content database itself, but keeping it in the default Schemas database is recommended.
d) Security - This database contains security-related configuration information for every database. Every database is connected to the Security database, and keeping security information in the default Security database is recommended.
e) Modules - This database stores executable XQuery code. It is created by default during MarkLogic installation and can be used to keep our executable XQuery. XQuery can also be saved in another database, but that database must then be configured as the modules database in the HTTP or XDBC server configuration.
If we use the Modules database to keep XQuery files, then each XQuery file URI must be prefixed with the root URL (as configured in the HTTP or XDBC server as root) so that the file saved in the associated Modules database can be accessed.
For example, if you are using a modules database and specify a root of http://marklogic.com/ in an HTTP or XDBC server, the following documents are executable from that server:
http://marklogic.com/default.xqy
http://marklogic.com/myXQueryFiles/search_db.xqy
f) Triggers - The Triggers database is used to store triggers. Triggers are simply executables that process content. A default Triggers database is created during installation, but a separate database can also be used as the triggers database, just as with the Modules database.
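To make the root-prefix rule for the Modules database a little more concrete, here is a minimal Python sketch (not MarkLogic code) of how a request path could map to a module URI when the App Server root is http://marklogic.com/. The function name resolve_module_uri is hypothetical, and the plain string join is an assumption made for illustration; the real server applies more rules than this.

```python
def resolve_module_uri(root: str, request_path: str) -> str:
    """Join an App Server root with a request path to form a module URI.

    Hypothetical helper for illustration only; it is not part of any
    MarkLogic API.
    """
    # Normalise so that exactly one "/" separates the root and the path.
    return root.rstrip("/") + "/" + request_path.lstrip("/")


root = "http://marklogic.com/"
print(resolve_module_uri(root, "/default.xqy"))
print(resolve_module_uri(root, "/myXQueryFiles/search_db.xqy"))
```

With the root above, the two calls produce the same two document URIs listed in the example for the Modules database.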
Forests:-
Forests are the actual storage for content. A forest contains data in the form of XML, text, or binary documents. Forests are created on hosts and attached to a database. One database can have multiple forests attached, but a forest can be attached to only one database at a time. Multiple forests attached to a database appear as one contiguous set of content for query purposes. A forest on its own (not attached to any database) is of no use: no data can be loaded into or saved in a forest that is not attached to a database.
A forest consists of in-memory and on-disk structures called stands.
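The relationship between databases and forests can be pictured with a small toy model in Python. This is only a sketch of the rules described above, not how MarkLogic is implemented; the Forest and Database classes and their methods are hypothetical names invented for the illustration.

```python
class Forest:
    """Toy forest: holds documents keyed by URI (illustrative only)."""

    def __init__(self, name, documents=None):
        self.name = name
        self.documents = dict(documents or {})  # URI -> content
        self.database = None  # a forest belongs to at most one database


class Database:
    """Toy database: stores nothing itself, only references forests."""

    def __init__(self, name):
        self.name = name
        self.forests = []

    def attach(self, forest):
        # A forest can be attached to only one database at a time.
        if forest.database is not None:
            raise ValueError(
                f"{forest.name} is already attached to {forest.database.name}"
            )
        forest.database = self
        self.forests.append(forest)

    def uris(self):
        # Queries see all attached forests as one contiguous set of content.
        return sorted(uri for f in self.forests for uri in f.documents)


db = Database("Documents")
db.attach(Forest("forest-1", {"/a.xml": "<a/>"}))
db.attach(Forest("forest-2", {"/b.xml": "<b/>"}))
print(db.uris())  # ['/a.xml', '/b.xml']
```

Trying to attach forest-1 to a second Database object raises a ValueError, which mirrors the one-database-per-forest rule.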
App Servers:-
App Servers are created, managed, and accessed at the group level in MarkLogic Server. Each App Server is associated with one database and configured on a single port for communication. App Servers are what applications use to access data saved in the forests of a MarkLogic database.
Applications communicate with these App Servers to fetch, insert, search, or delete documents. The following App Servers are available in MarkLogic, each with its own purpose and limitations.
a) HTTP Server
b) XDBC Server
c) WebDAV Server
d) ODBC Server
HTTP Server:-
HTTP Servers in MarkLogic let you create XQuery-based web applications. Using an HTTP Server, we can execute XQuery from a web application against a database to fetch and process data in documents. An HTTP Server lets us return XHTML or XML content to a browser or other HTTP-enabled client applications.
HTTP Servers are defined at the group level and are accessible to all hosts in a group. An HTTP Server provides access to a set of XQuery programs saved in a specific directory structure. Each HTTP Server is connected to a database on a specific port and executes its XQuery programs against that database.
An HTTP Server can execute XQuery code either from a specified location in the file system or from a modules database.
Click Here to see procedures to create and manage an HTTP Server.
XDBC Server:-
XDBC App Servers allow XML Contentbase Connector (XCC) applications to communicate with MarkLogic Server. XCC is an API used by Java and .NET applications to communicate with MarkLogic Server. XDBC Servers are defined at the group level and are accessible to all hosts in a group. An XDBC Server provides access to a specific database (and its forests) and to a root from which a set of XQuery programs residing within a specific directory structure can be accessed.
These XDBC Servers are used to insert, fetch, or delete documents in MarkLogic from .NET or Java applications. XDBC Servers are also used to access XQuery programs or libraries from the Query Console of MarkLogic Server.
An XDBC Server provides access to a specific database, but using the XCC connector we can communicate with any database on a host within the cluster.
Click Here to see procedures to create and manage an XDBC Server.
WebDAV Server:-
WebDAV Servers are used to access database documents and programs as if they were in a file system, using a WebDAV client. A WebDAV Server allows documents to be read, written, or deleted directly in a database, subject to the configured security settings. A WebDAV Server is needed when we want to store and access our XQuery programs in a database under a specific directory in that database.
WebDAV Servers in MarkLogic are similar to HTTP Servers, but have the following important differences:
i) WebDAV Servers cannot execute XQuery code.
ii) WebDAV Servers support the WebDAV protocol to allow WebDAV clients to have read and write access (depending on the security configuration) to a database.
iii) A WebDAV Server only accesses documents and directories in a database; it does not access the file system directly.
WebDAV (Web-based Distributed Authoring and Versioning) is a protocol that extends HTTP to provide the ability to write documents through these HTTP extensions. You need a WebDAV client to write documents, but you can still read them through HTTP (through a web browser, for example).
Click Here for information on creating and configuring a WebDAV Server.
ODBC Server:-
An ODBC Server allows SQL clients to communicate with MarkLogic Server for database operations using SQL statements. The ODBC Server is one of several components in MarkLogic that support SQL queries. Its basic purpose is to return relational-style data stored in MarkLogic in response to SQL queries. The ODBC Server returns data in tuple form and manages server state to support a subset of SQL and ODBC statements. ODBC Servers are created and managed at the group level, and each ODBC Server is associated with a specific database.
Click Here for information on creating and configuring an ODBC Server.
Modules:-
A set of XQuery programs or executables is called modules; they are saved with the .xqy extension. These modules are simply sets of XQuery statements that fetch or process data saved in MarkLogic. Where an App Server looks for its XQuery programs is configured using the Modules setting of the App Server. If the Modules setting is the file system, the XQuery programs are stored in the directory specified as the root of the App Server. If the Modules setting points to a database (for example, the Modules database created by default for this purpose) and we want to store our XQuery programs in that database, we need to create a WebDAV Server for that database (i.e. Modules). This lets us access the directory structure of the database, store our programs in a specific directory, and access each program by prefixing the .xqy file location with the root URL, where the root URL of the App Server should be the top-level directory URL.
So friends, I think we have covered enough theory to start playing with MarkLogic Server. In the next blog we will move on to the practical implementation of these concepts. Each might need a separate blog, but you can also explore further on your own using the MarkLogic Admin Guide.