
Node.js: A simple pattern to increase perceived performance


The asynchronous nature of code running on Node.js provides many interesting options for service orchestration. In this example I call two translation services (Google and SYSTRAN). I call both of them in quick succession (milliseconds apart). The first answer to be returned is the answer returned to the caller; the second answer is ignored. I’ve used a minimal set of Node modules for this: http, url and request. I also wrapped the translation APIs to provide a similar interface, which allows me to call them with the same request objects. You can download the code here. In the picture below this simple scenario is illustrated. I’m not going to talk about the event loop and the call stack; watch this presentation for a nice elaboration on those.

What does it do?

The service I created expects a GET request in the form of:

http://localhost:8000?text=犬&source=ja&target=en

In this case I’m translating the Japanese 犬 to the English dog.

The result of this call in the console is:

Server running at http://127.0.0.1:8000
0s, 0.054ms - Request start
0s, 147.451ms - Google response: dog
0s, 148.196ms - Response returned to caller
0s, 184.605ms - Systran response: dog

The result returned is:

{  "result": "dog",  "source": "Google" }

As you can see, Google is first to respond. The response from Google is returned to the client, which does not have to wait for the result from Systran to come in.

If we slow down the returning of Google’s response by 1 second (using setTimeout), we see the following:

Server running at http://127.0.0.1:8000
0s, 0.003ms - Request start
0s, 107.941ms - Systran response: dog
0s, 108.059ms - Response returned to caller
1s, 78.788ms - Google response: dog

These are just single requests, so timing values differ slightly between runs.

The following result is returned:

{  "result": "dog",  "source": "Systran" }

How does it work?

This setup is surprisingly simple using JavaScript and callbacks. The http module is used to create an HTTP server and listen on a port. The url module is used to parse the incoming request. The request module is used to create the GET request needed for SYSTRAN. See systran-translate.js (I’ve of course changed the API key ;). In the callback function of the server request (which is invoked from the callback functions of the Google and Systran calls) I check whether a response has already been returned. If not, I return it; if it has already been returned, I do nothing.

Below is a snippet from my main file which starts the server, calls the services and returns the response.

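The snippet was originally included as an image. Below is a minimal sketch of the pattern; the ./google-translate wrapper and the translate(options, callback) interface are assumptions of mine, not the original file names (the downloadable code linked above contains the real implementation).

// app.js - sketch of the "first answer wins" pattern
var http = require('http');
var url = require('url');
var googleTranslate = require('./google-translate');    // hypothetical wrapper around the Google module
var systranTranslate = require('./systran-translate');  // wrapper around the SYSTRAN API

http.createServer(function (req, res) {
  var query = url.parse(req.url, true).query;
  var options = { text: query.text, source: query.source, target: query.target };
  var answered = false;

  // whichever callback arrives first sends the response; the other one is ignored
  function respond(source, err, result) {
    if (answered || err) { return; }
    answered = true;
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ result: result, source: source }));
  }

  // both calls are fired milliseconds after each other
  googleTranslate.translate(options, function (err, result) { respond('Google', err, result); });
  systranTranslate.translate(options, function (err, result) { respond('Systran', err, result); });
}).listen(8000, function () {
  console.log('Server running at http://127.0.0.1:8000');
});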

I’ve used the Google API via the node-google-translate-skidz module; not much interesting to show there. For the SYSTRAN translation I used the request module directly.

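The original snippet was also included as an image. The sketch below shows what such a wrapper can look like using the request module; the exact structure of the SYSTRAN response (outputs[0].output) is an assumption, so check the SYSTRAN documentation.

// systran-translate.js - sketch of a wrapper around the SYSTRAN translation API
var request = require('request');

var API_KEY = 'GET_YOUR_OWN_API_KEY';

module.exports.translate = function (options, callback) {
  var requestUrl = 'https://api-platform.systran.net/translation/text/translate' +
    '?key=' + API_KEY +
    '&source=' + options.source +
    '&target=' + options.target +
    '&input=' + encodeURIComponent(options.text);

  // console.log('Systran request: ' + requestUrl);

  request({ url: requestUrl, json: true }, function (err, response, body) {
    if (err) { return callback(err); }
    // console.log('Systran response: ' + JSON.stringify(body));
    // assumption: the translated text is in body.outputs[0].output
    callback(null, body.outputs[0].output);
  });
};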

If you uncomment the console.log lines, you can see the actual request being sent, such as: https://api-platform.systran.net/translation/text/translate?key=GET_YOUR_OWN_API_KEY&source=ja&target=en&input=%E7%8A%AC

%E7%8A%AC is of course the URL-encoded form of 犬.

Why is this interesting?

Suppose you are running a process engine which executes your service orchestration in a single thread. This process engine might in some cases not allow you to split a synchronous request/reply into a separate request and a reply which is received later, often making such a call blocking. When execution is blocked, how are you going to respond to another response arriving at your process? There are also several timeouts to take into account, such as a JTA timeout. What happens if a reply never comes? This can be a serious issue, since it might keep an OS thread blocked, which can cause stuck threads and may even hang the server if it happens often.

Through the asynchronous nature of Node.js, a scenario as shown above suddenly becomes trivial, as you can see from this simple example. By using a pattern like this, you can get much better perceived performance. Suppose you have many clustered services which are all relatively lightweight. Performance of the different services might vary due to external circumstances. If you call a small set of different services at (almost) the same time, you can give the customer a quick response. At the same time you will be calling services whose answer might no longer be interesting by the time it arrives, which increases total system load.

Several things are missing from this example, such as proper error handling. You might also want to return a response if one of the services fails. Also, if the server encounters an error, the entire server crashes; you will want to avoid that. Routing has not been implemented, to keep the example as simple as possible. For security you will of course have your API platform solution.

For more information visit my session at OOW2016: Oracle Application Container Cloud: Back-End Integration Using Node.js



Node.js: My first SOAP service


I created a simple HelloWorld SOAP service running on Node.js. Why did I do that? I wanted to find out whether Node.js is a viable solution to use as a middleware layer in an application landscape. Not all clients can call JSON services; SOAP is still very common. If Node.js is to be considered for such a role, it should be possible to host SOAP services on it. My preliminary conclusion is that it is possible to host SOAP services on Node.js, but you should carefully consider how you want to do this.

I tried to create the SOAP service in two distinct ways.

  • xml2js. This Node.js module allows transforming XML to JSON and back. The JSON which is created can be used to easily access content with JavaScript. This module is fast and lightweight, but does not provide specific SOAP functionality.
  • soap. This Node.js module provides some abstractions and features which make working with SOAP easier. The module is specifically useful when calling SOAP services (when Node.js is the client). When hosting SOAP services, the means to control the specific response to a call are limited (or undocumented).

Using both modules, I encountered some challenges which I will describe and how (and if) I solved them. You can find my sample code here.

xml2js

xml2js is relatively low level and specifically meant to convert XML to JSON and the other way around. It is SAX based, which helps keep the service calls non-blocking. Hosting a service is something you control with Express routers. You can see this here. The rest of the service code can be found here.

body-parser

The first challenge was getting the SOAP request to my service in a usable form. You have to make sure you have the correct body-parser configured for the content type of the request. If you set the body-parser to text for type */*, it works nicely. I did this in my service JavaScript file, since different services might require different body-parsers.

router.use(bodyParser.text({ type: '*/*' }));

Serving the WSDL

In order for a SOAP service to provide functionality, it helps greatly if it can serve its own WSDL. I respond to a GET request on ?wsdl with the WSDL file. Keep in mind that ?wsdl is a GET query string parameter, so you have to check for it accordingly; a sketch is shown below.
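A sketch of that check (the WSDL file name is an assumption):

// sketch: serve the WSDL when the service endpoint is requested with ?wsdl
var express = require('express');
var fs = require('fs');

var router = express.Router();

router.get('/', function (req, res) {
  // ?wsdl arrives as a query string parameter without a value,
  // so test for its presence on req.query
  if (req.query.wsdl !== undefined) {
    res.set('Content-Type', 'text/xml');
    res.send(fs.readFileSync(__dirname + '/HelloService.wsdl', 'utf8'));
  } else {
    res.status(400).send('Retrieve the WSDL with ?wsdl or POST a SOAP request');
  }
});

module.exports = router;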

async

There were several asynchronous function calls in processing the request and creating the response of the service call. To make the code more readable, I used the waterfall function of the async module, which provides something that looks like a synchronous chain and is easier to read than several levels of nested callbacks.
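A sketch of that waterfall structure; parseSoapRequest, processRequest and buildSoapResponse are hypothetical step functions standing in for the actual parsing, business logic and XML building:

// sketch: chain the asynchronous steps with async.waterfall;
// each step passes its result to the next, and the final callback
// either sends the response XML or reports the error
var async = require('async');

function handleSoapRequest(req, res) {
  async.waterfall([
    function (cb) { parseSoapRequest(req.body, cb); },              // XML -> JSON
    function (parsedBody, cb) { processRequest(parsedBody, cb); },  // business logic
    function (result, cb) { buildSoapResponse(result, cb); }        // JSON -> XML
  ], function (err, responseXml) {
    if (err) { return res.status(500).send(err.message); }
    res.set('Content-Type', 'text/xml');
    res.send(responseXml);
  });
}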

Namespace prefix

Namespace prefixes are not static.

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl">
   <soap:Body>
      <tns:sayHelloResponse>
         <tns:greeting>Hello Maarten</tns:greeting>
      </tns:sayHelloResponse>
   </soap:Body>
</soap:Envelope>

Is the same as

<blabla:Envelope xmlns:blabla="http://schemas.xmlsoap.org/soap/envelope/" xmlns:blablabla="http://www.examples.com/wsdl/HelloService.wsdl">
   <blabla:Body>
      <blablabla:sayHelloResponse>
         <blablabla:greeting>Hello Maarten</blablabla:greeting>
      </blablabla:sayHelloResponse>
   </blabla:Body>
</blabla:Envelope>

Thus it is dangerous to query for a fixed namespace prefix in the request. To work around this, I stripped the namespace prefixes. This is of course not a solid solution: if a node contains two elements with the same name but a different namespace, it will fail. Such a case however is rare and best avoided.

parseString(req.body, { tagNameProcessors: [stripPrefix] }, cb);
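stripPrefix here is the tag name processor that ships with xml2js:

// the processors bundled with xml2js include stripPrefix,
// which removes the namespace prefix from every tag name
var stripPrefix = require('xml2js').processors.stripPrefix;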

Finding elements

When a JSON object is returned by xml2js, elements are put in an array of objects. It is dangerous to just pick the first item in the array, because the element you need might not always be at that position (depending on optional elements, for example, and I’m not sure xml2js always puts an object at the same spot in the array). I created a little search function to get around that.
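My function is in the linked code; a sketch of such a helper, which walks the parsed object and returns the first element with a given (prefix-stripped) name, could look like this:

// sketch: recursively search an xml2js result for an element by name
function findElement(node, name) {
  if (node === null || typeof node !== 'object') { return undefined; }
  if (node[name] !== undefined) { return node[name]; }
  var keys = Object.keys(node);
  for (var i = 0; i < keys.length; i++) {
    var found = findElement(node[keys[i]], name);
    if (found !== undefined) { return found; }
  }
  return undefined;
}

// usage: findElement(parsedRequest, 'sayHello') returns the sayHello element (an array)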

Creating the response

xml2js can create XML if you provide it with the correct JSON input. To obtain this correct JSON message, I first used SoapUI to create a sample response message (by creating a mock service). I used this XML as input for the xml2js parseString function. This gave me the correct JSON output, which I could then alter to produce an output message containing the processed input.
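A sketch of that last step with the xml2js Builder; the JSON structure shown here is simplified, not the full SoapUI sample:

var xml2js = require('xml2js');

// JSON shape obtained by running parseString on a sample response,
// with the processed input filled in (simplified)
var responseObject = {
  'soap:Envelope': {
    '$': { 'xmlns:soap': 'http://schemas.xmlsoap.org/soap/envelope/',
           'xmlns:tns': 'http://www.examples.com/wsdl/HelloService.wsdl' },
    'soap:Body': [ { 'tns:sayHelloResponse': [ { 'tns:greeting': [ 'Hello Maarten' ] } ] } ]
  }
};

var responseXml = new xml2js.Builder().buildObject(responseObject);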

soap

Using the soap module, I managed to create a simple helloworld service with very little coding. You can find the code here. I was quite happy with it until I noticed the created response had some errors. I wanted:

<soapenv:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:examples:helloservice">
   <soapenv:Header/>
   <soapenv:Body>
      <urn:sayHelloResponse soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
         <greeting xsi:type="xsd:string">Hello Maarten</greeting>
      </urn:sayHelloResponse>
   </soapenv:Body>
</soapenv:Envelope>

I got:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl">
   <soap:Body>
      <tns:sayHelloResponse>
         <tns:greeting>Hello Maarten</tns:greeting>
      </tns:sayHelloResponse>
   </soap:Body>
</soap:Envelope>

Although it was pretty close, I did not manage to fix the namespace of the sayHelloResponse and greeting elements. That is one of the dangers of abstractions. I noticed that the options for the client part are far more elaborate than those for the server part. I did not find a suitable alternative for this module, though.
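For reference, hosting a service with the soap module comes down to something like the sketch below; the HelloService / HelloPort / sayHello / firstName names are taken from the example WSDL and the WSDL file name is an assumption:

var http = require('http');
var fs = require('fs');
var soap = require('soap');

var wsdlXml = fs.readFileSync(__dirname + '/HelloService.wsdl', 'utf8');

// the structure mirrors service -> port -> operation in the WSDL
var service = {
  HelloService: {
    HelloPort: {
      sayHello: function (args) {
        return { greeting: 'Hello ' + args.firstName };
      }
    }
  }
};

var server = http.createServer(function (req, res) {
  res.end('404: not found: ' + req.url);
});
server.listen(8001);

// expose the service on /helloservice; the WSDL is then served at /helloservice?wsdl
soap.listen(server, '/helloservice', service, wsdlXml);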

Some things

The soap module provides an easy way to create simple SOAP services and clients. However, I found it lacking in flexibility (or in documentation with samples). Some handy features are support for basic authentication and WS-Security headers. Currently I would go for the xml2js option instead of investing serious time in getting the soap module to do exactly what I want. I would consider soap for SOAP clients, though. WS-Security can also be implemented without the soap module (the wssecurity module). The setup with Express routers and body-parsers also works quite nicely. I did not find an easy way to validate a message against a WSDL.


Application Container Cloud: Node.js hosting with enterprise-grade features


Oracle’s Application Container Cloud allows you to run Java SE, Node.js and PHP applications (with more to come) in a Docker container hosted in the Oracle Public Cloud (OPC). Node.js can crash when applications do strange things: think of incorrect error handling, blocking calls or odd memory usage. In order to host Node.js in a manageable, stable and robust way in an enterprise application landscape, certain measures need to be taken. Application Container Cloud provides many of those measures and makes hosting Node.js applications easy. In this blog article I’ll describe why you would want to use Oracle Application Container Cloud and illustrate this with examples from my experience with the product.

Forking (not a real cluster)

Node.js, without specifically coded forking/clustering, runs in a single OS thread. This single thread uses a single CPU. You can fork processes/workers to use multiple CPUs. Node.js provides (among other things) the core module cluster to do this; a minimal example is shown below. It depends on IPC between master and workers (which can be cumbersome to code manually). Also, there is no easy way to gracefully shut down workers and restart them without downtime. StrongLoop (IBM) has developed modules such as strong-cluster-control and strong-store-cluster to make this easier. If the master process fails, however, you still have a problem.

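For reference, forking with the core cluster module looks something like this minimal sketch; the master forks one worker per CPU and starts a new worker when one exits:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', function (worker, code, signal) {
    console.log('Worker ' + worker.process.pid + ' died, starting a new one');
    cluster.fork();
  });
} else {
  // each worker runs its own HTTP server on a shared port
  http.createServer(function (req, res) {
    res.end('Handled by worker ' + process.pid);
  }).listen(8080);
}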

Multiple Node.js instances (a real cluster)

If you want true clustering over Node.js instances, and not just over forks / child processes, you need additional tooling: process managers. The Express site has a short list of the most popular ones: StrongLoop Process Manager, PM2 and Forever. StrongLoop Process Manager, for example, encapsulates features such as nginx load balancing and supervision, as well as devops functions for build, deploy, monitor and scale on remote servers and Docker containers. I have not tried this, but I can imagine it requires some setting up.


Application Container Cloud

Oracle Application Container Cloud provides, out of the box and with very little configuration, a set of clustering and stability features that allow Node.js to run in an enterprise landscape. If you want to get up and running quickly without thinking about many of these things, you should definitely look at Application Container Cloud.

  • Application Container Cloud provides an extensive user interface and API to allow you to do most required tasks easily. You do not need an expert to configure or operate Application Container Cloud (if you like, you can still hire me though ;).
  • Node.js instances (inside Docker containers) can be added / deleted on the fly. The load-balancer (also hosted in the OPC) is configured for you. Thus you get cluster functionality (important for scaling and high availability) over Node.js instances with almost no effort.
  • If one of the instances does not respond anymore, the load-balancer notices and sends requests to other instances. It does this by checking the root of your application.
  • When a Node.js instance crashes (by means of the Node.js process exiting), it is automagically restarted (you can confirm this by looking at the log files).

Creating and deploying an application


Creating an application is easy. You can use the provided API to easily create and manage an application or use the web interface. The format of the application to upload is also very straightforward. You upload a ZIP file with your complete application and a manifest.json file which can contain as little as for example:

{
  "runtime": {
    "majorVersion": "0.12"
  },
  "command": "node app.js",
  "release": {},
  "notes": ""
}

You can also add a deployment.json file which can contain variables you can use within your application. These variables can be updated through the web interface or the API, and the application is restarted to apply the new settings.
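For illustration, a deployment.json might look like the example below; treat the exact set of supported properties as an assumption and check the Application Container Cloud documentation:

{
  "memory": "1G",
  "instances": "2",
  "environment": {
    "MY_SETTING": "some value"
  }
}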

Deployment

After multiple uploads and downloads, I noticed from the log files that my archive did not update. I changed the filename of the ZIP file I uploaded and the problem did not occur anymore. Apparently the uploaded file sometimes cannot be overwritten by a new upload (probably when certain actions happen at the same time). This can be dangerous. For a Continuous Delivery scenario I suggest using unique filenames to avoid this issue.

You can easily use a Maven build with the maven-assembly-plugin to create the archive to be deployed in case you want to embed this in a Continuous Delivery pipeline. You can of course also use the provided Developer Cloud Service. Read more about that here. There is a nice demo here.

Management

Application Container applications can be started, stopped, restarted and deleted. You can add and remove instances, but you cannot start, stop or restart a single instance. When a single instance no longer responds (on the application root URL), the load-balancer notices and stops sending requests to that instance. If you want to make it available again, you have to restart your entire application (or remove and add instances and hope the faulty one is removed and recreated). When adding instances, existing instances keep running and the load-balancer is updated once the new instances have been created. When removing instances, the load-balancer is also updated.


If the application ends the Node.js process, a new process is started immediately. This new process used the same IP and hostname during my tests. If I start/stop or restart the application, the hostname and IP get new values.

Log files

Log files can be downloaded from running applications.


If an application creation fails, you can also download the log files. See an example at ‘Accessing the API’ below. This can greatly help in debugging why the application cannot be created or started. Mostly the errors have been application errors. For example:

Aug 12 14:14:42: Listening on port: 8080
Aug 12 14:14:47: Request on url: /
Aug 12 14:14:47: /u01/app/app.js:49
Aug 12 14:14:47: if (req.url.startsWith("/oraclecloud")) {
Aug 12 14:14:47: ^
Aug 12 14:14:47: TypeError: undefined is not a function
Aug 12 14:14:47: at Server. (/u01/app/app.js:49:15)
Aug 12 14:14:47: at Server.emit (events.js:110:17)
Aug 12 14:14:47: at HTTPParser.parserOnIncoming [as onIncoming] (_http_server.js:492:12)
Aug 12 14:14:47: at HTTPParser.parserOnHeadersComplete (_http_common.js:111:23)
Aug 12 14:14:47: at Socket.socketOnData (_http_server.js:343:22)
Aug 12 14:14:47: at Socket.emit (events.js:107:17)
Aug 12 14:14:47: at readableAddChunk (_stream_readable.js:163:16)
Aug 12 14:14:47: at Socket.Readable.push (_stream_readable.js:126:10)
Aug 12 14:14:47: at TCP.onread (net.js:540:20)

Do mind that the Node.js version running in Application Container Cloud is currently 0.12.14. In this version you have to start Node.js with the --harmony flag if, for example, you want to use startsWith on the String object. My bad obviously.

Managing access

Even though there is an extensive user interface for management and monitoring of the Node.js application, and also an extensive API (more about that below), the load-balancer configuration is not directly accessible. You cannot, for example, configure a specific load-balancer probe or a procedure to check whether an instance is up again. Also, I did not find a way to easily secure my Node.js services to prevent them from being called by just anyone on the internet (I might have missed it though). It would be nice if, for example, the Node.js instance sat behind an API platform or even just a configurable firewall.

Out of memory issue

I have found one situation which is not dealt with accordingly. I tried the following piece of code here.

This exposes three endpoints: /crashmenot, /crashmebykillingprocess and /crashmebyoutofmemory (sketched below). I had created two instances of my application; the hostname in the response message told me which instance I was accessing.
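The linked code is roughly along these lines (a sketch, not the original):

// sketch of the crash-dummy endpoints
var http = require('http');
var os = require('os');

http.createServer(function (req, res) {
  if (req.url === '/crashmebykillingprocess') {
    res.end('Bye bye from ' + os.hostname());
    process.exit(1);                       // ends the Node.js process
  } else if (req.url === '/crashmebyoutofmemory') {
    var leak = [];
    while (true) {                         // keep allocating until the process runs out of memory
      leak.push(new Array(1000000).join('x'));
    }
  } else {
    res.end('Still alive on ' + os.hostname());
  }
}).listen(process.env.PORT || 8080);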

Killing the server by out of memory (a GET request to /crashmebyoutofmemory) gave me the following exception when running locally (among some other things):

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

When running this in Application Container Cloud I did not see the out of memory error. I did specify that the instance should have 1GB of RAM (the minimum allowed). An instance can currently be assigned as much as 20GB of RAM.

Killing the process by causing out of memory did not cause the instance to be restarted automatically. The load-balancer did, however, send me to the instance which was still working. When I killed all instances with out of memory, the application did not recover automatically. After a manual restart, the instances worked again. When the instances were running, a manual restart did not cause noticeable downtime.

Accessing the API

Application Container Cloud has an extensive and well documented API. When debugging issues, it is nice that you can access the Application Container Cloud API; first determine the URL to call.


In this example my application is called CrashDummy7 and my Identity domain is accstrial. You can use an API call like (of course replace user and pass with actual values):

curl -k -i -X GET -u user:pass -H "X-ID-TENANT-NAME:accstrial" https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7
HTTP/1.1 200 OK
Date: Fri, 12 Aug 2016 18:10:57 GMT
Server: Oracle-Application-Server-11g
Content-Length: 1335
X-ORACLE-DMS-ECID: 005EUFDqWVE3z015Rvl3id00068i000ZBB
X-ORACLE-DMS-ECID: 005EUFDqWVE3z015Rvl3id00068i000ZBB
X-Frame-Options: DENY
Content-Language: en
Content-Type: application/json

{"identityDomain":"accstrial","appId":"a2c7998b-c905-466a-b89f-bb9b8c4044dd","name":"CrashDummy7","status":"RUNNING","createdBy":"maarten.smeets@amis.nl","creationTime":"2016-08-12T17:43:17.245+0000","lastModifiedTime":"2016-08-12T17:43:17.202+0000","subscriptionType":"MONTHLY","instances":[{"name":"web.1","status":"RUNNING","memory":"1G","instanceURL":"https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7/instances/web.1"},{"name":"web.2","status":"RUNNING","memory":"1G","instanceURL":"https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7/instances/web.2"}],"runningDeployment":{"deploymentId":"5a9aef17-8408-494c-855e-bc618dfb69e9","deploymentStatus":"READY","deploymentURL":"https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7/deployments/5a9aef17-8408-494c-855e-bc618dfb69e9"},"lastestDeployment":{"deploymentId":"5a9aef17-8408-494c-855e-bc618dfb69e9","deploymentStatus":"READY","deploymentURL":"https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7/deployments/5a9aef17-8408-494c-855e-bc618dfb69e9"},"appURL":"https://apaas.europe.oraclecloud.com/paas/service/apaas/api/v1.1/apps/accstrial/CrashDummy7","webURL":"https://CrashDummy7-accstrial.apaas.em2.oraclecloud.com"}

You can download log files, for example by doing:

curl -u user:pass -k -i -X GET -H "X-ID-TENANT-NAME:accstrial" https://accstrial.storage.oraclecloud.com/v1/Storage-accstrial/_apaas/forwarder8/d177756e-399e-414f-bc9b-d2bb6e698ae6/logs/web.1/f0f62da1-b240-4251-aabe-9c1244069ed6/server.out.zip -o file.zip --raw

Here my application was called forwarder8.

Finally

I have explained when and why it is worthwhile to check out Oracle Application Container Cloud. I’ve also shown some samples of how easy it is to use, described some of my experiences and given some tips on how to work with it efficiently.

I have not touched on, for example, the easy integration with Oracle database instances running inside your cloud. The Oracle DB driver is supplied and configured for you in your Node.js instances and can be used directly with only simple configuration from the UI or API.

If you want to know more, I suggest watching the following presentation from Oracle here. If you want to know more about the Oracle DB driver for Node.js, you can watch this presentation, and about using the driver inside the Application Container Cloud service, see here.

Great stuff!


Node.js and Oracle NoSQL Database


Oracle NoSQL Database is an interesting option to consider when you want a schemaless, fast, scale-able database which can provide relaxed (eventual) consistency. Oracle provides a Node.js driver for this database. In this blog I’ll describe how to install Oracle NoSQL database and how to connect to it from a Node.js application.

The Node.js driver provided by Oracle is currently in preview, version 3.3.7. It uses NoSQL client version 12.1.3.3.4, which does not work with 4.x versions of the NoSQL database, so I downloaded Oracle NoSQL Database, Enterprise Edition 12cR1 (12.1.3.3.5) from here (the version number closest to the version number of the client software).

NoSQL installation

To get NoSQL Database up and running, I followed the steps described in the installation manual here. This was quite educational, as it showed me the parts a NoSQL database consists of. If you want to do a quick installation, you can follow the steps described here. Do mind that the {} variable references do not work in every shell. On Oracle Linux 7.2 (inside VirtualBox) I did the following. First download kv-ee-3.3.5.zip here (previous versions are on the Oracle NoSQL download page).

I made sure Java was installed (I used Oracle Java 1.8, 64-bit) and created the user oracle:

mkdir -p /home/oracle/nosqldb/data
mkdir -p /home/oracle/nosqldb/root
cd /home/oracle/nosqldb
unzip kv-ee-3.3.5.zip

I updated /etc/environment

KVHOME="/home/oracle/nosqldb/kv-3.3.5"
KVDATA="/home/oracle/nosqldb/data"
KVROOT="/home/oracle/nosqldb/root"

Log out and log in again to make the environment settings active.

Next I created an initial boot config

java -jar $KVHOME/lib/kvstore.jar makebootconfig -root $KVROOT -store-security none -capacity 1 -harange 5010,5030 -admin 5001 -port 5000 -memory_mb 1024 -host localhost -storagedir $KVDATA

I started the Oracle NoSQL Database Storage Node Agent (SNA):

nohup java -Xmx256m -Xms256m -jar $KVHOME/lib/kvstore.jar start -root $KVROOT &

And I configured it:

java -Xmx256m -Xms256m -jar $KVHOME/lib/kvstore.jar runadmin -port 5000 -host localhost <<EOF
configure -name mystore
plan deploy-zone -name "LocalZone" -rf 1 -wait
plan deploy-sn -zn zn1 -host localhost -port 5000 -wait
plan deploy-admin -sn sn1 -port 5001 -wait
pool create -name LocalPool
show topology
pool join -name LocalPool -sn sn1
topology create -name topo -pool LocalPool -partitions 10
topology preview -name topo
plan deploy-topology -name topo -wait
show plan
EOF

You can test if it is running by executing:

java -Xmx256m -Xms256m -jar $KVHOME/lib/kvstore.jar ping -port 5000 -host localhost

Oracle NoSQL Database basics

NoSQL Database has built-in load balancing, sharding and several other features related to high availability, quite clearly integrated as an essential part of the software and not as some kind of add-on. Read more about it here.

Also, different consistency models are available. You can sacrifice immediate consistency to gain more performance. Read more about that here.

It provides an Admin console to look at the topology and execution of plans. The console does not allow you to do actual changes to the configuration. For that you can use a CLI.


It also allows you to browse log files


And look at performance details for specific nodes.


For development, kvlite is available. It is a simple single-node store which can be used locally. Read more here. When using kvlite, you do not need to do the configuration described in the installation above.

Node.js application

Installation of the Node.js module is easy: just do an npm install nosqldb-oraclejs. The NoSQL Node.js driver page gives a piece of code which you can use to test your setup. The default installation of NoSQL as described above, however, causes a port conflict with the proxy server which is started. This port conflict is not immediately clear, as it gives you the exception below.

{ NoSQLDB-ConnectionError: Error with NoSQL DB Connection: Error verifying the proxy connection
NoSQLDB-ConnectionError: Error with NoSQL DB Connection: Connection timeout
at /home/oracle/nodejsnosql/node_modules/nosqldb-oraclejs/lib/store.js:277:25
at Timeout._onTimeout (/home/oracle/nodejsnosql/node_modules/nosqldb-oraclejs/lib/store.js:181:7)
at tryOnTimeout (timers.js:228:11)
at Timer.listOnTimeout (timers.js:202:5)

I changed the port and some log settings and used this test application. When I run it with node app.js I get the following output:

[2016-08-16 19:05:02.717] [INFO] [at Object.startProxy (/home/oracle/nodejsnosql/node_modules/nosqldb-oraclejs/lib/proxy.js:353:10)][PROXY] Start proxy

Connected to store
Table is created.
Inserting data...
Reading data...
Writing row #0
Writing row #1
Reading row #0
{ id: 0, name: 'name #0' }
Reading row #1
{ id: 1, name: 'name #1' }
Closing connection...
[2016-08-16 19:05:09.630] [INFO] [at Store.close (/home/oracle/nodejsnosql/node_modules/nosqldb-oraclejs/lib/store.js:299:12)]Store close

Store connection closed.
Shutting down proxy.
Proxy closed.

How does this Node.js module work?

I read that the proxy translates network traffic between the Node.js module and the Oracle NoSQL Database store. The proxy can be spawned as a child process (a JVM) from the Node.js module. A JavaScript Thrift client (see the Thrift protocol) has been generated with the Apache Thrift compiler, which the module uses to communicate with the proxy. The proxy then uses kvclient to connect to the database. This amounts to the chain: Node.js module → Thrift → Java proxy → kvclient (RMI) → NoSQL database. I wondered what the performance cost would be of having a Java proxy and two translations between the Node.js module and the NoSQL database. It would be interesting to compare the bare RMI Java client performance with the Node.js module performance, and to compare the performance of a query executed from within the database with one executed from outside by the RMI kvclient, to determine the cost of the different hops/translations. I can understand the usage of Thrift, though, since it provides a relatively easy way to create clients in different languages.


One of the many nice new features in 12c database: code based access control


The topic of this blog is a nice new feature in 12c, not the PL/SQL package I built that uses it. So here's the story…

For one of our customers we needed a simple schema comparison tool that would be able to check, as part of the application deployment activity, whether there is any discrepancy between the schema that was used to build the application (.ear) files and the target schema of the deployment. Of course there are quite a few schema comparison tools out there in the wild, including those from Oracle such as what’s offered in SQL Developer and in Cloud Control, but none met our requirement that it must be possible to ship this compare-schema code as yet another deployment artifact and have it run automatically as part of the deployment.

After some consideration we decided to look into the SYS-owned DBMS_METADATA PL/SQL package which is available in every Oracle database. Quote from the documentation: “The DBMS_METADATA package provides a way for you to retrieve metadata from the database dictionary as XML or creation DDL and to submit the XML to re-create the object.” We wanted to use the packaged procedure DBMS_METADATA.GET_DDL to retrieve the DDL of each and every object in a source database schema (e.g. the one used for building the application deployment artifacts), store this DDL in a table, and then export and ship this table to the target environment and use it for comparison. The format of our schema-comparison tool should be a PL/SQL packaged procedure with EXECUTE privilege granted to those database users who need it. The Metadata API (mdAPI) has an Oracle dictionary that contains lots of views (starting with “KU$_…”) with SELECT privilege granted to PUBLIC; those views also contain a CURRENT_USERID security clause and a check whether or not the invoker of the view has the SELECT_CATALOG_ROLE. For example, in the definition of the ku$_table_objnum_view view (see ?/rdbms/admin/catmetviews.sql):


AND (SYS_CONTEXT('USERENV','CURRENT_USERID') IN (o.owner_num, 0)
     OR EXISTS ( SELECT * FROM sys.session_roles WHERE role='SELECT_CATALOG_ROLE' ))

this filter translates to: the invoker of the DBMS_METADATA API must be either

  1. the owner of the objects whose metadata is being retrieved,
  2. user SYS, or
  3. a user having the SELECT_CATALOG_ROLE enabled

so to be able to retrieve an object’s metadata using the DBMS_METADATA PL/SQL package, one of the following needs to be true:

  1. our solution PL/SQL code is in the same database schema as the application objects
  2. the userid running our PL/SQL code is SYS (CURRENT_USERID=0)
  3. when the stored PL/SQL code is run, the SELECT_CATALOG_ROLE is enabled

As we want to make our PL/SQL procedure available in a central schema, only the last option is viable. The SYS-owned DBMS_METADATA package itself runs with invoker’s rights (AUTHID CURRENT_USER) and as such relies on the privileges in the security context of the invoker. Since the DBMS_METADATA package is called by our own PL/SQL package TBS_UTIL (i.e. the invoker of DBMS_METADATA), it matters whether our own TBS_UTIL package is defined with definer’s rights (AUTHID DEFINER) or invoker’s rights. If our TBS_UTIL package were defined with invoker’s rights, then any user with EXECUTE privilege on TBS_UTIL would need the SELECT_CATALOG_ROLE enabled in its session when executing it. If, however, we create our TBS_UTIL package with definer’s rights, then we are in trouble, since the SELECT_CATALOG_ROLE role that the owner of the TBS_UTIL package must have cannot be used while TBS_UTIL is being executed. Since we don’t want to grant the SELECT_CATALOG_ROLE role to every database user that needs to be able to execute the TBS_UTIL package, we need to solve our privilege problem. Fortunately, and that’s the whole topic of this blog, Oracle Database 12c has this nice new feature called CBAC.

What’s CBAC again? Remember RBAC (Role Based Access Control), where roles granted to users determine the sets of permissions granted to those users, and ABAC (Attribute Based Access Control), where runtime evaluation of (dynamically changing) attributes such as time of day, location or strength of the authentication method is used in access control decisions? Here’s another sibling: CBAC (Code Based Access Control). The role privileges are granted to the PL/SQL code itself, in addition to being granted to the definer of the PL/SQL code. This means the following two GRANT statements should be run by user SYS:


-- 1st : grant the role to the definer of the plsql package

GRANT SELECT_CATALOG_ROLE TO TBS_UTIL;

-- 2nd : issue the CBAC statement itself
GRANT SELECT_CATALOG_ROLE TO PACKAGE UTILITY.TBS_UTIL;

If the definer of the PL/SQL package TBS_UTIL, user UTILITY, hasn’t been granted the SELECT_CATALOG_ROLE role, the CBAC statement fails, as you can see below:

GRANT SELECT_CATALOG_ROLE TO PACKAGE UTILITY.TBS_UTIL
 *
 ERROR at line 1:
 ORA-01924: role 'SELECT_CATALOG_ROLE' not granted or does not exist

but after we have granted the SELECT_CATALOG_ROLE role back to UTILITY, the owner of the PL/SQL package TBS_UTIL:


GRANT SELECT_CATALOG_ROLE TO UTILITY;
Grant succeeded.

the grant can be given:

GRANT SELECT_CATALOG_ROLE TO PACKAGE UTILITY.TBS_UTIL;
Grant succeeded.

and the user C1 can call the packaged procedure UTILITY.TBS_UTIL.GET_SOURCE_DDL without any privilege issues. Without the CBAC statement, user C1 gets an ORA-31603 error (“object \"%s\" of type %s not found in schema \"%s\"”) when invoking the packaged procedure UTILITY.TBS_UTIL.GET_SOURCE_DDL.

In the example shown below, the central schema containing our solution PL/SQL package TBS_UTIL is called UTILITY.
DBMS_METADATA is owned by SYS and created with invoker’s rights (AUTHID CURRENT_USER).
Our PL/SQL package can be identified as UTILITY.TBS_UTIL and has a packaged procedure GET_SOURCE_DDL; the TBS_UTIL package is created with definer’s rights (AUTHID DEFINER).
EXECUTE privilege on the TBS_UTIL package has been granted to database user C1.
The schema whose metadata we are after in this example is called PM_OWNER and it contains a package with a PL/SQL function called GETBIN:


C1> exec utility.tbs_util.get_source_ddl('PM_OWNER')

ORA-06502: PL/SQL: numeric or value error
ORA-06512: at "UTILITY.TBS_UTIL", line 83
ORA-06512: at "UTILITY.TBS_UTIL", line 478
ORA-31603: object "GETBIN" of type FUNCTION not found in schema "PM_OWNER"
ORA-06512: at "UTILITY.TBS_UTIL", line 541
ORA-06512: at line 1

What if definer user UTILITY suddenly wore a black hat one morning and decided to recreate the TBS_UTIL package code using the CREATE OR REPLACE clause (thereby preserving any grants) with something nasty that became possible due to the privileges associated with the SELECT_CATALOG_ROLE role? A security officer’s nightmare, that’s what it would be. Fortunately, it turns out that the CBAC grant needs to be repeated when Oracle detects that the code that has been granted a role has changed. Calls will fail until user SYS has granted the SELECT_CATALOG_ROLE role to the PL/SQL package TBS_UTIL again.

More on Code Based Access Control can be found in the Oracle documentation here.

Happy CBAC’ing!


Login with OAuth2.0 using AngularJS 1.5 componentrouter and Node.js


For one of my projects I need my users to log in to a 3rd party API (in my case Strava) using OAuth2.0.
Framework of choice: AngularJS 1.5 in plain JavaScript, with the Angular 1.5 component router. Since I have API keys which I do not want to be known to the client, I also need my own mini server using Node.js, Express and a bit of request.

OAuth2 in a nutshell:

Someone wants to log in to my application and presses the “Login” button. When the user presses the button there is a redirect to the 3rd party website, where the user actually logs in. The redirect is done in the client, because when I do a res.redirect() to the URL in the backend with Express, I get a CORS error.
The user is then redirected to my callback page with a code in the URL. I take this code to Strava to trade it for an access token so the user can get data. The trading also uses my API key, so this is all done on the server.

How to do this:

I have a main component with a login button; when the user clicks this button, he or she is redirected to the 3rd party login.

angular.module('myapp',['ngComponentRouter',"how","search"])
.value('$routerRootComponent', 'app')
.component('main', {
  template:
 ' <button ng-click="$ctrl.login()" class="stravaButton"></button>' +
 '<nav>' +
 ' <a ng-link="[\'How\']">How page</a>' +
 ' <a ng-link="[\'Search\']">Search page</a>' +
 '</nav>' +
 '<ng-outlet></ng-outlet>',
  controller:MainComponent,
  $routeConfig: [
    {path: '/how', name: 'How', component: 'how', useAsDefault: true},
    {path: '/search', name: 'Search', component: 'search' }
   ]
  });

  function MainComponent($window){
    var $ctrl = this;
    $ctrl.login = function(){
      $window.location.href = "https://www.strava.com/oauth/authorize?client_id=<MY_CLIENT_ID>&response_type=code&redirect_uri=http://localhost:3000/callback";
    }
 }

As we can see in the $window.location.href, after the user has logged in, the user is redirected to http://localhost:3000/callback. For this we make a callback component, which we add to the $routeConfig in the main component:

$routeConfig: [
    {path: '/how', name: 'How', component: 'how', useAsDefault: true},
    {path: '/search', name: 'Search', component: 'search' },
    {path: '/callback', name: 'Callback', component: 'callback' }
   ]
  });

Now for the callback component. In this component we get the code that was sent by the 3rd party using $location.search().code. This callback component does a request to the backend to exchange the code we got for an access token.

This is what the component looks like:

angular.module('callback', [])
  .component('callback', {
    controller:CallbackComponent
  })

function CallbackComponent($location,loginService){

    var code = $location.search().code;
    loginService.loginUser(code).then(function(data){
    })
}

This is what our (client-side) loginService looks like:

angular.module('myapp')
    .factory('loginService',function($http, $q){
   var loginUrl = '/api/login';

   loginUser = function(code){
         var login = $http.get(loginUrl + '/' + code).then(function(response){
            var data = response.data;
             if(data.access_token !== undefined && data.access_token !== null){
                //success
             }
         })
         return login;
    }
    return {
      loginUser:loginUser
     };
});

Note: for $location.search() to work we need to enable HTML5 mode in the configuration.

angular.module('myapp')
    .config(function($locationProvider){
    $locationProvider.html5Mode(true);
});

In the backend we send a POST request to the endpoint that will give us the token. We send the client id and client secret with the POST. If everything goes well, we receive the token which we send back to the client.

var express = require('express');
var request = require('request')
var app = express.Router();

app.get('/:code', function(req,res){
  var mycode = req.params.code;
  var endpoint = 'https://www.strava.com/oauth/token';

    var url = endpoint
        , options = {
            url: url
            , method: 'POST'
            , json: true
            ,form : {
                client_id : "MYCLIENTID" ,
                client_secret : "<MY_CLIENT_SECRET>" ,
                code : mycode
            }
        };

    request(options, function (err, response, payload) {
        if (err  || response.statusCode !== 200) {
            console.log('api call error');
            console.log(err);
        }
        res.status(response.statusCode).send(payload)
    });

});
module.exports = app;

When the call was successful we store the access token in localStorage using ngStorage, so the user does not need to log in again when he/she comes back to the website. I made a separate client-side service for that, called the userService, which is used in the loginService.

Changes to the loginService:

angular.module('myapp')
    .factory('loginService',function($http, $q, userService){
    var loginUrl = '/api/login';

   loginUser = function(code){

         var login = $http.get(loginUrl + '/' + code).then(function(response){
            var data = response.data;
             if(data.access_token !== undefined && data.access_token !== null){
                userService.setToken(data.access_token)
             }
         })
         return login;
    }
    return {
      loginUser:loginUser
     };
});

The userService:

angular.module('myapp')
    .factory('userService',function($http, $localStorage){

    isLoggedIn = function(){
        // check the token we actually store in setToken
        return($localStorage.userToken !== undefined);
    }

    setToken = function(token){
        $localStorage.userToken = token

    }
    getToken = function(){
        return $localStorage.userToken;
    }
    return {
      getToken:getToken,
      setToken:setToken,
      isLoggedIn:isLoggedIn
     };
});

ngStorage also has the advantage that we can put an ng-if on DOM elements that watches the storage. That way we can make some elements appear only if the user is logged in (or not), like the login button and the menu items I only want to show when the user is logged in. Let’s do that as well in the main component.

"use strict";
angular.module('myapp', ['ngComponentRouter', 'ngStorage',"how","callback","search"])
  .value('$routerRootComponent', 'app')

  .component('app', {
    template:
    '<button ng-if="$ctrl.$storage.userToken === undefined" ng-click="$ctrl.login()">LOG IN</button>' +
    '</div>' +
    '<nav>' +
    ' <a ng-link="[\'How\']">How does it work</a>' +
    ' <a ng-if="$ctrl.$storage.userToken !== undefined" ng-link="[\'Search\']">Search</a>' +
    '</nav>' +
    '<ng-outlet></ng-outlet></div>',
    controller: MainComponent,
    $routeConfig: [
      { path: '/how', name: 'How', component: 'how', useAsDefault: true },
      { path: '/search', name: 'Search', component: 'activities' },
      { path: '/callback', name: 'Callback', component: 'callback' }
    ]
  });


function MainComponent($window, $localStorage) {

  var $ctrl = this;

  $ctrl.$storage = $localStorage;

  $ctrl.login = function () {
    $window.location.href = "https://www.strava.com/oauth/authorize?client_id=<MY_CLIENT_ID>&response_type=code&redirect_uri=http://localhost:3000/callback";

  }
}

Also, if the login was successful, we redirect the user to the part of the website where we show the data we are getting from the 3rd party API.

angular.module('callback', [])
  .component('callback', {
    controller:LoginComponent
  })

function LoginComponent($location,loginService,$rootRouter){

    var code = $location.search().code;
    loginService.loginUser(code).then(function(data){
      $rootRouter.navigate(['Search']);
    })
};

And that is it: we can now log in with a 3rd party API using OAuth2.0, the Angular 1.5 component router and Node.js 🙂
A full working example can be found on my GitHub page; don’t forget to put your own Strava API keys in, or change it to match your 3rd party API!

Blogs I found helpful:
http://jasonwatmore.com/post/2015/03/10/AngularJS-User-Registration-and-Login-Example.aspx
https://www.codementor.io/nodejs/tutorial/how-to-implement-twitter-sign-expressjs-oauth
https://www.thepolyglotdeveloper.com/2015/03/using-oauth-2-0-in-your-web-browser-with-angularjs/


First steps in deploying Node.js on Heroku


Recently, I used Heroku to deploy a website into the great wide open!
In this blog post I want to tell you about my first steps using Heroku, including the rookie mistakes I made, so you don’t have to make them.

Heroku in a nutshell:

“Heroku is a cloud platform that lets companies build, deliver, monitor and scale apps.”
You can deploy your Node.js application (or Rails or…) there with a simple git push.
The app will be built, deployed, and can then be accessed via a link in the form of yourappname.herokuapp.com.

Besides the fact that it is easy to use, you also get a decent amount of online time for free, and it has a very good getting-started tutorial which can be read on the Heroku website.

Let’s start

First of all you need to create an account on Heroku and install the Heroku toolbelt. If you have done that, you can log in to Heroku on the command line with:

 heroku login 

Enter your credentials and you are logged in.
Then make a new project using:

 heroku create  

It will make you an app with a random app name. If you want to give the app your own name, you can put the name of your application behind it, like:

 heroku create amis-demo-app 

Make sure the application uses git and has a package.json. So:

git init
npm init

Name the app however you want. Don’t forget to .gitignore the node_modules.

Then make a little node server with express named server.js.
My first application is super easy: go to the page and it will send a “Hello world”.

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  res.send('Hello World!');
});

app.listen(3000, function () {
  console.log('Example app listening on port 3000!');
});

Run it locally to see the Hello world! in the browser.

Now we want to put this on Heroku.
First we need a start script so Heroku knows what to run.
Add it to the script in the package.json:

 "scripts": {
    "start":"node server.js"
  }

Then add the engines. The what? The engines!
Locally I use npm 3.10.3 and Node v6.3.0, but Heroku doesn’t know that and has default versions of npm and Node. To make sure that Heroku uses the same versions as we are using locally, we set the engines in the package.json:

 "engines": {
        "node": "6.3.0",
        "npm": "3.10.3"
    }

Deploying the application

Now that we have made sure Heroku uses the same engines as we do, we only need to link our repository to the remote Heroku git URL, which you can find in your dashboard.
Add the remote with git, add and commit everything and push to Heroku. Since it is the first push, we need to set the upstream branch; after this first one we can use a normal git push heroku.
Don’t forget to .gitignore the node_modules, as Heroku will do the npm install!

 git remote add heroku https://git.heroku.com/amis-demo-app.git   # use your own app URL here!
git add -A
git commit -m "init"
git push --set-upstream heroku master

Let’s go to the website and see how our “Hello world” is doing. Uh oh, we are greeted by an application error page instead.

This was my first real mistake. Luckily you can easily find out what happened by using heroku logs in the console, like this:

 heroku logs

It shows me the log output, with the most important message being:

 Error R10 (Boot timeout) -> Web process failed to bind to $PORT within 60 seconds of launch

With a bit of googling on that error I found the problem:
Heroku dynamically assigns your application to a port, so you can’t set the port to a fixed number, which is exactly what I did. Heroku adds the port to the environment, so we need to pull it from there. We change the app.listen to this:

app.listen(process.env.PORT || 3000, function () {
  console.log('Example app listening');
});

You can also do this using the Heroku Procfile; check out their website for info on that one.

Now we commit and push and see what happens.
There it is!

Serving a frontend

The app is working. But I don’t want just a backend serving Hello world; I want a real HTML frontend, displaying the AMIS logo!
Let’s make a public folder with an index.html and an images folder containing the image amislogo.jpg.

<html>
    <base href="/">
    <head></head>
    <body>
        <div>Hello</div>
        <img src="/images/amislogo.JPG"></img>
    </body>
</html>

We let the backend serve the frontend:

var express = require('express');
var path = require('path')
var app = express();

app.use(express.static(path.join(__dirname ,'/public')));

app.listen(process.env.PORT || 3000, function () {
  console.log('Example app listening');
});

If you run it locally you will now see a website with an AMIS image.

Add and commit all the files, push them to Heroku, wait for the build, open the app and…
My second mistake. What happened, where is the image? If you have a keen eye you saw the mistake I made, which I can blame on my Windows sloppiness. Heroku is Linux, and Linux is case sensitive. Locally I use Windows, which doesn’t care: whether I type JPG, JpG or jpg, it will show my image. Linux does care, so be careful! If we change amislogo.JPG to amislogo.jpg in the index.html it will show the image, just as planned :).
One last thing:
If you have Bower dependencies for the frontend in the public folder instead of node_modules at the root, you can make Heroku automatically install those too by adding a postinstall script in the package.json, like this:

"scripts": {
    "start": "node server.js",
    "postinstall": "bower install"
  }

The code of this app can be found on my GitHub and of course you can view this beautiful app at amis-demo-app.herokuapp.com/
Resources:
https://devcenter.heroku.com/


Oracle NoSQL Database 4.x and the Node.js driver 3.x


There are two ways you can access Oracle NoSQL Database from a Node.js application: you can use the nosqldb-oraclejs driver, or you can use Oracle REST Data Services.


In my previous blog post I illustrated how you can access Oracle NoSQL Database by using the nosqldb-oraclejs driver. I encountered an issue when using NoSQL Database version 12R1.4.0.9 with the currently newest available Node.js driver for NoSQL Database, nosqldb-oraclejs 3.3.15.

INFO: PS: Connect to Oracle NoSQL Database mystore nodes : localhost:5000
Aug 15, 2016 10:10:06 PM oracle.kv.proxy.KVProxy
INFO: PS: … connected successfully
Exception in thread "main" oracle.kv.FaultException: Unable to get table metadata:Illegal character in: SYS$IndexStatsLease (12.1.3.3.4)
Fault class name: org.apache.avro.SchemaParseException
Remote stack trace: org.apache.avro.SchemaParseException: Illegal character in: SYS$IndexStatsLease
at org.apache.avro.Schema.validateName(Schema.java:1068)
at org.apache.avro.Schema.access$200(Schema.java:79)

The nosqldb-oraclejs driver currently only supports NoSQL Database versions 3.x, not 4.x. There is a workaround available though, kindly provided by Oracle.

nosqldb-oraclejs 3.x and NoSQL database 4.x

The nosqldb-oraclejs driver creates an instance of a proxy server by starting a JVM. This proxy server uses kvclient.jar to connect to the database using RMI. The kvclient.jar from the 3.x version of the driver does not like talking to a 4.x version of the database.


The first try was replacing the kvclient.jar from the 3.x version of the driver with the kvclient.jar supplied with version 4.x of the database. This allowed the proxy server to start, but when I connected to it, it gave the following error:

[ERROR] [at Object.getProxyError (/home/oracle/nosqlnode/nodejssamples/nosql/node_modules/nosqldb-oraclejs/lib/errors.js:165:10)]Capturing proxy error: TProxyException: org/antlr/v4/runtime/RecognitionException

In order to get the proxy server working completely, I had to download the Java driver from the Oracle download site: kv-client-4.0.9.zip (the version corresponding to the version of the database). This zip file contains a directory kv-4.0.9/lib with several other jar files which were also required to get the proxy server working. When I copied the contents of that folder to my node_modules/nosqldb-oraclejs/kvproxy folder (the same folder as kvproxy.jar) and restarted the proxy server, I could use the driver. I have not checked thoroughly, though, whether all functionality works as expected. I used the example which is available here to check that it works.

Finally

A drawback of this solution is that you manually update driver files after they have already been fetched from NPM. When you remove the node_modules folder and do an npm install, the jar files will be reverted to their original 3.x versions. This can be an issue in a continuous delivery environment. Probably the best solution for this is using your own repository manager to proxy remote npm registries such as https://registry.npmjs.org, for example JFrog Artifactory or Sonatype Nexus, which give you full control over the artifacts. There you can create your own version of the driver which is already bundled with the correct jar files. This has the additional benefit of improving performance, since every client downloads the libraries from a local repository proxy instead of from the internet. It also gives you more control over which libraries can be used and some visibility into which ones are being used.


The post Oracle NoSQL Database 4.x and the Node.js driver 3.x appeared first on AMIS Oracle and Java Blog.


Node.js application writing to MongoDB – Kafka Streams findings read from Kafka Topic written to MongoDB from Node


MongoDB is a popular, lightweight, highly scalable, very fast and easy to use NoSQL document database. Written in C++, working with JSON documents (stored in the binary format BSON) and processing JavaScript commands using the V8 engine, MongoDB easily ties in with many different languages and platforms, one of which is Node.js. In this article, I first of all describe how a very simple interaction between Node.js and MongoDB can be implemented.

 


Then I do something a little more challenging: the Node.JS application consumes messages from an Apache Kafka topic and writes these messages to a MongoDB database collection, to make the results available for many clients to read and query. Finally I will show a little analytical query against the MongoDB collection, to retrieve some information we would not have been able to get from the plain Kafka Topic (although with Kafka Streams it just may be possible as well).

You will see the MongoDB driver for Node.js in action, as well as the kafka-node driver for Apache Kafka from Node.js. All resources are in the GitHub repo: https://github.com/lucasjellema/kafka-streams-running-topN.

Prerequisites

Node.JS is installed, as is MongoDB.

Run the MongoDB server. On Windows, the command is mongod, optionally followed by the dbpath parameter to specify in which directory the data files are to be stored:

mongod --dbpath c:\node\nodetest1\data\

For the part where messages are consumed from a Kafka Topic, a running Apache Kafka Cluster is  available – as described in more detail in several previous articles such as https://technology.amis.nl/2017/02/13/kafka-streams-and-nodejs-consuming-and-periodically-reporting-in-node-js-on-the-results-from-a-kafka-streams-streaming-analytics-application/.

 

Getting Started

Start a new Node application, using npm init.

Into this application, install the npm packages kafka-node and mongodb:

npm install mongodb --save

npm install kafka-node --save

This installs the two Node modules with their dependencies and adds them to the package.json.
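The resulting dependencies section in package.json looks roughly like the sketch below; the version ranges are placeholders for whatever npm resolves at install time:

"dependencies": {
  "kafka-node": "1.x",
  "mongodb": "2.x"
}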

 

First Node Program – for Creating and Updating Two Static Documents

This simple Node.js program uses the mongodb driver for Node, connects to a MongoDB server running locally and to a database called test. It then tries to update two documents in the top3 collection in the test database; if a document does not yet exist (based on the key, which is the continent property) it is created. When the application is done running, two documents exist (and have their lastModified property set if they were updated).

var MongoClient = require('mongodb').MongoClient;
var assert = require('assert');

// connect string for mongodb server running locally, connecting to a database called test
var url = 'mongodb://127.0.0.1:27017/test';

MongoClient.connect(url, function(err, db) {
  assert.equal(null, err);
  console.log("Connected correctly to server.");
   var doc = {
        "continent" : "Europe",
         "nrs" : [ {"name":"Belgium"}, {"name":"Luxemburg"}]
      };
   var doc2 = {
        "continent" : "Asia",
         "nrs" : [ {"name":"China"}, {"name":"India"}]
      };
  insertDocument(db,doc, function() {
    console.log("returned from processing doc "+doc.continent);
    insertDocument(db,doc2, function() {
      console.log("returned from processing doc "+doc2.continent);
      db.close();
      console.log("Connection to database is closed. Two documents should exist, either just created or updated. ");
      console.log("From the MongoDB shell: db.top3.find() should list the documents. ");
    });
  });
});

var insertDocument = function(db, doc, callback) {
   // first try to update; if a document could be updated, we're done
   console.log("Processing doc for "+doc.continent);
   updateTop3ForContinent( db, doc, function (results) {
       if (!results || results.result.n == 0) {
          // the document was not updated so presumably it does not exist; let's insert it
          db.collection('top3').insertOne(
                doc
              , function(err, result) {
                   assert.equal(err, null);
                   callback();
                }
              );
       }//if
       else {
         callback();
       }
 }); //updateTop3ForContinent
}; //insertDocument

var updateTop3ForContinent = function(db, top3 , callback) {
   db.collection('top3').updateOne(
      { "continent" : top3.continent },
      {
        $set: { "nrs": top3.nrs },
        $currentDate: { "lastModified": true }
      }, function(err, results) {
      //console.log(results);
      callback(results);
   });
};

The console output from the Node application:

[screenshot]

The output on the MongoDB Shell:

[screenshot]

Note: I have used db.top3.find() three times: before running the Node application, after it has run once and after it has run a second time. Note that after the second time, the lastModified property was added.
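As a minimal sketch, this is the kind of check meant here, executed from the MongoDB shell:

// switch to the test database and list the documents in the top3 collection
use test
db.top3.find().pretty()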

Second Node Program – Consume messages from Kafka Topic and Update MongoDB accordingly

This application registers as Kafka Consumer on the Topic Top3CountrySizePerContinent. Each message that is produced to that topic is consumed by the Node application and handled by function handleCountryMessage. This function parses the JSON message received from Kafka, adds a property continent derived from the key of the Kafka message, and calls the insertDocument function. This function attempts to update a record in the MongoDB collection top3 that has the same continent property value as the document passed in as parameter. If the update succeeds, the handling of the Kafka message is complete and the MongoDB collection  contains the most recent standings produced by the Kafka Streams application. If the update fails, presumably that happens because there is no record yet for the current continent. In that case, a new document is inserted for the continent.


/*
This program connects to MongoDB (using the mongodb module )
This program consumes Kafka messages from topic Top3CountrySizePerContinent to which the Running Top3 (size of countries by continent) is produced.

This program records each latest update of the top 3 largest countries for a continent in MongoDB. If a document does not yet exist for a continent (based on the key which is the continent property) it is inserted.

The program ensures that the MongoDB /test/top3 collection contains the latest Top 3 for each continent at any point in time.

*/

var MongoClient = require('mongodb').MongoClient;
var assert = require('assert');

var kafka = require('kafka-node')
var Consumer = kafka.Consumer
var client = new kafka.Client("ubuntu:2181/")
var countriesTopic = "Top3CountrySizePerContinent";


// connect string for mongodb server running locally, connecting to a database called test
var url = 'mongodb://127.0.0.1:27017/test';
var mongodb;

MongoClient.connect(url, function(err, db) {
  assert.equal(null, err);
  console.log("Connected correctly to MongoDB server.");
  mongodb = db;
});

var insertDocument = function(db, doc, callback) {
   // first try to update; if a document could be updated, we're done
   updateTop3ForContinent( db, doc, function (results) {
       if (!results || results.result.n == 0) {
          // the document was not updated so presumably it does not exist; let's insert it
          db.collection('top3').insertOne(
                doc
              , function(err, result) {
                   assert.equal(err, null);
                   console.log("Inserted doc for "+doc.continent);
                   callback();
                }
              );
       }//if
       else {
         console.log("Updated doc for "+doc.continent);
         callback();
       }
 }); //updateTop3ForContinent
}; //insertDocument

var updateTop3ForContinent = function(db, top3 , callback) {
   db.collection('top3').updateOne(
      { "continent" : top3.continent },
      {
        $set: { "nrs": top3.nrs },
        $currentDate: { "lastModified": true }
      }, function(err, results) {
      //console.log(results);
      callback(results);
   });
};

// Configure Kafka Consumer for the Kafka Top3 Topic and handle each Kafka message (by calling handleCountryMessage)
var consumer = new Consumer(
  client,
  [],
  {fromOffset: true}
);

consumer.on('message', function (message) {
  handleCountryMessage(message);
});

consumer.addTopics([
  { topic: countriesTopic, partition: 0, offset: 0}
], () => console.log("topic "+countriesTopic+" added to consumer for listening"));

function handleCountryMessage(countryMessage) {
    var top3 = JSON.parse(countryMessage.value);
    var continent = new Buffer(countryMessage.key).toString('ascii');
    top3.continent = continent;
    // insert or update the top3 in the MongoDB server
    insertDocument(mongodb,top3, function() {
      console.log("Top3 recorded in MongoDB for "+top3.continent);
    });

}// handleCountryMessage

Running the application produces the following output.

Producing Countries:

[screenshot]

Producing Streaming Analysis – Running Top 3 per Continent:

[screenshot]

Processing Kafka Messages:

[screenshot]

Resulting MongoDB collection:

[screenshot]

And after a little while, here is the latest situation for Europe and Asia in the MongoDB collection:

[screenshot]

Resulting from processing the latest Kafka Stream result messages:

[screenshot]

 

 

Querying the MongoDB Collection

The current set of top3 documents – one for each continent – stored in MongoDB can be queried, using MongoDB find and aggregation facilities.

One query we can perform is to retrieve the top 5 largest countries in the world. Here is the query that gives us that insight. First it creates a single record per country (using $unwind to flatten the nrs array in each top3 document). The countries are then sorted by size (descending) and the first 5 of the sort result are retained. These five are then projected into a nicer looking output document that only contains continent, country and area fields.


db.top3.aggregate([
   {$project: {nrs:1}}
  ,{$unwind:'$nrs'}
  , {$sort: {"nrs.size":-1}}
  , {$limit:5}
  , {$project: {continent:'$nrs.continent', country:'$nrs.name', area:'$nrs.size' }}
])

[screenshot]

(And because no continent has its number 3 country in the top 4 of this list, we can be sure that this top 5 is the actual top 5 of the world)
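Another query the collection supports, shown here only as a sketch, is the combined area of the top 3 countries per continent, again using $unwind followed by a $group stage:

db.top3.aggregate([
   {$unwind: '$nrs'}
  ,{$group: { _id: '$continent', top3Area: { $sum: '$nrs.size' }, countries: { $push: '$nrs.name' } }}
  ,{$sort: { top3Area: -1 }}
])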

 

Resources

A very good read – although a little out of date – is this tutorial on 1st and 2nd steps with Node and Mongodb: http://cwbuecheler.com/web/tutorials/2013/node-express-mongo/ 

MongoDB Driver for Node.js in the official MongoDB documentation: https://docs.mongodb.com/getting-started/node/client/ 

Kafka Connect for MongoDB – YouTube intro – https://www.youtube.com/watch?v=AF9WyW4npwY 

Combining MongoDB and Apache Kafka – with a Java application talking and listening to both: https://www.mongodb.com/blog/post/mongodb-and-data-streaming-implementing-a-mongodb-kafka-consumer 

Tutorials Point MongoDB tutorials – https://www.tutorialspoint.com/mongodb/mongodb_sort_record.htm 

Data Aggregation with Node.JS driver for MongoDB – https://docs.mongodb.com/getting-started/node/aggregation/

The post Node.js application writing to MongoDB – Kafka Streams findings read from Kafka Topic written to MongoDB from Node appeared first on AMIS Oracle and Java Blog.

DIY Parallelization with Oracle DBMS_DATAPUMP


Oracle dbms_datapump provides a parallel option for exports and imports, but some objects cannot be processed in this mode. In a migration project from AIX 11gR2 to an ODA X5-2 (OL 5.9) running 12c, which included an initial load for Golden Gate, I had to deal with one of those objects: a 600G table with LOB fields, stored in the database as BasicFiles (= traditional LOB storage).

By applying some DIY parallelization I was able to bring the export time back from 14 hours to 35 minutes.
Instrumental in this solution are the handy "detach" feature of the dbms_datapump package and the use of dbms_rowid to "split" the table data into same-sized chunks. The first allowed me to define and start datapump jobs without having to wait until each one is finished, the second results in all jobs ending within just a short time of each other.

The following PL/SQL exports tables in 32 chunks with 32 concurrent datapump jobs. Feel free to adjust this "dop" as well as the schema and table names. Just one parameter is provided: it makes the export procedure as a whole wait for the end of all exports, so some other action (e.g. a file transfer) may start automatically.
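To illustrate the chunking idea outside of the package: each of the 32 jobs only exports the rows whose block number maps to its own slice. Conceptually, job number <counter> (0..31) handles the rows selected by the hypothetical query below (the schema and table names are placeholders, just like in the package code):

-- rows that end up in chunk <counter> of a 32-way split
SELECT COUNT(*)
FROM   <SCHEMA01>.<TABLE01>
WHERE  MOD( DBMS_ROWID.ROWID_BLOCK_NUMBER( ROWID ), 32 ) = <counter>;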

CREATE OR REPLACE PACKAGE Datapump_Parallel_Exp_Pck
  IS
    g_parallel   CONSTANT NUMBER       := 32;
    g_dmp_dir    CONSTANT VARCHAR2(25) := 'DATA_PUMP_DIR';

-------------------------------------------------------------------------------------------------
PROCEDURE Exec_Export
   ( P_wait IN PLS_INTEGER := 0 );

--------------------------------------------------------------------------------------------------
END Datapump_Parallel_Exp_Pck;
/

SHOW ERRORS;


CREATE OR REPLACE PACKAGE BODY Datapump_Parallel_Exp_Pck
  IS

-------------------------------------------------------------------------------------------------
PROCEDURE Sleep
  (P_milliseconds IN NUMBER)
 AS LANGUAGE JAVA
    NAME 'java.lang.Thread.sleep(long)';

-------------------------------------------------------------------------------------------------
FUNCTION Get_Current_Scn
  RETURN NUMBER
    IS
    v_ret NUMBER := 0;
BEGIN

  SELECT current_scn
    INTO v_ret
  FROM v$database;

  RETURN v_ret;

  EXCEPTION
    WHEN OTHERS THEN
   RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
END Get_Current_Scn;

-------------------------------------------------------------------------------------------------
PROCEDURE Exp_Tables_Parallel
  ( P_scn  IN NUMBER
  , P_dmp OUT VARCHAR2 )
 IS
   h1                  NUMBER(10);
   v_dop               NUMBER := g_parallel;
   v_curr_scn          NUMBER := P_scn;
   v_job_name_org      VARCHAR2(30)  := 'PX_'||TO_CHAR(sysdate,'YYYYMMDDHH24MISS');    -- PX: Parallel Execution
   v_job_name          VARCHAR2(30)  := v_job_name_org;
   v_dmp_file_name_org VARCHAR2(100) := lower(v_job_name||'.dmp');
   v_dmp_file_name     VARCHAR2(100) := v_dmp_file_name_org;
   v_log_file_name_org VARCHAR2(100) := lower(v_job_name||'.log');
   v_log_file_name     VARCHAR2(100) := v_log_file_name_org;

BEGIN

-- drop master table for "orphaned job" if it exists
   for i in ( select 'DROP TABLE '||owner_name||'.'||job_name||' PURGE' stat
              from dba_datapump_jobs
              where owner_name = USER
                and instr(v_job_name, upper(job_name) ) > 0
                and state = 'NOT RUNNING'
                and attached_sessions = 0 )
   loop
     execute immediate i.stat;
   end loop;

-- set out parameter
  P_dmp := v_dmp_file_name;

-- start jobs in parallel
  DBMS_OUTPUT.PUT_LINE('**** START SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );
  for counter in 0 .. v_dop-1
  loop
    v_job_name      := v_job_name_org||'_'||lpad(counter+1,3,0);
    v_dmp_file_name := v_dmp_file_name_org||'_'||lpad(counter+1,3,0);
    v_log_file_name := v_log_file_name_org||'_'||lpad(counter+1,3,0);

    h1 := dbms_datapump.open
      ( operation => 'EXPORT'
      , job_mode  => 'SCHEMA'
      , job_name  => v_job_name
      , version   => 'LATEST');
   DBMS_OUTPUT.PUT_LINE( 'Successfully opened job: '||v_job_name);

     dbms_datapump.set_parallel(handle  => h1, degree => 1);
     dbms_datapump.set_parameter(handle => h1, name  => 'KEEP_MASTER', value => 0);
     dbms_datapump.set_parameter(handle => h1, name  => 'ESTIMATE', value => 'BLOCKS');
     dbms_datapump.set_parameter(handle => h1, name  => 'INCLUDE_METADATA', value => 0);
     dbms_datapump.set_parameter(handle => h1, name  => 'METRICS', value => 1);
     dbms_datapump.set_parameter(handle => h1, name  => 'FLASHBACK_SCN', value => v_curr_scn);
   DBMS_OUTPUT.PUT_LINE('Successfully set job parameters for job '||v_job_name);

-- export just these schemas
     dbms_datapump.metadata_filter(handle => h1, name => 'SCHEMA_LIST', value => ' ''<SCHEMA01>'',''<SCHEMA02>'',''<SCHEMA03>'' ');
   DBMS_OUTPUT.PUT_LINE('Successfully set schemas for job '||v_job_name);
-- export tables only
     dbms_datapump.metadata_filter(handle => h1, name => 'INCLUDE_PATH_EXPR', value => q'[='TABLE']' );
   DBMS_OUTPUT.PUT_LINE('Successfully set table export for job '||v_job_name);
-- export just these tables
     dbms_datapump.metadata_filter(handle => h1, name => 'NAME_LIST', value => ' ''<TABLE01>'',''<TABLE02>'',''<TABLE03>'',''<TABLE04>'',''<TABLE05>'' ', object_path => 'TABLE');
   DBMS_OUTPUT.PUT_LINE('Successfully set tables for job '||v_job_name);
-- export just a 1/v_dop part of the data
     dbms_datapump.data_filter(handle => h1, name => 'SUBQUERY', value => 'WHERE MOD(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID), '||v_dop||')='||counter);
   DBMS_OUTPUT.PUT_LINE('Successfully set data filter for job '||v_job_name);

     dbms_datapump.add_file
       ( handle => h1
       , filename => v_dmp_file_name
       , directory => g_dmp_dir
       , filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_DUMP_FILE
       , reusefile => 1 );
   DBMS_OUTPUT.PUT_LINE('Successfully add dmp file: '||v_dmp_file_name);

     dbms_datapump.add_file
       ( handle => h1
       , filename => v_log_file_name
       , directory => g_dmp_dir
       , filetype => DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE);
   DBMS_OUTPUT.PUT_LINE('Successfully add log file: '||v_log_file_name );

     dbms_datapump.log_entry(handle => h1, message => 'Job '||(counter+1)||'/'||v_dop||' starting at '||to_char(sysdate, 'dd-mon-yyyy hh24:mi:ss')||' as of scn: '||v_curr_scn );
     dbms_datapump.start_job(handle => h1, skip_current => 0, abort_step => 0);
   DBMS_OUTPUT.PUT_LINE('Successfully started job '||(counter+1)||'/'||v_dop||' at '||to_char(sysdate,'dd-mon-yyyy hh24:mi:ss') ||' as of scn: '||v_curr_scn );

     dbms_datapump.detach(handle => h1);
   DBMS_OUTPUT.PUT_LINE('Successfully detached from job' );

  end loop;
  DBMS_OUTPUT.PUT_LINE('**** END SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );

EXCEPTION
  WHEN OTHERS THEN
    dbms_datapump.detach(handle => h1);
    DBMS_OUTPUT.PUT_LINE('Successfully detached from job' );
    DBMS_OUTPUT.PUT_LINE('Error: '||SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
    DBMS_OUTPUT.PUT_LINE('**** END SETTING DATAPUMP PARALLEL_TABLE_EXPORT JOBS ****' );
    RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
END Exp_Tables_Parallel;

-------------------------------------------------------------------------------------------------
PROCEDURE Exec_Export
   ( P_wait IN PLS_INTEGER := 0 )
  IS
  v_scn         NUMBER;
  v_dmp         VARCHAR2(200);
  export_done   PLS_INTEGER := 0;

BEGIN

-- get current scn
  v_scn := Get_Current_Scn;

-- start parallel export processes + detach
  Exp_Tables_Parallel( v_scn, v_dmp );

  if P_wait = 1 then
-- wait till all parallel export processes are finished
-- check every 5 minutes
    export_done := 0;
    loop
      for i in ( select 1
                 from ( select count(*) cnt
                        from user_tables
                        where instr(table_name,upper(replace(v_dmp,'.dmp'))) > 0 )
                 where cnt = 0 )
      loop
        export_done := 1;
      end loop;

      if export_done = 1 then
        exit;
      end if;
      Sleep(300000);
    end loop;
  end if;

EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('Error: '||SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
    RAISE_APPLICATION_ERROR( -20010, SQLERRM||' - '||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE );
END Exec_Export;

--------------------------------------------------------------------------------------------------------
END Datapump_Parallel_Exp_Pck;
/

SHOW ERRORS;
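After compilation, a usage sketch is simply the following (the serveroutput setting is only there to see the progress messages):

SET SERVEROUTPUT ON

BEGIN
  -- start 32 detached datapump jobs and wait (checking every 5 minutes) until all have finished
  Datapump_Parallel_Exp_Pck.Exec_Export( P_wait => 1 );
END;
/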

The post DIY Parallelization with Oracle DBMS_DATAPUMP appeared first on AMIS Oracle and Java Blog.

Dump Oracle data into a delimited ascii file with PL/SQL


This is how I dump data from an Oracle Database (tested on 8i, 9i, 10g, 11g and 12c) to a delimited ASCII file:

SQL*Plus: Release 12.1.0.2.0 Production on Fri Feb 24 13:55:47 2017
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production

SQL> set timing on
SQL> select Dump_Delimited('select * from all_objects', 'all_objects.csv') nr_rows from dual;

   NR_ROWS
----------
     97116

Elapsed: 00:00:11.87
SQL> ! cat /u01/etl/report/all_objects_readme.txt


  *********************************************************************
  Record Layout of file /u01/etl/report/all_objects.csv
  *********************************************************************


  Column                          Sequence  MaxLength  Datatype
  ------------------------------  --------  ---------  ----------

  OWNER                           1         128        VARCHAR2
  OBJECT_NAME                     2         128        VARCHAR2
  SUBOBJECT_NAME                  3         128        VARCHAR2
  OBJECT_ID                       4         24         NUMBER
  DATA_OBJECT_ID                  5         24         NUMBER
  OBJECT_TYPE                     6         23         VARCHAR2
  CREATED                         7         20         DATE
  LAST_DDL_TIME                   8         20         DATE
  TIMESTAMP                       9         19         VARCHAR2
  STATUS                          10        7          VARCHAR2
  TEMPORARY                       11        1          VARCHAR2
  GENERATED                       12        1          VARCHAR2
  SECONDARY                       13        1          VARCHAR2
  NAMESPACE                       14        24         NUMBER
  EDITION_NAME                    15        128        VARCHAR2
  SHARING                         16        13         VARCHAR2
  EDITIONABLE                     17        1          VARCHAR2
  ORACLE_MAINTAINED               18        1          VARCHAR2


  ----------------------------------
  Generated:     24-02-2017 13:56:50
  Generated by:  ETL
  Columns Count: 18
  Records Count: 97116
  Delimiter: ][
  Row Delimiter: ]
  ----------------------------------

SQL>

Besides the query and the generated filename, the Dump_Delimited function takes another 6 parameters, each one with a default value. Check out the PL/SQL below (and see the usage sketch after it); by the way, the basics for this code come from Tom Kyte.

SET DEFINE OFF;
CREATE OR REPLACE DIRECTORY ETL_UNLOAD_DIR AS '/u01/etl/report';
GRANT READ, WRITE ON DIRECTORY ETL_UNLOAD_DIR TO ETL;

CREATE OR REPLACE FUNCTION Dump_Delimited
   ( P_query                IN VARCHAR2
   , P_filename             IN VARCHAR2
   , P_column_delimiter     IN VARCHAR2    := ']['
   , P_row_delimiter        IN VARCHAR2    := ']'
   , P_comment              IN VARCHAR2    := NULL
   , P_write_rec_layout     IN PLS_INTEGER := 1
   , P_dir                  IN VARCHAR2    := 'ETL_UNLOAD_DIR'
   , P_nr_is_pos_integer    IN PLS_INTEGER := 0 )
RETURN PLS_INTEGER
 IS
    filehandle             UTL_FILE.FILE_TYPE;
    filehandle_rc          UTL_FILE.FILE_TYPE;

    v_user_name            VARCHAR2(100);
    v_file_name_full       VARCHAR2(200);
    v_dir                  VARCHAR2(200);
    v_total_length         PLS_INTEGER := 0;
    v_startpos             PLS_INTEGER := 0;
    v_datatype             VARCHAR2(30);
    v_delimiter            VARCHAR2(10):= P_column_delimiter;
    v_rowdelimiter         VARCHAR2(10):= P_row_delimiter;

    v_cursorid             PLS_INTEGER := DBMS_SQL.OPEN_CURSOR;
    v_columnvalue          VARCHAR2(4000);
    v_ignore               PLS_INTEGER;
    v_colcount             PLS_INTEGER := 0;
    v_newline              VARCHAR2(32767);
    v_desc_cols_table      DBMS_SQL.DESC_TAB;
    v_dateformat           NLS_SESSION_PARAMETERS.VALUE%TYPE;
    v_stat                 VARCHAR2(1000);
    counter                PLS_INTEGER := 0;
BEGIN

    SELECT directory_path
      INTO v_dir
    FROM DBA_DIRECTORIES
    WHERE directory_name = P_dir;
    v_file_name_full  := v_dir||'/'||P_filename;

    SELECT VALUE
      INTO v_dateformat
    FROM NLS_SESSION_PARAMETERS
    WHERE parameter = 'NLS_DATE_FORMAT';

    /* Use a date format that includes the time. */
    v_stat := 'alter session set nls_date_format=''dd-mm-yyyy hh24:mi:ss'' ';
    EXECUTE IMMEDIATE v_stat;

    filehandle := UTL_FILE.FOPEN( P_dir, P_filename, 'w', 32000 );

    /* Parse the input query so we can describe it. */
    DBMS_SQL.PARSE(  v_cursorid,  P_query, dbms_sql.native );

    /* Now, describe the outputs of the query. */
    DBMS_SQL.DESCRIBE_COLUMNS( v_cursorid, v_colcount, v_desc_cols_table );

    /* For each column, we need to define it, to tell the database
     * what we will fetch into. In this case, all data is going
     * to be fetched into a single varchar2(4000) variable.
     *
     * We will also adjust the max width of each column.
     */
IF P_write_rec_layout = 1 THEN

   filehandle_rc := UTL_FILE.FOPEN(P_dir, SUBSTR(P_filename,1, INSTR(P_filename,'.',-1)-1)||'_readme.txt', 'w');

--Start Header
    v_newline := CHR(10)||CHR(10)||'  *********************************************************************  ';
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Record Layout of file '||v_file_name_full;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
    v_newline := '  *********************************************************************  '||CHR(10)||CHR(10);
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
    v_newline := '  Column                          Sequence  MaxLength  Datatype  ';
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
    v_newline := '  ------------------------------  --------  ---------  ----------  '||CHR(10);
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
--End Header

--Start Body
    FOR i IN 1 .. v_colcount
    LOOP
       DBMS_SQL.DEFINE_COLUMN( v_cursorid, i, v_columnvalue, 4000 );
       SELECT DECODE( v_desc_cols_table(i).col_type,  2, DECODE(v_desc_cols_table(i).col_precision,0,v_desc_cols_table(i).col_max_len,v_desc_cols_table(i).col_precision)+DECODE(P_nr_is_pos_integer,1,0,2)
                                                   , 12, 20, v_desc_cols_table(i).col_max_len )
         INTO v_desc_cols_table(i).col_max_len
       FROM dual;

       SELECT DECODE( TO_CHAR(v_desc_cols_table(i).col_type), '1'  , 'VARCHAR2'
                                                            , '2'  , 'NUMBER'
                                                            , '8'  , 'LONG'
                                                            , '11' , 'ROWID'
                                                            , '12' , 'DATE'
                                                            , '96' , 'CHAR'
                                                            , '108', 'USER_DEFINED_TYPE', TO_CHAR(v_desc_cols_table(i).col_type) )
         INTO v_datatype
       FROM DUAL;

       v_newline := RPAD('  '||v_desc_cols_table(i).col_name,34)||RPAD(i,10)||RPAD(v_desc_cols_table(i).col_max_len,11)||RPAD(v_datatype,25);
    UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
    END LOOP;
--End Body

ELSE

    FOR i IN 1 .. v_colcount LOOP
       DBMS_SQL.DEFINE_COLUMN( v_cursorid, i, v_columnvalue, 4000 );
       SELECT DECODE( v_desc_cols_table(i).col_type,  2, DECODE(v_desc_cols_table(i).col_precision,0,v_desc_cols_table(i).col_max_len,v_desc_cols_table(i).col_precision)+DECODE(P_nr_is_pos_integer,1,0,2)
                                                   , 12, 20, v_desc_cols_table(i).col_max_len )
         INTO v_desc_cols_table(i).col_max_len
       FROM dual;
     END LOOP;

END IF;

    v_ignore := DBMS_SQL.EXECUTE(v_cursorid);

     WHILE ( DBMS_SQL.FETCH_ROWS(v_cursorid) > 0 )
     LOOP
        /* Build up a big output line. This is more efficient than
         * calling UTL_FILE.PUT inside the loop.
         */
        v_newline := NULL;
        FOR i IN 1 .. v_colcount LOOP
            DBMS_SQL.COLUMN_VALUE( v_cursorid, i, v_columnvalue );
            if i = 1 then
              v_newline := v_newline||v_columnvalue;
            else
              v_newline := v_newline||v_delimiter||v_columnvalue;
            end if;
        END LOOP;

        /* Now print out that line and increment a counter. */
        UTL_FILE.PUT_LINE( filehandle, v_newline||v_rowdelimiter );
        counter := counter+1;
    END LOOP;

IF P_write_rec_layout = 1 THEN

--Start Footer
    v_newline := CHR(10)||CHR(10)||'  ----------------------------------  ';
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Generated:     '||SYSDATE;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Generated by:  '||USER;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Columns Count: '||v_colcount;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Records Count: '||counter;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Delimiter: '||v_delimiter;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
       v_newline := '  Row Delimiter: '||v_rowdelimiter;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
    v_newline := '  ----------------------------------  '||CHR(10)||CHR(10);
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
--End Footer

--Start Comment
    v_newline := '  '||P_comment;
      UTL_FILE.PUT_LINE(filehandle_rc, v_newline);
--End Comment

UTL_FILE.FCLOSE(filehandle_rc);

END IF;

    /* Free up resources. */
    DBMS_SQL.CLOSE_CURSOR(v_cursorid);
    UTL_FILE.FCLOSE( filehandle );

    /* Reset the date format ... and return. */
    v_stat := 'alter session set nls_date_format=''' || v_dateformat || ''' ';
    EXECUTE IMMEDIATE v_stat;

    RETURN counter;
EXCEPTION
    WHEN OTHERS THEN
        DBMS_SQL.CLOSE_CURSOR( v_cursorid );
        EXECUTE IMMEDIATE 'alter session set nls_date_format=''' || v_dateformat || ''' ';
        RETURN counter;

END Dump_Delimited;
/

SHOW ERRORS;
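As an extra usage sketch with non-default parameters (the query and file name are made up), a pipe-delimited dump without the readme file could look like this:

SET SERVEROUTPUT ON

DECLARE
  v_rows PLS_INTEGER;
BEGIN
  v_rows := Dump_Delimited( 'select owner, object_name from all_objects'
                          , 'all_objects_pipe.csv'
                          , '|'     -- column delimiter
                          , NULL    -- no extra row delimiter
                          , NULL    -- no comment
                          , 0 );    -- do not write the record layout (readme) file
  DBMS_OUTPUT.PUT_LINE( 'Rows written: '||v_rows );
END;
/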

The post Dump Oracle data into a delimited ascii file with PL/SQL appeared first on AMIS Oracle and Java Blog.

Oracle Service Bus : disable / enable a proxy service via WebLogic Server MBeans with JMX


At a public sector organization in the Netherlands an OSB proxy service was (via JMS) reading messages from a WebLogic queue. These messages were then sent to a back-end system. Every evening during a certain time period the back-end system was down. Because of this, and also in case of planned maintenance, there was a requirement to be able to stop and start the sending of messages from the queue to the back-end system. Hence, a script was needed to disable/enable the OSB proxy service (deployed on OSB 11.1.1.7).

This article will explain how the OSB proxy service can be disabled/enabled via WebLogic Server MBeans with JMX.

A managed bean (MBean) is a Java object that represents a Java Management Extensions (JMX) manageable resource in a distributed environment, such as an application, a service, a component, or a device.

First a high-level overview of the MBeans is given. For further information see "Fusion Middleware Developing Custom Management Utilities With JMX for Oracle WebLogic Server", via url: https://docs.oracle.com/cd/E28280_01/web.1111/e13728/toc.htm

Next the structure and use of the System MBean Browser in the Oracle Enterprise Manager Fusion Middleware Control is discussed.

Finally the code to disable/enable the OSB proxy service is shown.

To disable/enable an OSB proxy service, the WebLogic Scripting Tool (WLST) can also be used, but in this case (also because of my Java developer skills) JMX was used. For more information have a look for example at the AMIS TECHNOLOGY BLOG post "Oracle Service Bus: enable / disable proxy service with WLST", via url: https://technology.amis.nl/2011/01/10/oracle-service-bus-enable-disable-proxy-service-with-wlst/

The Java Management Extensions (JMX) technology is a standard part of the Java Platform, Standard Edition (Java SE platform). The JMX technology was added to the platform in the Java 2 Platform, Standard Edition (J2SE) 5.0 release.

The JMX technology provides a simple, standard way of managing resources such as applications, devices, and services. Because the JMX technology is dynamic, you can use it to monitor and manage resources as they are created, installed and implemented. You can also use the JMX technology to monitor and manage the Java Virtual Machine (Java VM).

For another example of using MBeans with JMX, I kindly point you to another article (written by me) on the AMIS TECHNOLOGY BLOG: “Doing performance measurements of an OSB Proxy Service by programmatically extracting performance metrics via the ServiceDomainMBean and presenting them as an image via a PowerPoint VBA module”, via url: https://technology.amis.nl/2016/01/30/performance-measurements-of-an-osb-proxy-service-by-using-the-servicedomainmbean/

Basic Organization of a WebLogic Server Domain

As you probably already know a WebLogic Server administration domain is a collection of one or more servers and the applications and resources that are configured to run on the servers. Each domain must include a special server instance that is designated as the Administration Server. The simplest domain contains a single server instance that acts as both Administration Server and host for applications and resources. This domain configuration is commonly used in development environments. Domains for production environments usually contain multiple server instances (Managed Servers) running independently or in groups called clusters. In such environments, the Administration Server does not host production applications.

Separate MBean Types for Monitoring and Configuring

All WebLogic Server MBeans can be organized into one of the following general types based on whether the MBean monitors or configures servers and resources:

  • Runtime MBeans contain information about the run-time state of a server and its resources. They generally contain only data about the current state of a server or resource, and they do not persist this data. When you shut down a server instance, all run-time statistics and metrics from the run-time MBeans are destroyed.
  • Configuration MBeans contain information about the configuration of servers and resources. They represent the information that is stored in the domain’s XML configuration documents.
  • Configuration MBeans for system modules contain information about the configuration of services such as JDBC data sources and JMS topics that have been targeted at the system level. Instead of targeting these services at the system level, you can include services as modules within an application. These application-level resources share the life cycle and scope of the parent application. However, WebLogic Server does not provide MBeans for application modules.

MBean Servers

At the core of any JMX agent is the MBean server, which acts as a container for MBeans.

The JVM for an Administration Server maintains three MBean servers provided by Oracle and optionally maintains the platform MBean server, which is provided by the JDK itself. The JVM for a Managed Server maintains only one Oracle MBean server and the optional platform MBean server.

Each MBean server creates, registers, and provides access to a specific set of MBeans:

  • Domain Runtime MBean Server: MBeans for domain-wide services. This MBean server also acts as a single point of access for MBeans that reside on Managed Servers. Only the Administration Server hosts an instance of this MBean server.
  • Runtime MBean Server: MBeans that expose monitoring, run-time control, and the active configuration of a specific WebLogic Server instance. In release 11.1.1.7, the WebLogic Server Runtime MBean Server is configured by default to be the platform MBean server. Each server in the domain hosts an instance of this MBean server.
  • Edit MBean Server: pending configuration MBeans and operations that control the configuration of a WebLogic Server domain. It exposes a ConfigurationManagerMBean for locking, saving, and activating changes. Only the Administration Server hosts an instance of this MBean server.
  • The JVM's platform MBean server: MBeans provided by the JDK that contain monitoring information for the JVM itself. You can register custom MBeans in this MBean server. In release 11.1.1.7, WebLogic Server uses the JVM's platform MBean server to contain the WebLogic run-time MBeans by default.

Service MBeans

Within each MBean server, WebLogic Server registers a service MBean under a simple object name. The attributes and operations in this MBean serve as your entry point into the WebLogic Server MBean hierarchies and enable JMX clients to navigate to all WebLogic Server MBeans in an MBean server after supplying only a single object name.

  • The Domain Runtime MBean Server registers the DomainRuntimeServiceMBean. It provides access to MBeans for domain-wide services such as application deployment, JMS servers, and JDBC data sources. It also is a single point for accessing the hierarchies of all run-time MBeans and all active configuration MBeans for all servers in the domain.
    JMX object name: com.bea:Name=DomainRuntimeService,Type=weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean
  • Each Runtime MBean Server registers the RuntimeServiceMBean. It provides access to run-time MBeans and active configuration MBeans for the current server.
    JMX object name: com.bea:Name=RuntimeService,Type=weblogic.management.mbeanservers.runtime.RuntimeServiceMBean
  • The Edit MBean Server registers the EditServiceMBean. It provides the entry point for managing the configuration of the current WebLogic Server domain.
    JMX object name: com.bea:Name=EditService,Type=weblogic.management.mbeanservers.edit.EditServiceMBean

Choosing an MBean Server

If your client monitors run-time MBeans for multiple servers, or if your client runs in a separate JVM, Oracle recommends that you connect to the Domain Runtime MBean Server on the Administration Server instead of connecting separately to each Runtime MBean Server on each server instance in the domain.

The trade off for directing all JMX requests through the Domain Runtime MBean Server is a slight degradation in performance due to network latency and increased memory usage. However, for most network topologies and performance requirements, the simplified code maintenance and enhanced security that the Domain Runtime MBean Server enables is preferable.

System MBean Browser

Oracle Enterprise Manager Fusion Middleware Control provides the System MBean Browser for managing MBeans that perform specific monitoring and configuration tasks.

Via the Oracle Enterprise Manager Fusion Middleware Control for a certain domain, the System MBean Browser can be opened.

Here the previously mentioned types of MBeans can be seen: Runtime MBeans and Configuration MBeans.

When navigating to "Configuration MBeans | com.bea", the previously mentioned EditServiceMBean can be found.

When navigating to "Runtime MBeans | com.bea | Domain: <a domain>", the previously mentioned DomainRuntimeServiceMBean can be found.

The MBeans mentioned later on in this article can also be found there.

For example, for the ProxyServiceConfigurationMBean the available operations can be found.

When navigating to "Runtime MBeans | com.bea", within each Server the previously mentioned RuntimeServiceMBean can be found.

 

Code to disable/enable the OSB proxy service

The requirement to be able to stop and start sending messages to the back-end system from the queue was implemented by disabling/enabling the state of the OSB Proxy service JMSConsumerStuFZKNMessageService_PS.

Short before the back-end system goes down, dequeuing of the queue should be disabled.
Right after the back-end system goes up again, dequeuing of the queue should be enabled.

The state of the OSB Proxy service can be seen in the Oracle Service Bus Administration 11g Console (for example via the Project Explorer) in the tab “Operational Settings” of the proxy service.

For ease of use, two MS-DOS batch files were created, each using MBeans, to change the state of a service (proxy service or business service). As stated before, the WebLogic Server contains a set of MBeans that can be used to configure, monitor and manage WebLogic Server resources.

  • Disable_JMSConsumerStuFZKNMessageService_PS.bat

On the server where the back-end system resides, the MS-DOS batch file "Disable_JMSConsumerStuFZKNMessageService_PS.bat" is called.

The content of the batch file is:

java.exe -classpath "OSBServiceState.jar;com.bea.common.configfwk_1.7.0.0.jar;sb-kernel-api.jar;sb-kernel-impl.jar;wlfullclient.jar" nl.xyz.osbservice.osbservicestate.OSBServiceState "xyz" "7001" "weblogic" "xyz" "ProxyService" "JMSConsumerStuFZKNMessageService-1.0/proxy/JMSConsumerStuFZKNMessageService_PS" "Disable"

  • Enable_JMSConsumerStuFZKNMessageService_PS.bat

On the server where the back-end system resides, the MS-DOS batch file "Enable_JMSConsumerStuFZKNMessageService_PS.bat" is called.

The content of the batch file is:

java.exe -classpath "OSBServiceState.jar;com.bea.common.configfwk_1.7.0.0.jar;sb-kernel-api.jar;sb-kernel-impl.jar;wlfullclient.jar" nl.xyz.osbservice.osbservicestate.OSBServiceState "xyz" "7001" "weblogic" "xyz" "ProxyService" "JMSConsumerStuFZKNMessageService-1.0/proxy/JMSConsumerStuFZKNMessageService_PS" "Enable"

In both MS-DOS batch files a class named OSBServiceState is called via java.exe. The main method of this class expects the following parameters:

Parameter name Description
HOSTNAME Host name of the AdminServer
PORT Port of the AdminServer
USERNAME Username
PASSWORD Password
SERVICETYPE Type of resource. Possible values are:
  • ProxyService
  • BusinessService
SERVICEURI Identifier of the resource. The name begins with the project name, followed by folder names and ending with the resource name.
ACTION The action to be carried out. Possible values are:
  • Enable
  • Disable

Every change is carried out in its own session (via the SessionManagementMBean), which is automatically activated with description: OSBServiceState_script_<systemdatetime>

This can be seen via the Change Center | View Changes of the Oracle Service Bus Administration 11g Console:

The response from “Disable_JMSConsumerStuFZKNMessageService_PS.bat” is:

Disabling service JMSConsumerStuFZKNMessageService-1.0/proxy/JMSConsumerStuFZKNMessageService_PS has been successfully completed

In the Oracle Service Bus Administration 11g Console this change can be found as a Task:

The result of changing the state of the OSB Proxy service can be checked in the Oracle Service Bus Administration 11g Console.

The same applies when using “Enable_JMSConsumerStuFZKNMessageService_PS.bat”.

In the sample code below the use of the following MBeans can be seen:

DomainRuntimeServiceMBean: provides a common access point for navigating to all runtime and configuration MBeans in the domain as well as to MBeans that provide domain-wide services (such as controlling and monitoring the life cycles of servers and message-driven EJBs and coordinating the migration of migratable services). [https://docs.oracle.com/middleware/1213/wls/WLAPI/weblogic/management/mbeanservers/domainruntime/DomainRuntimeServiceMBean.html]

To use these MBeans from a stand-alone client, the wlfullclient.jar library is needed; it is not provided by default in a WebLogic installation and must be built. The simple way of how to do this is described in "Fusion Middleware Programming Stand-alone Clients for Oracle WebLogic Server, Using the WebLogic JarBuilder Tool", which can be reached via url: https://docs.oracle.com/cd/E28280_01/web.1111/e13717/jarbuilder.htm#SACLT240.
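In short, and as a sketch based on that documentation (with the Middleware home directory as used elsewhere in this article), building wlfullclient.jar comes down to running the JarBuilder tool from the server/lib directory:

cd <Middleware Home Directory>/wlserver_10.3/server/lib
java -jar wljarbuilder.jar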

SessionManagementMBean: provides an API to create, activate or discard sessions. [http://docs.oracle.com/cd/E13171_01/alsb/docs26/javadoc/com/bea/wli/sb/management/configuration/SessionManagementMBean.html]

ProxyServiceConfigurationMBean: provides an API to enable/disable services and enable/disable monitoring for a proxy service. [https://docs.oracle.com/cd/E13171_01/alsb/docs26/javadoc/com/bea/wli/sb/management/configuration/ProxyServiceConfigurationMBean.html]

BusinessServiceConfigurationMBean: provides an API for managing business services. [https://docs.oracle.com/cd/E13171_01/alsb/docs25/javadoc/com/bea/wli/sb/management/configuration/BusinessServiceConfigurationMBean.html]

Once the connection to the DomainRuntimeServiceMBean is made, other MBeans can be found via the findService method.

Service findService(String name,
                    String type,
                    String location)

This method returns the Service on the specified Server or in the primary MBeanServer if the location is not specified.

In the code example below certain Java fields are used. For readability, the field values are shown in the following table:

Field Field value
DomainRuntimeServiceMBean.MBEANSERVER_JNDI_NAME weblogic.management.mbeanservers.domainruntime
DomainRuntimeServiceMBean.OBJECT_NAME com.bea:Name=DomainRuntimeService,Type=weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean
SessionManagementMBean.NAME SessionManagement
SessionManagementMBean.TYPE com.bea.wli.sb.management.configuration.SessionManagementMBean
ProxyServiceConfigurationMBean.NAME ProxyServiceConfiguration
ProxyServiceConfigurationMBean.TYPE com.bea.wli.sb.management.configuration.ProxyServiceConfigurationMBean
BusinessServiceConfigurationMBean.NAME BusinessServiceConfiguration
BusinessServiceConfigurationMBean.TYPE com.bea.wli.sb.management.configuration.BusinessServiceConfigurationMBean

Because of the use of com.bea.wli.config.Ref.class, the following library <Middleware Home Directory>/Oracle_OSB1/modules/com.bea.common.configfwk_1.7.0.0.jar was needed.

Because of the use of weblogic.management.jmx.MBeanServerInvocationHandler.class, the following library <Middleware Home Directory>/wlserver_10.3/server/lib/wlfullclient.jar was needed.

When running the code the following error was thrown:

java.lang.RuntimeException: java.lang.ClassNotFoundException: com.bea.wli.sb.management.configuration.DelegatedSessionManagementMBean
	at weblogic.management.jmx.MBeanServerInvocationHandler.newProxyInstance(MBeanServerInvocationHandler.java:621)
	at weblogic.management.jmx.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:418)
	at $Proxy0.findService(Unknown Source)
	at nl.xyz.osbservice.osbservicestate.OSBServiceState.<init>(OSBServiceState.java:66)
	at nl.xyz.osbservice.osbservicestate.OSBServiceState.main(OSBServiceState.java:217)
Caused by: java.lang.ClassNotFoundException: com.bea.wli.sb.management.configuration.DelegatedSessionManagementMBean
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at weblogic.management.jmx.MBeanServerInvocationHandler.newProxyInstance(MBeanServerInvocationHandler.java:619)
	... 4 more
Process exited.

So because of the use of com.bea.wli.sb.management.configuration.DelegatedSessionManagementMBean.class the following library <Middleware Home Directory>/Oracle_OSB1/lib/sb-kernel-impl.jar was also needed.

The java code:

package nl.xyz.osbservice.osbservicestate;


import com.bea.wli.config.Ref;
import com.bea.wli.sb.management.configuration.BusinessServiceConfigurationMBean;
import com.bea.wli.sb.management.configuration.ProxyServiceConfigurationMBean;
import com.bea.wli.sb.management.configuration.SessionManagementMBean;

import java.io.IOException;

import java.net.MalformedURLException;

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Properties;

import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

import javax.naming.Context;

import weblogic.management.jmx.MBeanServerInvocationHandler;
import weblogic.management.mbeanservers.domainruntime.DomainRuntimeServiceMBean;


public class OSBServiceState {
    private static MBeanServerConnection connection;
    private static JMXConnector connector;

    public OSBServiceState(HashMap props) {
        super();
        SessionManagementMBean sessionManagementMBean = null;
        String sessionName =
            "OSBServiceState_script_" + System.currentTimeMillis();
        String servicetype;
        String serviceURI;
        String action;
        String description = "";


        try {

            Properties properties = new Properties();
            properties.putAll(props);

            initConnection(properties.getProperty("HOSTNAME"),
                           properties.getProperty("PORT"),
                           properties.getProperty("USERNAME"),
                           properties.getProperty("PASSWORD"));

            servicetype = properties.getProperty("SERVICETYPE");
            serviceURI = properties.getProperty("SERVICEURI");
            action = properties.getProperty("ACTION");

            DomainRuntimeServiceMBean domainRuntimeServiceMBean =
                (DomainRuntimeServiceMBean)findDomainRuntimeServiceMBean(connection);

            // Create a session via SessionManagementMBean.
            sessionManagementMBean =
                    (SessionManagementMBean)domainRuntimeServiceMBean.findService(SessionManagementMBean.NAME,
                                                                                  SessionManagementMBean.TYPE,
                                                                                  null);
            sessionManagementMBean.createSession(sessionName);

            if (servicetype.equalsIgnoreCase("ProxyService")) {

                // A Ref uniquely represents a resource, project or folder that is managed by the Configuration Framework.
                // A Ref object has two components: A typeId that indicates whether it is a project, folder, or a resource, and an array of names of non-zero length.
                // For a resource the array of names start with the project name, followed by folder names, and end with the resource name.
                // For a project, the Ref object simply contains one name component, that is, the project name.
                // A Ref object for a folder contains the project name followed by the names of the folders which it is nested under.
                Ref ref = constructRef("ProxyService", serviceURI);

                ProxyServiceConfigurationMBean proxyServiceConfigurationMBean =
                    (ProxyServiceConfigurationMBean)domainRuntimeServiceMBean.findService(ProxyServiceConfigurationMBean.NAME +
                                                                                          "." +
                                                                                          sessionName,
                                                                                          ProxyServiceConfigurationMBean.TYPE,
                                                                                          null);
                if (action.equalsIgnoreCase("Enable")) {
                    proxyServiceConfigurationMBean.enableService(ref);
                    description = "Enabled the service: " + serviceURI;
                    System.out.print("Enabling service " + serviceURI);
                } else if (action.equalsIgnoreCase("Disable")) {
                    proxyServiceConfigurationMBean.disableService(ref);
                    description = "Disabled the service: " + serviceURI;
                    System.out.print("Disabling service " + serviceURI);
                } else {
                    System.out.println("Unsupported value for ACTION");
                }
            } else if (servicetype.equals("BusinessService")) {
                Ref ref = constructRef("BusinessService", serviceURI);

                BusinessServiceConfigurationMBean businessServiceConfigurationMBean =
                    (BusinessServiceConfigurationMBean)domainRuntimeServiceMBean.findService(BusinessServiceConfigurationMBean.NAME +
                                                                                             "." +
                                                                                             sessionName,
                                                                                             BusinessServiceConfigurationMBean.TYPE,
                                                                                             null);
                if (action.equalsIgnoreCase("Enable")) {
                    businessServiceConfigurationMBean.enableService(ref);
                    description = "Enabled the service: " + serviceURI;
                    System.out.print("Enabling service " + serviceURI);
                } else if (action.equalsIgnoreCase("Disable")) {
                    businessServiceConfigurationMBean.disableService(ref);
                    description = "Disabled the service: " + serviceURI;
                    System.out.print("Disabling service " + serviceURI);
                } else {
                    System.out.println("Unsupported value for ACTION");
                }
            }
            sessionManagementMBean.activateSession(sessionName, description);
            System.out.println(" has been succesfully completed");
        } catch (Exception ex) {
            if (sessionManagementMBean != null) {
                try {
                   sessionManagementMBean.discardSession(sessionName);
                    System.out.println(" resulted in an error.");
                } catch (Exception e) {
                    System.out.println("Unable to discard session: " +
                                       sessionName);
                }
            }

            ex.printStackTrace();
        } finally {
            if (connector != null)
                try {
                    connector.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
        }
    }


    /*
       * Initialize connection to the Domain Runtime MBean Server.
       */

    public static void initConnection(String hostname, String portString,
                                      String username,
                                      String password) throws IOException,
                                                              MalformedURLException {

        String protocol = "t3";
        Integer portInteger = Integer.valueOf(portString);
        int port = portInteger.intValue();
        String jndiroot = "/jndi/";
        String mbeanserver = DomainRuntimeServiceMBean.MBEANSERVER_JNDI_NAME;

        JMXServiceURL serviceURL =
            new JMXServiceURL(protocol, hostname, port, jndiroot +
                              mbeanserver);

        Hashtable hashtable = new Hashtable();
        hashtable.put(Context.SECURITY_PRINCIPAL, username);
        hashtable.put(Context.SECURITY_CREDENTIALS, password);
        hashtable.put(JMXConnectorFactory.PROTOCOL_PROVIDER_PACKAGES,
                      "weblogic.management.remote");
        hashtable.put("jmx.remote.x.request.waiting.timeout", new Long(10000));

        connector = JMXConnectorFactory.connect(serviceURL, hashtable);
        connection = connector.getMBeanServerConnection();
    }


    private static Ref constructRef(String refType, String serviceURI) {
        Ref ref = null;
        String[] uriData = serviceURI.split("/");
        ref = new Ref(refType, uriData);
        return ref;
    }


    /**
     * Finds the specified MBean object
     *
     * @param connection - A connection to the MBeanServer.
     * @return Object - The MBean or null if the MBean was not found.
     */
    public Object findDomainRuntimeServiceMBean(MBeanServerConnection connection) {
        try {
            ObjectName objectName =
                new ObjectName(DomainRuntimeServiceMBean.OBJECT_NAME);
            return (DomainRuntimeServiceMBean)MBeanServerInvocationHandler.newProxyInstance(connection,
                                                                                            objectName);
        } catch (MalformedObjectNameException e) {
            e.printStackTrace();
            return null;
        }
    }


    public static void main(String[] args) {
        try {
            if (args.length < 7) {
                System.out.println("Provide values for the following parameters: HOSTNAME, PORT, USERNAME, PASSWORD, SERVICETYPE, SERVICEURI, ACTION.");

            } else {
                HashMap<String, String> map = new HashMap<String, String>();

                map.put("HOSTNAME", args[0]);
                map.put("PORT", args[1]);
                map.put("USERNAME", args[2]);
                map.put("PASSWORD", args[3]);
                map.put("SERVICETYPE", args[4]);
                map.put("SERVICEURI", args[5]);
                map.put("ACTION", args[6]);
                OSBServiceState osbServiceState = new OSBServiceState(map);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

    }
}

The post Oracle Service Bus : disable / enable a proxy service via WebLogic Server MBeans with JMX appeared first on AMIS Oracle and Java Blog.

Development and Runtime Experiences with a Canonical Data Model Part III: Dependency Management & Interface Tailoring


Introduction

This blogpost is part III, the last part of a trilogy on how to create and use a Canonical Data Model (CDM). The first blogpost contains part I in which I share my experiences in developing a CDM and provide you with lots of standards and guidelines for creating a CDM. The second part is all about XML Namespace Standards. This part is about usage of a CDM in the integration layer, thus how to use it in a run time environment and what are the consequences for the development of the services which are part of the integration layer.

Dependency Management & Interface Tailoring

When you’ve decided to use a CDM, it’s quite tempting to use the XSD files, that make up the CDM, in a central place in the run time environment where all the services can reference to. In this way there is only one model, one ‘truth’ for all the services. However there are a few problems you run into quite fast when using such a central run time CDM.

Dependency Management

Backwards compatibility
The first challenge is to maintain backwards compatibility. This means that when there is a change in the CDM, this change is implemented in such a way that the CDM supports both the ‘old’ data format, according to the CDM before the change, as well as the new data format with the change. When you’re in the development stage of the project, the CDM will change quite frequently, in large projects even on a daily basis. When these changes are backwards compatible, the services which already have been developed and are considered as finished, do not need to change (unless of course the change also involves a functional change of a finished service). Otherwise, when these changes are not backwards compatible, all software components, so all services, which have been finished have to be investigated whether they are hit by the change. Since all services use the same set of central XSD definitions, many will be hit by a change in these definitions.
If you’re lucky you have nice unit tests or other code analysis tools you can use to detect this. You may ask yourself whether these tests and/or tools will achieve 100% coverage of the affected services. When services are hit, they have to be modified, tested and released again. To reduce maintenance and rework on all finished services, there will be pressure to maintain backwards compatibility as much as possible.
Maintaining backwards compatibility in practice means

  • that all elements that are added to the CDM have to be optional;
  • that you can increase the maximum occurrence of an element, but never reduce it;
  • that you can make mandatory elements optional, but not vice versa;
  • and that structure changes are much more difficult.

For example, when a data element has to be split up into multiple elements: let’s take a product id element of type string and split it up into a container element that is able to contain multiple product identifications for the same product. The identification container element will have child elements for product id, product id type and an optional owner id for the ‘owner’ of the identification (e.g. a customer may have his own product identification). One way of applying this change and still maintaining backwards compatibility is by using an XML choice construction:

<complexType name="tProduct">
  <sequence>
    <choice minOccurs="0" maxOccurs="1">
      <element name="Id" type="string" />
      <element name="Identifications">
        <complexType>
          <sequence>
            <element name="Identification" minOccurs="0" maxOccurs="unbounded">
              <complexType>
                <sequence>
                  <element name="Id" type="string" />
                  <element name="IdType" type="string" />
                  <element name="IdOwner" type="string" minOccurs="0"/>
                </sequence>
              </complexType>
            </element>
          </sequence>
        </complexType>
      </element>
    </choice>
    <element name="Name" type="string" />
    ...
  </sequence>
</complexType>

There are other ways to implement this change and remain backwards compatible, but they will all result in a redundant and verbose data model. As you can imagine, this soon results in a very ugly CDM, which is hard to read and understand.

Hidden functional bugs
There is another danger. When keeping backwards compatibility in this way, the services which were finished don’t break technically and still run. But they might break functionally! Such a break is even more dangerous because it may not be visible immediately; it can take quite a long time before this hidden functional bug is discovered. Perhaps the service already runs in a production environment and executes with unnoticed functional bugs!
Take the example above and consider that a service has already been developed which does something with orders. Besides order handling, it also sends the product id’s in an order to a CRM system, but only for the product id’s in the range 1000-2000. The check in the service on the product id being in the range 1000-2000 will be based upon the original product id field. But what happens if the CDM is changed as described in the previous paragraph, so that the original product id field becomes part of a choice and thus optional? This unchanged service might now handle orders that contain products using the newer data definition, in which the new “Identification” element is used instead of the old “Id” element. If you’re lucky, the check on the range fails with a run time exception! Lucky, because you’re immediately notified of this functional flaw. It probably will be detected quite early in a test environment when it concerns common functionality. But what if it is rare functionality? Then the danger is that it might not be detected and you end up with a run time exception in a production environment. That is not what you want, but at least it is detected!
The real problem is that there is a realistic chance that the check doesn’t throw an exception and doesn’t log an error or warning. It might conclude that the product id is not in the range 1000-2000, because the product id field is not there, while the product identification is in that range! The order just uses the new way of modeling the product identification with the new “Identification” element. This results in a service that has a functional bug while it seems to run correctly!
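
To illustrate, below is a minimal sketch of what such a range check could look like in an XSLT transformation. The element names follow the product example above; the stylesheet itself is made up and only serves to show how the bug stays silent.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- suppress the default text output so only matched products end up in the result -->
  <xsl:template match="text()"/>
  <xsl:template match="Product">
    <!-- the range check is still based on the original Id element; when the newer
         Identifications structure is used, Product has no Id child, number(Id) is NaN,
         the test silently evaluates to false and the product is never sent to the CRM
         system. No exception, no warning. -->
    <xsl:if test="number(Id) &gt;= 1000 and number(Id) &lt;= 2000">
      <ProductForCrm><xsl:value-of select="Id"/></ProductForCrm>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>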

Backward compatibility in time
Sometimes you have no choice and you have to make changes which are not backward compatible. This can cause another problem: you’re not backwards compatible in time. You might be developing newer versions of services. But what if in production there is a problem with one of these new services using the new CDM and you want to go back to a previous version of that service? You have to go back to the old version of the CDM as well, because the old version is not compatible with the new CDM. But that also means that none of the newer services can run, because they depend on the new CDM. So you have to revert to the old versions for all of the new services using the new CDM!

The base cause of these problems is that all software components (services) are dependent on the central run time CDM!
So this central run time CDM introduces dependencies between all (versions of) components. This heavily conflicts with one of the base principles of SOA: loosely coupled, independent services.

 

Interface Tailoring

There is another problem with a central CDM which has more to do with programming concepts, but also impacts the usage of services resulting in a slower development cycle. The interface of a service which is described in its contract (WSDL) should reflect the functionality of a service. However, if you’re using a central CDM, the CDM is used by all the services. This means that the entities in the CDM contain all the data elements which are needed in the contracts of all the services. So basically a CDM entity consists of a ‘merge’ of all these data elements. The result is that the entities will be quite large, detailed and extensive. The services use these CDM entities in their contracts, while functionally only a (small) part of the elements are used in a single service.

This makes the interface of a service very unclear, ambiguous and meaningless.

Another side effect is that it makes no sense to validate (request) messages, because all elements will be optional.

Take for example a simple service that returns the street and city based upon the postal code and house number (this is a very common functionality in The Netherlands). The interface would be nice, clear and almost self-describing when the service contract dictates that the input (request) is only a postal code and house number and the output (response) only contains the street name and the city. But with a central CDM, the input will be an entity of type address, as well as the output. With some bad luck, the address entity also contains all kinds of elements for foreign addresses, post office boxes, etc. I’ve seen exactly this example in a real project with an address entity containing more than 30 child elements! While the service only needed four of them: two elements, postal code and house number, as input and also two elements, street and city, as the output. You might consider solving this by defining these separate elements as input and output and not using the entity element. But that’s not the idea of a central CDM! Take notice that this is just a little example. I’ve seen this problem in a project with lawsuit entities. You can imagine how large such an entity can become, with hundreds of elements. Services individually only used some of the elements of the lawsuit entity, but these elements were scattered across the entire entity. So it does not help either to split up the type definition of a lawsuit entity into several sub types. In that project almost all the services needed one or more lawsuit entities, resulting in interface contracts (WSDL) which all were very generic and didn’t make sense. You needed the (up to date) documentation of the service in order to know which elements you had to use in the input and which elements were returned as output, because the definitions of the request and response messages were not useful as they contained complete entities.

Solution

The solution to both of the problems described above, is not to use a central run time CDM, but only a design time CDM.
This design time CDM has no namespace (or a dummy one). When a service is developed, a hard copy is made of (a part of) the CDM at that moment to a (source) location specific for that service. Then a service specific namespace has to be applied to this local copy of the (service specific) CDM.
And now you can shape this local copy of the CDM to your needs! Tailor it by removing elements that the service contract (WSDL) doesn’t need. You can also apply more restrictions to the remaining elements by making optional elements mandatory, reduce the maximum occurrences of an element and even create data value restrictions for an element (e.g. set a maximum string length). By doing this, you can tailor the interface in such a way that it reflects the functionality of the service!
You can even have two different versions of an entity in this copy of the CDM. For example one to use in the input message and one in the output message.
Let’s demonstrate this with the example from above: an address with only postal code and house number for the input message and an address with street and city for the output message. The design time CDM contains the full address entity, while the local, tailored copy of the service CDM contains two tailored address entities. This copy can then be used by the service XSD which contains the message definitions of the request and response payloads:

CDM XSD and Service XSD


The source code of these schemas:
<schema targetNamespace="DUMMY_NAMESPACE"
            xmlns="http://www.w3.org/2001/XMLSchema"
            version="1.0">

   <complexType name="TAddress">
      <sequence>
         <element name="Department" type="string" minOccurs="0"/>
         <element name="Street" type="string" minOccurs="0"/>
         <element name="Number" type="string" minOccurs="0"/>
         <element name="PostalCode" type="string" minOccurs="0"/>
         <element name="City" type="string" minOccurs="0"/>
         <element name="County" type="string" minOccurs="0"/>
         <element name="State" type="string" minOccurs="0"/>
         <element name="Country" type="string" minOccurs="0"/>
      </sequence>
   </complexType>

</schema>
<schema targetNamespace="http://nl.amis.AddressServiceCDM"
            xmlns="http://www.w3.org/2001/XMLSchema"
            version="1.0">

   <complexType name="TAddressInput">
      <sequence>
         <element name="Number" type="string" minOccurs="0"/>
         <element name="PostalCode" type="string" minOccurs="1"/>
      </sequence>
   </complexType>

   <complexType name="TAddressOutput">
      <sequence>
         <element name="Street" type="string" minOccurs="1"/>
         <element name="City" type="string" minOccurs="1"/>
      </sequence>
   </complexType>

</schema>
<schema targetNamespace="http://nl.amis.AddressService"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cdm="http://nl.amis.AddressServiceCDM"
        version="1.0">

   <import namespace="http://nl.amis.AddressServiceCDM" schemaLocation="AddressServiceCDM.xsd"/>

   <element name="getAddressRequest">
	   <complexType>
		  <sequence>
			 <element name="Address" type="cdm:TAddressInput" minOccurs="1"/>
		  </sequence>
	   </complexType>
   </element>

   <element name="getAddressResponse">
	   <complexType>
		  <sequence>
			 <element name="Address" type="cdm:TAddressOutput" minOccurs="1"/>
		  </sequence>
	   </complexType>
   </element>

</schema>

When you’re finished tailoring, you can still deploy these service interfaces (WSDL) containing the shaped data definitions (XSDs) to a central run time location. However each service must have its own location within this run time location, to store these tailored data definitions (XSDs). When you do this, you can also store the service interface (abstract WSDL) in there as well. In this way there is only one copy of a service interface, that is used by the implementing service as well as by consuming services.
I’ve worked in a project with SOAP services where the conventions dictated that the filename of a WSDL is the same as the name of the service. The message payloads were not defined in this WSDL, but were included from an external XSD file. This XSD also had the same filename as the service name. This service XSD defined the payload of the messages, but it did not contain CDM entities or CDM type definitions. They were included from another XSD with the fixed name CDM.xsd. This local, service specific, CDM.xsd contained the tailored (stripped and restricted) copy of the central design time CDM, but had the same target namespace as the service.wsdl and the service.xsd:
Service Files
This approach also gave the opportunity to add operation specific elements to the message definitions in the service.xsd. These operation specific elements were not part of the central CDM and did not belong there due to their nature (operation specific). These operation specific elements were rarely needed, but when needed, they did not pollute the CDM, because you don’t need to somehow add them to the CDM. Think of switches and options on operations which act on functionality, e.g. a boolean type element “includeProductDescription” in the request message for operation “getOrder”.
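
As an illustration, here is a minimal sketch of what such a service.xsd could look like for a hypothetical OrderService. The names and namespaces are made up, and the tOrder type is assumed to be defined in the local, tailored CDM.xsd which shares the service namespace.

<schema targetNamespace="http://nl.amis.OrderService"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://nl.amis.OrderService"
        version="1.0">

   <!-- CDM.xsd is the local, tailored copy of the design time CDM and has the same
        target namespace as this service schema, so an include is sufficient -->
   <include schemaLocation="CDM.xsd"/>

   <element name="getOrderRequest">
      <complexType>
         <sequence>
            <element name="Id" type="string" minOccurs="1"/>
            <!-- operation specific switch, not part of the CDM -->
            <element name="includeProductDescription" type="boolean" minOccurs="0"/>
         </sequence>
      </complexType>
   </element>

   <element name="getOrderResponse">
      <complexType>
         <sequence>
            <element name="Order" type="tns:tOrder" minOccurs="0"/>
         </sequence>
      </complexType>
   </element>

</schema>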

Note: The services in the project all did use a little generic XML of which the definition (XSD) was stored in a central run time location. However these data definitions are technical data fields and therefore are not part of the CDM. For example header fields that are used for security, a generic response entity containing messages (error, warning, info) and optional paging information elements in case a response contains a collection. You need a central type definition when you are using generic functionality (e.g. from a software library) in all services and consuming software.

Conclusion
With this approach of a design time CDM and tailored interfaces:

  • There are no run time dependencies on the CDM and thus no dependencies between (versions of) services
  • Contract breach and hidden functional bugs are prevented. (Because of different namespaces services have to copy each data element individually when passing an entity or part of an entity, to its output)
  • Service interfaces reflect the service functionality
  • Method specific parameters can be added without polluting the CDM
  • And – most important – the CDM can change without limitations and as often as you want to!

The result is that the CDM in time will grow into a nice, clean and mature model that reflects the business data model of the organization – while not impeding and even promoting the agility of service development. And that is exactly what you want with a CDM!

 

When to use a central run time CDM

A final remark about a central run time CDM. There are situations where this can be a good solution. That is for smaller integration projects and in the case when all the systems and data sources which are to be connected with the integration layer are already in place, so they are not being developed. They probably already run in production for a while.
This means that the data and the data format which has to be passed through the integration layer and is used in the services is already fixed. You could state that the CDM already is there, although it still has to be described, documented in a data model. It’s likely that it’s also a project where there is a ‘one go’ to production, instead of frequent delivery cycles.
When after a while one system is replaced by another system, or the integration layer is extended by connecting one or more systems, and this means that the CDM has to be changed, you can add versioning to the CDM. Create a copy of the existing CDM, give it a new version (e.g. with a version number in the namespace) and make the changes in the CDM which are needed. This is also a good opportunity to clean up the CDM by removing unwanted legacy that was kept for backwards compatibility. Use this newer version of the CDM for all new development and maintenance of services.
Again, only use this central run time CDM for smaller projects and when it is a ‘one go’ to production (e.g. replacement of one system). As soon as the project becomes larger and/or integration of systems keeps on going, switch over to the design time CDM approach.
You can easily switch over by starting to develop the new services with the design time CDM approach and keep the ‘old’ services running with the central run time CDM. As soon there is a change in an ‘old’ service, refactor it to the new approach of the design time CDM. In time there will be no more services using the run time CDM, so the run time CDM can be removed.

After reading this blogpost, together with the previous two blogposts which make up the trilogy about my experiences with a Canonical Data Model, you should have a good understanding of how to set up a CDM and use it in your projects. Hopefully it helps you in making valuable decisions about creating and using a CDM, so that your projects will benefit from it.

The post Development and Runtime Experiences with a Canonical Data Model Part III: Dependency Management & Interface Tailoring appeared first on AMIS Oracle and Java Blog.

Development and Runtime Experiences with a Canonical Data Model Part II: XML Namespace Standards


This blog is about XML namespace standards. Primary for using them in a Canonical Data Model (CDM), but also interesting for anyone who has to define XML data by creating XML Schema files (XSD). This blogpost is the second part of a trilogy about my experiences in using and developing a CDM. The first blogpost is about naming & structure standards and the third blogpost is about dependency management & interface tailoring.

XML Namespace Standards

A very important part of an XML model is its namespace. With a namespace you can bind an XML model to a specific domain; it can represent a company, a business domain, a system, a service or even a single component or layer within a service. For a CDM this means that choices have to be made: use one or more namespaces, how to deal with newer versions of the CDM, etc.

Two approaches: one generic namespace vs component specific namespaces
Basically I’ve come across two approaches of defining a namespace in a CDM. Both ways can be a good approach, but you have to choose one based on your specific project characteristics.

  1. The first approach is to use one generic fixed namespace for the entire CDM. This may also be the ’empty’ namespace which looks like there is no namespace. This approach of one generic fixed namespace is useful when you have a central CDM that is available at run time and all services refer to this central CDM. When you go for this approach, go for one namespace only, so do not use different namespaces within the CDM.
    For maintenance and to keep the CDM manageable, it can be useful to split up the CDM into multiple definition files (XSD’s), each one representing a different group (domain) of entities. However my advice is to still use the same namespace in all of these definition files. The reason is that in time the CDM will change and you may want to move entities from one group to another group or you may want to split up a group. When each group had its own namespace, you would have gotten a problem with backwards compatibility, because an element which moves from one group to another would then have changed its namespace.
    When at a certain moment you’re going to have a huge amount of changes which also impacts the running software, you can create a new version of the CDM. Examples of such situations are connecting a new external system or replacing an important system by another system. In case you have more versions of the CDM, each version must have its own namespace where the version number is part of the name of the namespace. New functionality can now be developed with the new version of the CDM. When it uses existing functionality (e.g. calling an existing service) it has to transform the data from the new version of the CDM to the old version (and vice versa).

  2. The second approach is that each software component (e.g. a SOAP webservice) has its own specific namespace. This specific namespace is used as the namespace for a copy of the CDM. The software component uses this copy of the CDM; you can consider it as its own copy of the CDM. A central runtime CDM is not needed any more. This means that the software components have no runtime dependencies on the CDM! The result is that the software components can be deployed and run independently of the current version of the CDM. This is the most important advantage!
    The way to achieve this is to have a central CDM without a namespace (or a dummy namespace like ‘xxx’), which is only available as an off-line library at design time. So there even is no run time CDM to reference to!
    Developers need to create a hard coded copy of the CDM for the software component they are building and apply a namespace to that copy. The name of this namespace is specific to that software component and typically includes the name (and version) of the software component itself. Because the software component is the ‘owner’ of this copy, the parts (entities) of the CDM which are not used by the software component can be removed from this copy.

In part III in my last blogpost about run time dependencies and interface tailoring I will advise when to use the first and when to use the second approach. First some words about XML patterns and their usage in these two namespace approaches.

XML Patterns
XML patterns are design patterns, applicable to the design of XML. Because the design of XML is defined by XML Schema, XSD files, these XML patterns actually are XML Schema (XSD) patterns. These design patterns describe a specific way of modeling XML. Different ways of modeling can result into the same XML, but may be different in terms of maintenance, flexibility, ease of extension, etc.
As far as I know, there are four XML patterns: “Russian Doll”, “Salami Slice”, “Venetian Blind” and “Garden of Eden”. I’m not going to describe these patterns, because that has already been done by others. For a good description of the first three, see http://www.xfront.com/GlobalVersusLocal.html, and http://www.oracle.com/technetwork/java/design-patterns-142138.html gives a brief summary of all four. I advise you to read and understand them when you want to set up an XML type CDM.

I’ve described two approaches of using a CDM above, a central run-time referenced CDM and a design time only CDM. So the question is, which XML design pattern matches best for each approach?

When you’re going for the first approach, a central run-time-referenced CDM, there are no translations necessary when passing (a part of) an XML payload from one service to another service. This is easier compared with the second approach, where each service has a different namespace. Because there are no translations necessary and the services need to reference parts of entities as well as entire entity elements, it’s advisable to use the “Salami Slice” or the “Garden of Eden” pattern. They both have all elements defined globally, so it’s easy to reuse them. With the “Garden of Eden” pattern, types are defined globally as well and thus reusable, providing more flexibility and freedom to designers and developers. The downside is that you end up with a very scattered and verbose CDM.
To solve this disadvantage, you can go for the “Venetian Blind” pattern, set the schema attribute “elementFormDefault” to “unqualified” and not include any element definitions in the root of the schemas (XSD) which make up the CDM. This means there are only XML type definitions in the root of the schema(s), so the CDM is defined by types. The software components, e.g. a web service, do have their own namespace. In this way the software components define a namespace (through their XSD or WSDL) for the root element of the payload (in the SOAP body), while all the child elements below this root remain ‘namespace-less’.
This makes the life of a developer easier, as there is no namespace and thus no prefixes needed in the payload messages. Not having to deal with namespaces in all the transformation, validation and processing software that works with those messages makes programming code (e.g. xslt) less complicated, and thus less error prone.
This leads to my advice that:

The “Venetian Blind” pattern, with the schema attribute “elementFormDefault” set to “unqualified” and no elements in the root of the schemas, is the best XML pattern for the approach of using a central run-time referenced CDM.
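
A minimal sketch of this pattern, with made-up names: the CDM schema contains only type definitions and unqualified local elements, while a service schema defines the root element of the payload in its own namespace.

<!-- CDM.xsd: only type definitions in the root, local elements stay unqualified -->
<schema targetNamespace="http://nl.amis.CDM"
        xmlns="http://www.w3.org/2001/XMLSchema"
        elementFormDefault="unqualified"
        version="1.0">

   <complexType name="tAddress">
      <sequence>
         <element name="PostalCode" type="string" minOccurs="0"/>
         <element name="HouseNumber" type="string" minOccurs="0"/>
         <element name="Street" type="string" minOccurs="0"/>
         <element name="City" type="string" minOccurs="0"/>
      </sequence>
   </complexType>

</schema>

<!-- AddressLookupService.xsd: only the root element gets the service namespace;
     all child elements below it remain namespace-less in the payload -->
<schema targetNamespace="http://nl.amis.AddressLookupService"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:cdm="http://nl.amis.CDM"
        version="1.0">

   <import namespace="http://nl.amis.CDM" schemaLocation="CDM.xsd"/>

   <element name="getAddressRequest" type="cdm:tAddress"/>
   <element name="getAddressResponse" type="cdm:tAddress"/>

</schema>

A request payload then only carries a namespace on the root element, e.g. <getAddressRequest xmlns="http://nl.amis.AddressLookupService">, while PostalCode, City, etc. are written without any prefix or namespace.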

When you’re going for the second option, no runtime CDM but only a design time CDM, you shouldn’t use a model which results in payloads (or parts of the payloads) of different services having exactly the same namespace. So you cannot use the “Venetian Blind” pattern with “elementFormDefault” set to “unqualified” which I have just explained. You can still use the “Salami Slice” or “Garden of Eden” pattern, but the disadvantages of a large, scattered and verbose CDM remain.
The reason that you cannot have the same namespace for the payloads of services with this approach is that the services have their own copy (‘version’) of the CDM. When (parts of) payloads of different services have the same element with also the same namespace (or the empty namespace), the XML structures of both are considered to be exactly equal, while that need not be the case! When they are not the same, you have a problem when services need to call each other and payloads are passed to each other. They can already be different at design time, and then it’s quite obvious.
Much more dangerous is that they can even become different later in time without this being noticed! To explain this, assume that at a certain time two software components were developed which used the same CDM version, so the XML structure was the same. But what if one of them changes later in time and these changes are considered backwards compatible (resulting in a new minor version)? The design time CDM has changed, so the newer version of this service uses this newer CDM version. The other service did not change and now receives a payload from the changed service with elements of a newer version of the CDM. Hopefully this unchanged service can handle this new CDM format correctly, but it might not! Another problem is that it might break its own contract (WSDL) when it copies the new CDM entities (or parts of them) to the response to its caller. Thus it breaks its own contract while the service itself has not changed! Keep in mind that its WSDL still uses the old CDM definitions of the entities in the payload.
Graphically explained:
Breach of Service Contract
Service B calls Service A and retrieves (a part of) the payload entity X from Service A. Service B uses this entity to return it to its consumers as (part of its) payload. This is all nice and correct according to its service contract (WSDL).
Later in time, Service A is updated to version 1.1 and the newer version of the CDM is used in this updated version. In the newer CDM version, entity X has also been updated to X’. Now this X’ entity is passed from Service A to Service B. Service B returns this new entity X’ to its consumers, while they expect the original X entity. So service B returns an invalid response and breaks its own contract!
You can imagine what happens when there is a chain of services and probably there are more consumers of Service A. Such an update can spread out through the entire integration layer (SOA environment) like ripples on water!
You don’t want to update all the services in the chains affected by such a little update.
I’m aware a service should not do this. Theoretically a service is fully responsible for always complying with its own contract (WSDL), but this is very difficult to enforce when developing lots of services. When there is a mapping in a service, this is quite clear, but all mappings should be checked. However, an XML entity often is used as a variable (e.g. in BPEL) in some processing code and can be passed to a caller unnoticed.
The only solution is to avoid passing complete entities (container elements), so, when passing through, all data fields (data elements) have to be mapped individually (in a so called transformation) for all incoming and outgoing data of the service.
The problem is that you cannot enforce software to do this, so this must become a rule, a standard, for software developers.
Everyone who has been in software development for some years knows this is not going to work. There will always be a software developer (at that moment, or maybe in the future during maintenance) not knowing or not understanding this standard.
The best way to prevent this problem is to give each service its own namespace, so entities (container elements) cannot be copied and passed through in their entirety, and thus developers have to map the data elements individually.
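
A minimal sketch (the namespaces and element names are made up) of what this individual mapping looks like in XSLT: every data element is mapped explicitly from the namespace of service A into the namespace of service B, so a complete entity can never slip through unnoticed.

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:a="http://nl.amis.ServiceACDM"
                xmlns:b="http://nl.amis.ServiceBCDM">

   <xsl:template match="a:Address">
      <b:Address>
         <!-- each data element is copied individually into the own namespace -->
         <b:Street><xsl:value-of select="a:Street"/></b:Street>
         <b:City><xsl:value-of select="a:City"/></b:City>
      </b:Address>
   </xsl:template>

</xsl:stylesheet>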

This is why I advise, for the approach of a design time only CDM, to also use the “Venetian Blind” pattern, but now with the schema attribute “elementFormDefault” set to “qualified”. This results in a CDM of which

  • it is easy to copy the elements that are needed, including child elements and necessary types, from the design time CDM to the runtime constituents of the software component being developed. Do not forget to apply the component specific target namespace to this copy.
  • it is possible to reuse type definitions within the CDM itself, preventing multiple definitions of the same entity.

In my next blogpost, part III about runtime dependencies and interface tailoring, I explain why you should go in most cases for a design time CDM and not a central runtime CDM.

The post Development and Runtime Experiences with a Canonical Data Model Part II: XML Namespace Standards appeared first on AMIS Oracle and Java Blog.

Development and Runtime Experiences with a Canonical Data Model Part I: Standards & Guidelines


Introduction

In my previous blog I’ve explained what a Canonical Data Model (CDM) is and why you should use it. This blog is about how to do this. I will share my experiences on how to create and use a CDM. I gained these experiences at several projects, small ones, and large ones. All of these experiences were related to an XML based CDM. This blog consists of three parts. This blogpost contains part I: Standards & Guidelines. The next blogpost, part two, is about XML Namespace Standards and the last blogpost contains part three about Dependency Management & Interface Tailoring.
This first part, about standards and naming conventions, primarily applies to XML, but the same principles and ideas will mostly apply to other formats, like JSON, as well. The second part, about XML namespace standards, is, as the name already indicates, only applicable to an XML format CDM. The last part, in the third blogpost, about dependency management & interface tailoring, applies entirely to all kinds of data formats.

Developing a CDM

About the way of creating a CDM. It’s not doable to create a complete CDM upfront and only then start designing services and developing them. This is because you only can determine usage of data, completeness and quality while developing the services and gaining experience in using them. A CDM is a ‘living’ model and will change in time.
When the software modules (systems or data stores) which are to be connected by the integration layer are being developed together, the CDM will change very often. While developing software you always encounter shortcomings in the design, unseen functional flaws, unexpected requirements or restrictions and changes in design because of new insights or changed functionality. So sometimes the CDM will even change on a daily basis. This perfectly fits into the modern Agile Software Development methodologies, like Scrum, where changes are welcome.
When the development stage is finished and the integration layer (SOA environment) is in a maintenance stage, the CDM still will change, but at a much slower pace. It will keep on changing because of maintenance changes and modifications of connected systems or trading partners. Changes and modifications due to new functionality also cause new data entities and structures which have to be added to the CDM. These changes and modifications occur because business processes change in time, caused by a changing world, ranging from technical innovations to social behavioral changes.
Either way, the CDM will never be finished and reach a final, changeless state, so a CDM should be flexible and created in such a way that it welcomes changes.

When you start creating a CDM, it’s wise to define standards and guidelines about defining the CDM and using it beforehand. Make a person (or, in a large project, a group of persons) responsible for developing and defining the CDM. This means he defines the data definitions and structures of the CDM. When using XML this person is responsible for creating and maintaining the XML schema definition (XSD) files which represent the CDM. He develops the CDM based on requests from developers and designers. He must be able to understand the needs of the developers, but he should also keep the model consistent, flexible and future proof. This means he must have experience in data modeling and in the data format (e.g. XML or JSON) and specification language (e.g. XSD) being used. Of course, he also guards the standards and guidelines which have been set. He also must be able, when needed, to deny requests for a CDM change from (senior) developers and designers in order to preserve a well-defined CDM, and provide an alternative which meets their needs as well.

Standards & Guidelines

There are more or less three types of standards and guidelines when defining an XML data model:

  • Naming Conventions
  • Structure Standards
  • Namespace Standards

Naming Conventions

The most important advice is that you define naming conventions upfront and stick to them. Like all naming conventions in programming languages, there are a lot of options and often it’s a matter of personal preference. Changing conventions because of different personal preferences is not a good idea. Mixed conventions result in ugly code. Nevertheless I do have some recommendations.

Nodes versus types
The first one is to make a distinction between the name of a node (element or attribute) and an XML type. I’ve been in a project where the standard was to give them exactly the same name. In XML this is possible! But the drawback was that there were connecting systems and programming languages which couldn’t handle this! For example the standard Java library for XML parsing, JAX-P, had an issue with this. The Java code which was generated under the hood used the name of an XML type for a Java class name and the name of an element as a Java variable name. In Java it is not possible to use an identical name for both. In that specific project, this had to be fixed manually in the generated Java source code. That is not what you want! It can easily be avoided by using different names for types and elements.

Specific name for types
A second recommendation, which complements the advice above, is to use a specific naming convention for XML types, so their names always differ from node names. The advantage for developers is that they can recognize from the name if something is an XML node or an XML type. This eases XML development and makes the software code easier to read and understand and thus to maintain.
Often I’ve seen a naming convention which tries to implement this by prescribing that the name of an XML type should be suffixed with the token “Type”. I personally do not like this specific naming convention. Consider you have a “Person” entity and so you end up with an XML type named “PersonType”. This perfectly makes sense, doesn’t it? But how about a “Document” entity? You end up with an XML type named “DocumentType” and guess what: there is also going to be a “DocumentType” entity, resulting in an XML type named “DocumentTypeType”…!? Very confusing in the first place. Secondly, you end up with an element and an XML type with the same name! The name “DocumentType” is used as a name for an element (of type “DocumentTypeType”) and “DocumentType” is used as an XML type (of an element named “Document”).
From experience I can tell you there are more entities with a name that ends with “Type” than you would expect!
My advice is to prefix an XML type with the character “t”. This not only prevents this problem, but it’s also shorter. Additionally you can distinguish an XML node from an XML type by the start of its name. This naming convention results in element names like “Person”, “Document” and “DocumentType” versus type names “tPerson”, “tDocument” and “tDocumentType”.
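
A small sketch of this convention in XSD; the child elements are just an example and tns is assumed to be bound to the CDM target namespace:

<complexType name="tDocumentType">
   <sequence>
      <element name="Code" type="string" minOccurs="0"/>
      <element name="Description" type="string" minOccurs="0"/>
   </sequence>
</complexType>

<complexType name="tDocument">
   <sequence>
      <element name="Id" type="string" minOccurs="0"/>
      <element name="Name" type="string" minOccurs="0"/>
      <element name="DocumentType" type="tns:tDocumentType" minOccurs="0"/>
   </sequence>
</complexType>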

Use CamelCase – not_underscores
The third recommendation is to use camel case for names instead of using underscores as separators between the words which make up the name of a node or type. This shortens a name and still the name can be read easily. I’ve got a slight preference for starting a name with an uppercase character, because then I can use camel case beginning with a lowercase character for local variables in logic or translations (BPEL, xslt, etc.) in the integration layer or tooling. This results in a node named “DocumentType” of type “tDocumentType” and, when used in a local variable in code, this variable is named “documentType”.

Structure Standards

I also have some recommendations about standards which apply to the XML structure of the CDM.

Use elements only
The first one is to never use attributes, so only elements. You can never expand an attribute and create child elements in it. This may not be necessary at the moment, but may be necessary sometime in the future. Also, an attribute cannot have the ‘null’ value, in contrast with an element. You can argue that an empty value can represent the null value, but this is only possible with String type attributes (otherwise it’s considered invalid XML when validating against its schema) and often there is a difference between an empty string and a null value. Another disadvantage is that you cannot have multiple attributes with the same name inside an element.
Furthermore, using elements makes XML more readable for humans, which helps developers in their coding and debugging. A good read about this subject is “Principles of XML design: When to use elements versus attributes”. This article contains a nice statement: “Elements are the extensible engine for expressing structure in XML.” And that’s exactly what you want when developing a CDM that will change in time.
The last advantage is that when the CDM only consists of elements, processing layers can add their own ‘processing’ attributes only for the purpose of helping the processing itself. This means that the result, the XML which is used in communicating with the world outside of the processing system, should be free of attributes again. Also processing attributes can be added in the interface, to provide extra information about the functionality of the interface. For example, when retrieving orders with operation getOrders, you might want to indicate for each order whether it has to be returned with or without customer product numbers:

<getOrdersRequest>
  <Orders>
    <Order includeCustProdIds='false'>
      <Id>123</Id>
    </Order>
    <Order includeCustProdIds='true'>
      <Id>125</Id>
    </Order>
    <Order includeCustProdIds='false'>
      <Id>128</Id>
    </Order>
  </Orders>
</getOrdersRequest>

Beware that these attributes are processing or functionality related, so they should not be a data part of the entity. And ask yourself if they are really necessary. You might consider providing this extra functionality in a new operation, e.g. an operation getCustProdIds to retrieve customer product ids or an operation getOrderWithCustIds to retrieve an order with customer product numbers.

All elements optional
The next advice is to make all the elements optional! There unexpectedly always is a system or business process which doesn’t need a certain (child) element of which you initially thought it would always be necessary. On one project this was the case with id elements. Each data entity must have an id element, because the id element contains the functional unique identifying value for the data entity. But then there came a business case with a front end system that had screens in which the data entity was being created. Some of the input data had to be validated before the unique identifying value was known. So the request to the validation system contained the entity data without the identifying id element, and the mandatory id element had to be changed to an optional element. Of course, you can solve this by creating a request which only contains the data that is used, in separate elements, so without the use of the CDM element representing the entity. But one of the powers of a CDM is that there is one definition of an entity.
At that specific project, in time, more and more mandatory elements turned out to be optional somewhere. Likely this will happen at your project as well!

Use a ‘plural container’ element
There is, of course, an exception of an element which should be mandatory. That is the ‘plural container’ element, which only is a wrapper element around a single element which may occur multiple times. This is my next recommendation: when a data entity (XML structure) contains another data entity as a child element and this child element occurs two or more times, or there is a slight chance that this will happen in the future, then create a mandatory ‘plural container’ element which acts as a wrapper element that contains these child elements. A nice example of this is an address. More often than you might think, a data entity contains more than one address. When you have an order as data entity, it may contain a delivery address and a billing address, while you initially started with only the delivery address. So when initially there is only one address and the XML is created like this:

<Order>
  <Id>123</Id>
  <CustomerId>456</CustomerId>
  <Address>
    <Street>My Street</Street>
    <ZipCode>23456</ZipCode>
    <City>A-town</City>
    <CountryCode>US</CountryCode>
    <UsageType>Delivery</UsageType>
  </Address>
  <Product>...</Product>
  <Product>...</Product>
  <Product>...</Product>
</Order>

Then you have a problem with backwards compatibility when you have to add the billing address. This is why it’s wise to create a plural container element for addresses, and for products as well. The name of this element will be the plural of the element it contains. Above XML will then become like this:

<Order>
  <Id>123</Id>
  <CustomerId>456</CustomerId>
  <Addresses>
    <Address>
      <Street>My Street</Street>
      <ZipCode>23456</ZipCode>
      <City>A-town</City>
      <CountryCode>US</CountryCode>
      <UsageType>Delivery</UsageType>
    </Address>
  </Addresses>
  <Products>
    <Product>...</Product>
    <Product>...</Product>
    <Product>...</Product>
  </Products>
</Order>

In the structure definition, the XML Schema Definition (XSD), define the plural container element to be single and mandatory. Make its child elements optional and without a maximum number of occurrences. First, this results in maximum flexibility and second, in this way there is only one way of constructing XML data that doesn’t have any child elements. In contrast, when you make the plural container element optional, you can create XML data that doesn’t have any child element in two ways: by omitting the plural container element completely and by adding it without any child elements. You may want to solve this by dictating that a plural container element always has at least one child element, but then the next advantage, discussed below, is lost.
So the XML data example of above will be modeled as follows:

<complexType name="tOrder">
  <sequence>
    <element name="Id" type="string" minOccurs="0" maxOccurs="1"/>
    <element name="CustomerId" type="string" minOccurs="0" maxOccurs="1"/>
    <element name="Addresses" minOccurs="1" maxOccurs="1">
      <complexType>
        <sequence>
          <element name="Address" type="tns:tAddress" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </complexType>
    </element>
    <element name="Products" minOccurs="1" maxOccurs="1">
      <complexType>
        <sequence>
          <element name="Product" type="tns:tProduct" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
      </complexType>
    </element>
  </sequence>
</complexType>
<complexType name="tAddress">
  <sequence>
    ...
  </sequence>
</complexType>
<complexType name="tProduct">
  <sequence>
    ...
  </sequence>
</complexType>

There is another advantage of this construction for developers. When there is a mandatory plural container element, this element acts as a kind of anchor or ‘join point’ when XML data has to be modified in the software and, for example, child elements have to be added. As this element is mandatory, it’s always present in the XML data that has to be changed, even if there are no child elements yet. So the code of a software developer can safely ‘navigate’ to this element and make changes, e.g. adding child elements. This eases the work of a developer.
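
To illustrate, here is a minimal XSLT sketch (made up, based on the order example above) that adds a billing address. It can safely match the Addresses element because that wrapper is always present:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="xml" indent="yes"/>

   <!-- identity template: copy everything as-is -->
   <xsl:template match="@*|node()">
      <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
   </xsl:template>

   <!-- the mandatory Addresses wrapper is the 'join point': it is always there,
        even when the order does not contain any address yet -->
   <xsl:template match="Order/Addresses">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
         <Address>
            <Street>Billing Street</Street>
            <ZipCode>12345</ZipCode>
            <City>B-town</City>
            <CountryCode>US</CountryCode>
            <UsageType>Billing</UsageType>
         </Address>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>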

Be careful with restrictions
You never know beforehand with which systems or trading partners the integration layer will connect in future. When you define restrictions in your CDM, beware of this. For example restricting a string type to a list of possible values (enumeration) is very risky. What to do when in future another possible value is added?
Even a more flexible restriction, like a regular expression can soon become too strict as well. Take for example the top level domain names on internet. It once was restricted to two character abbreviations for countries, some other three character abbreviations (“net”, “com”, “org”, “gov”, “edu”) and one four character word “info”, but that’s history now!
This risk applies to all restrictions: restrictions on character length, numeric restrictions, restrictions on value ranges, etc.
Likewise I bet that the length of product id’s in the new version of your ERP system will exceed the current one.
My advice is to minimize restrictions as much as possible in your CDM, preferably no restrictions at all!
Instead define restrictions on the interfaces, the APIs to the connecting systems. When for example the product id of your current ERP system is restricted to 8 characters, it perfectly makes sense that you define a restriction on the interface with that system. More on this in part III, my last blogpost, in the section about Interface Tailoring.
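
For example (a sketch, in which the 8 character limit is just the assumption from above): the CDM keeps the product id an unrestricted string, while the interface XSD towards the ERP system restricts it.

<!-- In the CDM: no restriction on the id -->
<complexType name="tProduct">
   <sequence>
      <element name="Id" type="string" minOccurs="0"/>
      <element name="Name" type="string" minOccurs="0"/>
   </sequence>
</complexType>

<!-- In the interface XSD towards the current ERP system: the same product,
     but here the id is restricted to the 8 characters that system supports -->
<complexType name="tErpProduct">
   <sequence>
      <element name="Id" minOccurs="0">
         <simpleType>
            <restriction base="string">
               <maxLength value="8"/>
            </restriction>
         </simpleType>
      </element>
      <element name="Name" type="string" minOccurs="0"/>
   </sequence>
</complexType>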

String type for id elements
Actually this one is the same as the one above about restrictions. I want to discuss it separately, because of its importance and because it often goes wrong. Defining an id element as a numeric type is a way of applying a numeric restriction to a string type id.
The advice is to make all identifying elements (id, code, etc.) of type string and never a numeric type! Even when they always get a numeric value… for now! The integration layer may in future connect to another system that uses non-numeric values for an id element, or an existing system may be replaced by a system that uses non-numeric id’s. Only make those elements numeric which truly contain numbers, so the value has a numeric meaning. You can check this by asking yourself whether it functionally makes sense to calculate with the value or not. So for example phone numbers should be strings. Also when there is a check (algorithm) based on the sequence of the digits whether a number is valid or not (e.g. a bank account check digit), this means the number serves as an identification and thus should be a string type element! Another way to detect numbers which are used as identification is to determine if it matters when you add a preceding zero to the value. If that does matter, it means the value is not used numerically. After all, preceding zeros don’t change a numeric value.

Determine null usage
The usage of the null value in XML (xsi:nil) always leads to lots of discussions. The most important advice is to explicitly define standards & rules and communicate them! Decide whether null usage is allowed or not. If so, determine in which situations it is allowed and what it functionally means. Ask yourself how it is used and how it differs from an element being absent (optional elements).
For example, I’ve been in a project where a lot of data was updated in the database. An element being absent meant that a value didn’t change, while a null value meant that for a container element its record had to be deleted and for a ‘value’ element that the database value had to be set to null.
The most important advice in this is: Make up your mind, decide, document and communicate it!
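
A small sketch of such a convention in an update message; the element names are made up and the elements are assumed to be declared nillable in the XSD:

<UpdateCustomerRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <Customer>
      <Id>456</Id>
      <!-- Name is absent: the name in the database is not changed -->
      <!-- MiddleName is explicitly nil: the value in the database is set to null -->
      <MiddleName xsi:nil="true"/>
      <!-- PhoneNumber has a value: the value in the database is updated -->
      <PhoneNumber>555-12345</PhoneNumber>
   </Customer>
</UpdateCustomerRequest>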

To summarize this first part of naming conventions and guidelines:

  • Keep in mind that a CDM keeps on changing, so it’s never finished
  • Define naming and structure standards upfront
  • and communicate your standards and guidelines!

When creating a CDM in the XML format, you also have to think about namespaces and how to design the XML. That is what the second part, in my next blogpost, is all about. When you are not defining a CDM in the XML format, you can skip that one and immediately go to the third and last blogpost about dependency management & interface tailoring.

The post Development and Runtime Experiences with a Canonical Data Model Part I: Standards & Guidelines appeared first on AMIS Oracle and Java Blog.


Machine learning: Getting started with random forests in R


According to Gartner, machine learning is on top of the hype cycle, at the peak of inflated expectations. There is a lot of misunderstanding about what machine learning actually is and what can be done with it.

Machine learning is not as abstract as one might think. If you want to get value out of known data and do predictions for unknown data, the most important challenge is asking the right questions and of course knowing what you are doing, especially if you want to optimize your prediction accuracy.

In this blog I’m exploring an example of machine learning: the random forest algorithm. I’ll provide an example of how you can use this algorithm to do predictions. In order to implement a random forest, I’m using R with the randomForest library and the iris data set which is provided by the R installation.

The Random Forest

A popular method of machine learning is by using decision tree learning. Decision tree learning comes closest to serving as an off-the-shelf procedure for data mining (see here). You do not need to know much about your data in order to be able to apply this method. The random forest algorithm is an example of a decision tree learning algorithm.

Random forest in (very) short

How it works exactly takes some time to figure out. If you want to know the details, I recommend watching some YouTube recordings of lectures on the topic. Some of the most important features of this method:

  • A random forest is a method to do classifications based on features. This implies you need to have features and classifications.
  • A random forest generates a set of classification trees (an ensemble) based on splitting a subset of features at locations which maximize information gain. This method is thus very suitable for distributed parallel computation.
  • Information gain can be determined by how accurate the splitting point is in determining the classification. Data is split based on the feature at a specific point and the classification on the left and right of the splitting point are checked. If for example the splitting point splits all data of a first classification from all data of a second classification, the confidence is 100%; maximum information gain.
  • A splitting point is a branching in the decision tree.
  • Splitting points are based on values of features (this is fast).
  • A random forest uses randomness to determine the features to look at and randomness in the data used to construct each tree. Randomness helps reduce compute time.
  • Each tree gets to see a different (bootstrap) sample of the data. This is called bagging.
  • Tree classification confidences are summed and averaged. Products of the confidences can also be taken. Individual trees have a high variance because they have only seen a small subset of the data. Averaging helps create a better result.
  • With correlated features, strong features can end up with low scores and the method can be biased towards variables with many categories.
  • A random forest does not perform well on unbalanced datasets: samples in which some classes occur much more often than others.
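
To make the information gain idea a bit more concrete, here is a small sketch (my own illustration, not part of the original post) that scores a single candidate splitting point on the iris data using Gini impurity:

#Gini impurity: 0 for a pure node, higher when classes are mixed
gini <- function(classes) {
  p <- table(classes) / length(classes)
  1 - sum(p^2)
}

data(iris)

#candidate splitting point on the Petal.Length feature
split_value <- 2.45
left  <- iris$Species[iris$Petal.Length <  split_value]
right <- iris$Species[iris$Petal.Length >= split_value]

#weighted impurity of the two sides; the drop versus the parent node is the information gain
parent_impurity   <- gini(iris$Species)
children_impurity <- (length(left) * gini(left) + length(right) * gini(right)) / nrow(iris)
information_gain  <- parent_impurity - children_impurity
information_gain

The actual algorithm evaluates many such candidate points on random subsets of the features and keeps the one with the highest gain at each branching.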

Use case for a random forest

Use cases for a random forest are, for example, text classification tasks such as spam detection. Whether certain words are present in a text can be used as a feature and the classification would be spam/not spam, or even more specific categories such as news, personal, etc. (a toy sketch of this follows below). Another interesting use case lies in genetics: determining whether the expression of certain genes is relevant for a specific disease. This way you can take someone's DNA and determine with a certain confidence whether that person will contract a disease. Of course you can also take other features into account such as income, education level, smoking, age, etc.
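
Below is a toy sketch of the text classification case. The texts, words and labels are made up for illustration and this is not code from the original post; it only shows the principle of turning word presence into feature columns:

#toy sketch: word-presence features for spam detection (hypothetical data)
library(randomForest)

texts  <- c("win free money now", "meeting at noon", "free money offer",
            "lunch tomorrow?", "win a free prize", "notes from the meeting")
labels <- factor(c("spam", "ham", "spam", "ham", "spam", "ham"))
words  <- c("win", "free", "money", "meeting", "lunch")

#one column per word: 1 if the word occurs in the text, 0 otherwise
features <- as.data.frame(sapply(words, function(w) as.integer(grepl(w, texts))))

model <- randomForest(features, y = labels, ntree = 100)

#classify a new text using the same feature extraction
new_text <- "free money if you win"
new_features <- as.data.frame(t(sapply(words, function(w) as.integer(grepl(w, new_text)))))
predict(model, new_features)

A real spam filter would use many more words (or a proper document-term matrix), but the idea stays the same.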

R

Why R

I decided to start with R. Why? Mainly because it is easy. There are many libraries available and there is a lot of experience present worldwide; a lot of information can be found online. R however also has some drawbacks.

Some benefits

  • It is free and easy to get started. Hard to master though.
  • A lot of libraries are available. R package management works well.
  • R has a lot of users, so there is a lot of information available online.
  • R is powerful in that, if you know what you are doing, you need little code to do it.

Some challenges

  • R loads datasets into memory, which limits the size of the data you can comfortably work with
  • R is not the best at doing distributed computing but can do so. See for example here
  • The R syntax can be a challenge to learn

Getting the environment ready

I decided to install a Linux VM to play with. You can also install R and RStudio (the R IDE) on Windows or Mac. I decided to start with Ubuntu Server. I first installed the usual things like a GUI. Next I installed some handy things like a terminal emulator, Firefox and the like. I finished by installing R and RStudio.

So first download and install Ubuntu Server (next, next, finish)

sudo apt-get update
sudo apt-get install aptitude

# Install a GUI
sudo aptitude install --without-recommends ubuntu-desktop

# Install the VirtualBox Guest Additions
sudo apt-get install build-essential linux-headers-$(uname -r)
# Install the Guest Additions themselves (first mount the ISO image which is part of VirtualBox, next run the installer)

# Install the packages below to make Dash (Unity search) work
# See http://askubuntu.com/questions/125843/dash-search-gives-no-result
sudo apt-get install unity-lens-applications unity-lens-files

# A shutdown button might come in handy
sudo apt-get install indicator-session

# Might come in handy: browser and fancy terminal application
sudo apt-get install firefox terminator

# For the installation of R I used the following as inspiration: https://www.r-bloggers.com/how-to-install-r-on-linux-ubuntu-16-04-xenial-xerus/
sudo echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" | sudo tee -a /etc/apt/sources.list
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get install r-base r-base-dev

# For the installation of RStudio I used: https://mikewilliamson.wordpress.com/2016/11/14/installing-r-studio-on-ubuntu-16-10/

wget http://ftp.ca.debian.org/debian/pool/main/g/gstreamer0.10/libgstreamer0.10-0_0.10.36-1.5_amd64.deb
wget http://ftp.ca.debian.org/debian/pool/main/g/gst-plugins-base0.10/libgstreamer-plugins-base0.10-0_0.10.36-2_amd64.deb
sudo dpkg -i libgstreamer0.10-0_0.10.36-1.5_amd64.deb
sudo dpkg -i libgstreamer-plugins-base0.10-0_0.10.36-2_amd64.deb
sudo apt-mark hold libgstreamer-plugins-base0.10-0
sudo apt-mark hold libgstreamer0.10

wget https://download1.rstudio.org/rstudio-1.0.136-amd64.deb
sudo dpkg -i rstudio-1.0.136-amd64.deb
sudo apt-get -f install

Doing a random forest in R

R needs some libraries to do random forests and create nice plots. First give the following commands:

#to do random forests
install.packages("randomForest")

#to work with R markdown language
install.packages("knitr")

#to create nice plots
install.packages("ggplot2")

To get help on a library, you can give the following command, which will show more information about it.

library(help = "randomForest")

 Of course, the randomForest implementation does have some specifics:

  • it uses the reference implementation based on CART trees
  • it is biased in favor of continuous variables and variables with many categories

A simple program to do a random forest looks like this:

#load libraries
library(randomForest)
library(knitr)
library(ggplot2)

#random numbers after the set.seed(10) are reproducible if I do set.seed(10) again
set.seed(10)

#create a training sample of 45 rows from the iris dataset. replace = FALSE means sampling without replacement, so every row appears at most once in the sample (with replace = TRUE the same row could be drawn more than once).
idx_train <- sample(1:nrow(iris), 45, replace = FALSE)

#create a logical vector marking the rows which are not in the training sample (the test set)
tf_test <- !1:nrow(iris) %in% idx_train

#the column ncol(iris) is the last column of the iris dataset. this is not a feature column but a classification column
feature_columns <- 1:(ncol(iris)-1)

#generate a randomForest.
#use the feature columns from training set for this
#iris[idx_train, ncol(iris)] indicates the classification column
#importance=TRUE indicates the importance of features in determining the classification should be determined
#y = iris[idx_train, ncol(iris)] gives the classifications for the provided data
#ntree=1000 indicates 1000 random trees will be generated
model <- randomForest(iris[idx_train, feature_columns], y = iris[idx_train, ncol(iris)], importance = TRUE, ntree = 1000)

#print the model
#printing the model shows how the training sample is distributed among the classes. The sum of the class counts is 45, which is the sample size. The OOB estimate is the 'out of bag' error rate: an estimate of the overall classification error based on the rows each tree did not see during training.

print(model)

#we use the model to predict the class based on the feature columns of the dataset (minus the sample used to train the model).
response <- predict(model, iris[tf_test, feature_columns])

#determine the number of correct classifications
correct <- response == iris[tf_test, ncol(iris)]

#determine the percentage of correct classifications
sum(correct) / length(correct)

#print a variable importance (varImp) plot of the randomForest
varImpPlot(model)

#in this dataset the petal length and width are more important measures to determine the class than the sepal length and width.
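
As a small addition to the script above (using the same model, response and tf_test objects), a confusion matrix shows where the remaining misclassifications occur, and predict() can also return per-class probabilities instead of hard labels:

#confusion matrix of predicted versus actual classes for the test rows
table(predicted = response, actual = iris[tf_test, ncol(iris)])

#per-class probabilities instead of hard class labels
head(predict(model, iris[tf_test, feature_columns], type = "prob"))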

The post Machine learning: Getting started with random forests in R appeared first on AMIS Oracle and Java Blog.

Better track the Usage of Database Options and Management Packs, or it will cost you


So here it is
Oracle announces a license audit, some urgency kicks in and this familiar but also really serious question comes down from management: "Are we using any unlicensed database features?". The seriousness is quite understandable, because if so, the company can look forward to some negotiations with Oracle over license fees, possibly resulting in considerable and unforeseen extra costs.

Tracking… why
To be able to provide a swift and correct answer to this question, I track the usage of database options and management packs. As you might expect, tracking also enables detection of any deliberate or accidental unlicensed feature usage, so I can stop it sooner rather than later. And stopping it sooner is better, because usage over months or years isn't as easily excused by Oracle as usage during a day or week.

Tracking… how
Tracking is done by way of two views, both derived from "options_packs_usage_statistics.sql", provided by Oracle Support --> MOS Note 1317265.1. Recently this script has been updated to handle version 12.2, so I had to update my views too. The Oracle script can be used on database version 11gR2 and higher, and on 12c container as well as non-container 12c databases. My views can also be used on 11gR2 databases and higher (EE, SE and SE2), but assume a non-container database.

Bugs
Some bugs (Doc ID 1309070.1) are associated with DBA_FEATURE_USAGE_STATISTICS, the main data source for "options_packs_usage_statistics.sql". At this time these bugs concern false positives for the use of compression or encryption with SecureFiles and RMAN, and for the reporting of Oracle Spatial usage where only Oracle Locator is used.

Disclaimer
The following code provides usage statistics for Database Options, Management Packs and their corresponding features.
This information is to be used for informational purposes only and does not represent any license entitlement or requirement.

SET DEFINE OFF;
CREATE OR REPLACE FORCE VIEW FEATURE_USAGE
AS
select product
     , decode(usage, 'NO_USAGE','NO', usage ) "Used"
     , last_sample_date
     , first_usage_date
     , last_usage_date
------- following sql is based on options_packs_usage_statistics.sql  --> MOS Note 1317265.1
from (
with
MAP as (
-- mapping between features tracked by DBA_FUS and their corresponding database products (options or packs)
select '' PRODUCT, '' feature, '' MVERSION, '' CONDITION from dual union all
SELECT 'Active Data Guard'                                   , 'Active Data Guard - Real-Time Query on Physical Standby' , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Active Data Guard'                                   , 'Global Data Services'                                    , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Analytics'                                  , 'Data Mining'                                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'ADVANCED Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Advanced Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Backup HIGH Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup LOW Compression'                                  , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup MEDIUM Compression'                               , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup ZLIB Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Data Guard'                                              , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^11\.2\.0\.[1-3]\.'         , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^(11\.2\.0\.[4-9]\.|12\.)'  , 'INVALID' from dual union all -- licensing required by Optimization for Flashback Data Archive
SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^11\.2|^12\.1'              , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^12\.[2-9]'                 , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.1'                     , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.[2-9]'                 , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Information Lifecycle Management'                        , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Advanced Network Compression Service'             , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'SecureFile Compression (user)'                           , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'SecureFile Deduplication (user)'                         , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'ASO native encryption and checksumming'                  , '^11\.2|^12\.'               , 'INVALID' from dual union all -- no longer part of Advanced Security
SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^11\.2'                     , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^12\.'                      , 'INVALID' from dual union all -- licensing required only by encryption to disk
SELECT 'Advanced Security'                                   , 'Data Redaction'                                          , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Encrypted Tablespaces'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
SELECT 'Advanced Security'                                   , 'SecureFile Encryption (user)'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Transparent Data Encryption'                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Change Management Pack'                              , 'Change Management Pack'                                  , '^11\.2'                     , ' '       from dual union all
SELECT 'Configuration Management Pack for Oracle Database'   , 'EM Config Management Pack'                               , '^11\.2'                     , ' '       from dual union all
SELECT 'Data Masking Pack'                                   , 'Data Masking Pack'                                       , '^11\.2'                     , ' '       from dual union all
SELECT '.Database Gateway'                                   , 'Gateways'                                                , '^12\.'                      , ' '       from dual union all
SELECT '.Database Gateway'                                   , 'Transparent Gateway'                                     , '^12\.'                      , ' '       from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Aggregation'                                   , '^12\.'                      , ' '       from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.0'            , 'BUG'     from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.[^0]|^12\.2'  , ' '       from dual union all
SELECT 'Database Vault'                                      , 'Oracle Database Vault'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Database Vault'                                      , 'Privilege Capture'                                       , '^12\.'                      , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'ADDM'                                                    , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Baseline'                                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Baseline Template'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Report'                                              , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Automatic Workload Repository'                           , '^12\.'                      , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Baseline Adaptive Thresholds'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Baseline Static Computations'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Diagnostic Pack'                                         , '^11\.2'                     , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'EM Performance Page'                                     , '^12\.'                      , ' '       from dual union all
SELECT '.Exadata'                                            , 'Exadata'                                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT '.GoldenGate'                                         , 'GoldenGate'                                              , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.1'                     , 'BUG'     from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.[2-9]'                 , ' '       from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Sun ZFS with EHCC'                                       , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'ZFS Storage'                                             , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
SELECT 'Label Security'                                      , 'Label Security'                                          , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Multitenant'                                         , 'Oracle Multitenant'                                      , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
SELECT 'Multitenant'                                         , 'Oracle Pluggable Databases'                              , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
SELECT 'OLAP'                                                , 'OLAP - Analytic Workspaces'                              , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'OLAP'                                                , 'OLAP - Cubes'                                            , '^12\.'                      , ' '       from dual union all
SELECT 'Partitioning'                                        , 'Partitioning (user)'                                     , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Partitioning'                                        , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
SELECT '.Pillar Storage'                                     , 'Pillar Storage'                                          , '^12\.'                      , ' '       from dual union all
SELECT '.Pillar Storage'                                     , 'Pillar Storage with EHCC'                                , '^12\.'                      , ' '       from dual union all
SELECT '.Provisioning and Patch Automation Pack'             , 'EM Standalone Provisioning and Patch Automation Pack'    , '^11\.2'                     , ' '       from dual union all
SELECT 'Provisioning and Patch Automation Pack for Database' , 'EM Database Provisioning and Patch Automation Pack'      , '^11\.2'                     , ' '       from dual union all
SELECT 'RAC or RAC One Node'                                 , 'Quality of Service Management'                           , '^12\.'                      , ' '       from dual union all
SELECT 'Real Application Clusters'                           , 'Real Application Clusters (RAC)'                         , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Real Application Clusters One Node'                  , 'Real Application Cluster One Node'                       , '^12\.'                      , ' '       from dual union all
SELECT 'Real Application Testing'                            , 'Database Replay: Workload Capture'                       , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT 'Real Application Testing'                            , 'Database Replay: Workload Replay'                        , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT 'Real Application Testing'                            , 'SQL Performance Analyzer'                                , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT '.Secure Backup'                                      , 'Oracle Secure Backup'                                    , '^12\.'                      , 'INVALID' from dual union all  -- does not differentiate usage of Oracle Secure Backup Express, which is free
SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^11\.2'                     , 'INVALID' from dual union all  -- does not differentiate usage of Locator, which is free
SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^12\.'                      , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'Automatic Maintenance - SQL Tuning Advisor'              , '^12\.'                      , 'INVALID' from dual union all  -- system usage in the maintenance window
SELECT 'Tuning Pack'                                         , 'Automatic SQL Tuning Advisor'                            , '^11\.2|^12\.'               , 'INVALID' from dual union all  -- system usage in the maintenance window
SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^11\.2'                     , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^12\.'                      , 'INVALID' from dual union all  -- default
SELECT 'Tuning Pack'                                         , 'SQL Access Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Monitoring and Tuning pages'                         , '^12\.'                      , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Profile'                                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Tuning Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Tuning Set (user)'                                   , '^12\.'                      , 'INVALID' from dual union all -- no longer part of Tuning Pack
SELECT 'Tuning Pack'                                         , 'Tuning Pack'                                             , '^11\.2'                     , ' '       from dual union all
SELECT '.WebLogic Server Management Pack Enterprise Edition' , 'EM AS Provisioning and Patch Automation Pack'            , '^11\.2'                     , ' '       from dual union all
select '' PRODUCT, '' FEATURE, '' MVERSION, '' CONDITION from dual
),
FUS as (
-- the current data set to be used: DBA_FEATURE_USAGE_STATISTICS or CDB_FEATURE_USAGE_STATISTICS for Container Databases(CDBs)
select
    0 as CON_ID,
    NULL as CON_NAME,
    -- Detect and mark with Y the current DBA_FUS data set = Most Recent Sample based on LAST_SAMPLE_DATE
      case when DBID || '#' || VERSION || '#' || to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS') =
                first_value (DBID    )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                first_value (VERSION )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                first_value (to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS'))
                                               over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc)
           then 'Y'
           else 'N'
    end as CURRENT_ENTRY,
    NAME            ,
    LAST_SAMPLE_DATE,
    DBID            ,
    VERSION         ,
    DETECTED_USAGES ,
    TOTAL_SAMPLES   ,
    CURRENTLY_USED  ,
    FIRST_USAGE_DATE,
    LAST_USAGE_DATE ,
    AUX_COUNT       ,
    FEATURE_INFO
from DBA_FEATURE_USAGE_STATISTICS xy
),
PFUS as (
-- Product-Feature Usage Statistics = DBA_FUS entries mapped to their corresponding database products
select
    CON_ID,
    CON_NAME,
    PRODUCT,
    NAME as FEATURE_BEING_USED,
    case  when CONDITION = 'BUG'
               --suppressed due to exceptions/defects
               then '3.SUPPRESSED_DUE_TO_BUG'
          when     detected_usages > 0                 -- some usage detection - current or past
               and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
               and CURRENT_ENTRY  = 'Y'                -- current record set
               and (    trim(CONDITION) is null        -- no extra conditions
                     or CONDITION_MET     = 'TRUE'     -- extra condition is met
                    and CONDITION_COUNTER = 'FALSE' )  -- extra condition is not based on counter
               then '6.CURRENT_USAGE'
          when     detected_usages > 0                 -- some usage detection - current or past
               and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
               and CURRENT_ENTRY  = 'Y'                -- current record set
               and (    CONDITION_MET     = 'TRUE'     -- extra condition is met
                    and CONDITION_COUNTER = 'TRUE'  )  -- extra condition is     based on counter
               then '5.PAST_OR_CURRENT_USAGE'          -- FEATURE_INFO counters indicate current or past usage
          when     detected_usages > 0                 -- some usage detection - current or past
               and (    trim(CONDITION) is null        -- no extra conditions
                     or CONDITION_MET     = 'TRUE'  )  -- extra condition is met
               then '4.PAST_USAGE'
          when CURRENT_ENTRY = 'Y'
               then '2.NO_CURRENT_USAGE'   -- detectable feature shows no current usage
          else '1.NO_PAST_USAGE'
    end as USAGE,
    LAST_SAMPLE_DATE,
    DBID            ,
    VERSION         ,
    DETECTED_USAGES ,
    TOTAL_SAMPLES   ,
    CURRENTLY_USED  ,
    case  when CONDITION like 'C___' and CONDITION_MET = 'FALSE'
               then to_date('')
          else FIRST_USAGE_DATE
    end as FIRST_USAGE_DATE,
    case  when CONDITION like 'C___' and CONDITION_MET = 'FALSE'
               then to_date('')
          else LAST_USAGE_DATE
    end as LAST_USAGE_DATE,
    EXTRA_FEATURE_INFO
from (
select m.PRODUCT, m.CONDITION, m.MVERSION,
       -- if extra conditions (coded on the MAP.CONDITION column) are required, check if entries satisfy the condition
       case
             when CONDITION = 'C001' and (   regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                          or regexp_like(to_char(FEATURE_INFO), 'compression used: *TRUE', 'i')                 )
                  then 'TRUE'  -- compression has been used
             when CONDITION = 'C002' and (   regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                          or regexp_like(to_char(FEATURE_INFO), 'encryption used: *TRUE', 'i')                  )
                  then 'TRUE'  -- encryption has been used
             when CONDITION = 'C003' and CON_ID=1 and AUX_COUNT > 1
                  then 'TRUE'  -- more than one PDB are created
             when CONDITION = 'C004' and 'N'= 'N'
                  then 'TRUE'  -- not in oracle cloud
             else 'FALSE'
       end as CONDITION_MET,
       -- check if the extra conditions are based on FEATURE_INFO counters. They indicate current or past usage.
       case
             when CONDITION = 'C001' and     regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                  then 'TRUE'  -- compression counter > 0
             when CONDITION = 'C002' and     regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                  then 'TRUE'  -- encryption counter > 0
             else 'FALSE'
       end as CONDITION_COUNTER,
       case when CONDITION = 'C001'
                 then   regexp_substr(to_char(FEATURE_INFO), 'compression used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
            when CONDITION = 'C002'
                 then   regexp_substr(to_char(FEATURE_INFO), 'encryption used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
            when CONDITION = 'C003'
                 then   'AUX_COUNT=' || AUX_COUNT
            when CONDITION = 'C004' and 'N'= 'Y'
                 then   'feature included in Oracle Cloud Services Package'
            else ''
       end as EXTRA_FEATURE_INFO,
       f.CON_ID          ,
       f.CON_NAME        ,
       f.CURRENT_ENTRY   ,
       f.NAME            ,
       f.LAST_SAMPLE_DATE,
       f.DBID            ,
       f.VERSION         ,
       f.DETECTED_USAGES ,
       f.TOTAL_SAMPLES   ,
       f.CURRENTLY_USED  ,
       f.FIRST_USAGE_DATE,
       f.LAST_USAGE_DATE ,
       f.AUX_COUNT       ,
       f.FEATURE_INFO
  from MAP m
  join FUS f on m.FEATURE = f.NAME and regexp_like(f.VERSION, m.MVERSION)
  where nvl(f.TOTAL_SAMPLES, 0) > 0                        -- ignore features that have never been sampled
)
  where nvl(CONDITION, '-') != 'INVALID'                   -- ignore features for which licensing is not required without further conditions
    and not (CONDITION = 'C003' and CON_ID not in (0, 1))  -- multiple PDBs are visible only in CDB$ROOT; PDB level view is not relevant
)
select
    grouping_id(CON_ID) as gid,
    CON_ID   ,
    decode(grouping_id(CON_ID), 1, '--ALL--', max(CON_NAME)) as CON_NAME,
    PRODUCT  ,
    decode(max(USAGE),
          '1.NO_PAST_USAGE'        , 'NO_USAGE'             ,
          '2.NO_CURRENT_USAGE'     , 'NO_USAGE'             ,
          '3.SUPPRESSED_DUE_TO_BUG', 'SUPPRESSED_DUE_TO_BUG',
          '4.PAST_USAGE'           , 'PAST_USAGE'           ,
          '5.PAST_OR_CURRENT_USAGE', 'PAST_OR_CURRENT_USAGE',
          '6.CURRENT_USAGE'        , 'CURRENT_USAGE'        ,
          'UNKNOWN') as USAGE,
    max(LAST_SAMPLE_DATE) as LAST_SAMPLE_DATE,
    min(FIRST_USAGE_DATE) as FIRST_USAGE_DATE,
    max(LAST_USAGE_DATE)  as LAST_USAGE_DATE
  from PFUS
  where USAGE in ('2.NO_CURRENT_USAGE', '4.PAST_USAGE', '5.PAST_OR_CURRENT_USAGE', '6.CURRENT_USAGE')   -- ignore '1.NO_PAST_USAGE', '3.SUPPRESSED_DUE_TO_BUG'
  group by rollup(CON_ID), PRODUCT
  having not (max(CON_ID) in (-1, 0) and grouping_id(CON_ID) = 1)            -- aggregation not needed for non-container databases
order by GID desc, CON_ID, decode(substr(PRODUCT, 1, 1), '.', 2, 1), PRODUCT );


CREATE OR REPLACE FORCE VIEW FEATURE_USAGE_DETAILS
AS
select product
     , feature_being_used
     , usage
     , last_sample_date
     , dbid
     , ( select name from v$database ) dbname
     , version
     , detected_usages
     , total_samples
     , currently_used
     , first_usage_date
     , last_usage_date
     , extra_feature_info
------- following sql is based on options_packs_usage_statistics.sql  --> MOS Note 1317265.1
from (
with
MAP as (
-- mapping between features tracked by DBA_FUS and their corresponding database products (options or packs)
select '' PRODUCT, '' feature, '' MVERSION, '' CONDITION from dual union all
SELECT 'Active Data Guard'                                   , 'Active Data Guard - Real-Time Query on Physical Standby' , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Active Data Guard'                                   , 'Global Data Services'                                    , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Analytics'                                  , 'Data Mining'                                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'ADVANCED Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Advanced Index Compression'                              , '^12\.'                      , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Backup HIGH Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup LOW Compression'                                  , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup MEDIUM Compression'                               , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Backup ZLIB Compression'                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Data Guard'                                              , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^11\.2\.0\.[1-3]\.'         , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Flashback Data Archive'                                  , '^(11\.2\.0\.[4-9]\.|12\.)'  , 'INVALID' from dual union all -- licensing required by Optimization for Flashback Data Archive
SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^11\.2|^12\.1'              , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'HeapCompression'                                         , '^12\.[2-9]'                 , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.1'                     , 'BUG'     from dual union all
SELECT 'Advanced Compression'                                , 'Heat Map'                                                , '^12\.[2-9]'                 , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Information Lifecycle Management'                        , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Advanced Network Compression Service'             , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C001'    from dual union all
SELECT 'Advanced Compression'                                , 'SecureFile Compression (user)'                           , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Compression'                                , 'SecureFile Deduplication (user)'                         , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'ASO native encryption and checksumming'                  , '^11\.2|^12\.'               , 'INVALID' from dual union all -- no longer part of Advanced Security
SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^11\.2'                     , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Backup Encryption'                                       , '^12\.'                      , 'INVALID' from dual union all -- licensing required only by encryption to disk
SELECT 'Advanced Security'                                   , 'Data Redaction'                                          , '^12\.'                      , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Encrypted Tablespaces'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Export)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
SELECT 'Advanced Security'                                   , 'Oracle Utility Datapump (Import)'                        , '^11\.2|^12\.'               , 'C002'    from dual union all
SELECT 'Advanced Security'                                   , 'SecureFile Encryption (user)'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Advanced Security'                                   , 'Transparent Data Encryption'                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Change Management Pack'                              , 'Change Management Pack'                                  , '^11\.2'                     , ' '       from dual union all
SELECT 'Configuration Management Pack for Oracle Database'   , 'EM Config Management Pack'                               , '^11\.2'                     , ' '       from dual union all
SELECT 'Data Masking Pack'                                   , 'Data Masking Pack'                                       , '^11\.2'                     , ' '       from dual union all
SELECT '.Database Gateway'                                   , 'Gateways'                                                , '^12\.'                      , ' '       from dual union all
SELECT '.Database Gateway'                                   , 'Transparent Gateway'                                     , '^12\.'                      , ' '       from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Aggregation'                                   , '^12\.'                      , ' '       from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.0'            , 'BUG'     from dual union all
SELECT 'Database In-Memory'                                  , 'In-Memory Column Store'                                  , '^12\.1\.0\.2\.[^0]|^12\.2'  , ' '       from dual union all
SELECT 'Database Vault'                                      , 'Oracle Database Vault'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Database Vault'                                      , 'Privilege Capture'                                       , '^12\.'                      , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'ADDM'                                                    , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Baseline'                                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Baseline Template'                                   , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'AWR Report'                                              , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Automatic Workload Repository'                           , '^12\.'                      , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Baseline Adaptive Thresholds'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Baseline Static Computations'                            , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'Diagnostic Pack'                                         , '^11\.2'                     , ' '       from dual union all
SELECT 'Diagnostics Pack'                                    , 'EM Performance Page'                                     , '^12\.'                      , ' '       from dual union all
SELECT '.Exadata'                                            , 'Exadata'                                                 , '^11\.2|^12\.'               , ' '       from dual union all
SELECT '.GoldenGate'                                         , 'GoldenGate'                                              , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.1'                     , 'BUG'     from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression'                             , '^12\.[2-9]'                 , ' '       from dual union all
SELECT '.HW'                                                 , 'Hybrid Columnar Compression Row Level Locking'           , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Sun ZFS with EHCC'                                       , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'ZFS Storage'                                             , '^12\.'                      , ' '       from dual union all
SELECT '.HW'                                                 , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
SELECT 'Label Security'                                      , 'Label Security'                                          , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Multitenant'                                         , 'Oracle Multitenant'                                      , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
SELECT 'Multitenant'                                         , 'Oracle Pluggable Databases'                              , '^12\.'                      , 'C003'    from dual union all -- licensing required only when more than one PDB containers are created
SELECT 'OLAP'                                                , 'OLAP - Analytic Workspaces'                              , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'OLAP'                                                , 'OLAP - Cubes'                                            , '^12\.'                      , ' '       from dual union all
SELECT 'Partitioning'                                        , 'Partitioning (user)'                                     , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Partitioning'                                        , 'Zone maps'                                               , '^12\.'                      , ' '       from dual union all
SELECT '.Pillar Storage'                                     , 'Pillar Storage'                                          , '^12\.'                      , ' '       from dual union all
SELECT '.Pillar Storage'                                     , 'Pillar Storage with EHCC'                                , '^12\.'                      , ' '       from dual union all
SELECT '.Provisioning and Patch Automation Pack'             , 'EM Standalone Provisioning and Patch Automation Pack'    , '^11\.2'                     , ' '       from dual union all
SELECT 'Provisioning and Patch Automation Pack for Database' , 'EM Database Provisioning and Patch Automation Pack'      , '^11\.2'                     , ' '       from dual union all
SELECT 'RAC or RAC One Node'                                 , 'Quality of Service Management'                           , '^12\.'                      , ' '       from dual union all
SELECT 'Real Application Clusters'                           , 'Real Application Clusters (RAC)'                         , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Real Application Clusters One Node'                  , 'Real Application Cluster One Node'                       , '^12\.'                      , ' '       from dual union all
SELECT 'Real Application Testing'                            , 'Database Replay: Workload Capture'                       , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT 'Real Application Testing'                            , 'Database Replay: Workload Replay'                        , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT 'Real Application Testing'                            , 'SQL Performance Analyzer'                                , '^11\.2|^12\.'               , 'C004'    from dual union all
SELECT '.Secure Backup'                                      , 'Oracle Secure Backup'                                    , '^12\.'                      , 'INVALID' from dual union all  -- does not differentiate usage of Oracle Secure Backup Express, which is free
SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^11\.2'                     , 'INVALID' from dual union all  -- does not differentiate usage of Locator, which is free
SELECT 'Spatial and Graph'                                   , 'Spatial'                                                 , '^12\.'                      , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'Automatic Maintenance - SQL Tuning Advisor'              , '^12\.'                      , 'INVALID' from dual union all  -- system usage in the maintenance window
SELECT 'Tuning Pack'                                         , 'Automatic SQL Tuning Advisor'                            , '^11\.2|^12\.'               , 'INVALID' from dual union all  -- system usage in the maintenance window
SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^11\.2'                     , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'Real-Time SQL Monitoring'                                , '^12\.'                      , 'INVALID' from dual union all  -- default
SELECT 'Tuning Pack'                                         , 'SQL Access Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Monitoring and Tuning pages'                         , '^12\.'                      , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Profile'                                             , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Tuning Advisor'                                      , '^11\.2|^12\.'               , ' '       from dual union all
SELECT 'Tuning Pack'                                         , 'SQL Tuning Set (user)'                                   , '^12\.'                      , 'INVALID' from dual union all -- no longer part of Tuning Pack
SELECT 'Tuning Pack'                                         , 'Tuning Pack'                                             , '^11\.2'                     , ' '       from dual union all
SELECT '.WebLogic Server Management Pack Enterprise Edition' , 'EM AS Provisioning and Patch Automation Pack'            , '^11\.2'                     , ' '       from dual union all
select '' PRODUCT, '' FEATURE, '' MVERSION, '' CONDITION from dual
),
FUS as (
-- the current data set to be used: DBA_FEATURE_USAGE_STATISTICS or CDB_FEATURE_USAGE_STATISTICS for Container Databases(CDBs)
select
    0 as CON_ID,
    NULL as CON_NAME,
    -- Detect and mark with Y the current DBA_FUS data set = Most Recent Sample based on LAST_SAMPLE_DATE
      case when DBID || '#' || VERSION || '#' || to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS') =
                first_value (DBID    )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                first_value (VERSION )         over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc) || '#' ||
                first_value (to_char(LAST_SAMPLE_DATE, 'YYYYMMDDHH24MISS'))
                                               over (partition by 0 order by LAST_SAMPLE_DATE desc nulls last, DBID desc)
           then 'Y'
           else 'N'
    end as CURRENT_ENTRY,
    NAME            ,
    LAST_SAMPLE_DATE,
    DBID            ,
    VERSION         ,
    DETECTED_USAGES ,
    TOTAL_SAMPLES   ,
    CURRENTLY_USED  ,
    FIRST_USAGE_DATE,
    LAST_USAGE_DATE ,
    AUX_COUNT       ,
    FEATURE_INFO
from DBA_FEATURE_USAGE_STATISTICS xy
),
PFUS as (
-- Product-Feature Usage Statistics = DBA_FUS entries mapped to their corresponding database products
select
    CON_ID,
    CON_NAME,
    PRODUCT,
    NAME as FEATURE_BEING_USED,
    case  when CONDITION = 'BUG'
               --suppressed due to exceptions/defects
               then '3.SUPPRESSED_DUE_TO_BUG'
          when     detected_usages > 0                 -- some usage detection - current or past
               and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
               and CURRENT_ENTRY  = 'Y'                -- current record set
               and (    trim(CONDITION) is null        -- no extra conditions
                     or CONDITION_MET     = 'TRUE'     -- extra condition is met
                    and CONDITION_COUNTER = 'FALSE' )  -- extra condition is not based on counter
               then '6.CURRENT_USAGE'
          when     detected_usages > 0                 -- some usage detection - current or past
               and CURRENTLY_USED = 'TRUE'             -- usage at LAST_SAMPLE_DATE
               and CURRENT_ENTRY  = 'Y'                -- current record set
               and (    CONDITION_MET     = 'TRUE'     -- extra condition is met
                    and CONDITION_COUNTER = 'TRUE'  )  -- extra condition is     based on counter
               then '5.PAST_OR_CURRENT_USAGE'          -- FEATURE_INFO counters indicate current or past usage
          when     detected_usages > 0                 -- some usage detection - current or past
               and (    trim(CONDITION) is null        -- no extra conditions
                     or CONDITION_MET     = 'TRUE'  )  -- extra condition is met
               then '4.PAST_USAGE'
          when CURRENT_ENTRY = 'Y'
               then '2.NO_CURRENT_USAGE'   -- detectable feature shows no current usage
          else '1.NO_PAST_USAGE'
    end as USAGE,
    LAST_SAMPLE_DATE,
    DBID            ,
    VERSION         ,
    DETECTED_USAGES ,
    TOTAL_SAMPLES   ,
    CURRENTLY_USED  ,
    FIRST_USAGE_DATE,
    LAST_USAGE_DATE,
    EXTRA_FEATURE_INFO
from (
select m.PRODUCT, m.CONDITION, m.MVERSION,
       -- if extra conditions (coded on the MAP.CONDITION column) are required, check if entries satisfy the condition
       case
             when CONDITION = 'C001' and (   regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                          or regexp_like(to_char(FEATURE_INFO), 'compression used: *TRUE', 'i')                 )
                  then 'TRUE'  -- compression has been used
             when CONDITION = 'C002' and (   regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                                          or regexp_like(to_char(FEATURE_INFO), 'encryption used: *TRUE', 'i')                  )
                  then 'TRUE'  -- encryption has been used
             when CONDITION = 'C003' and CON_ID=1 and AUX_COUNT > 1
                  then 'TRUE'  -- more than one PDB are created
             when CONDITION = 'C004' and 'N'= 'N'
                  then 'TRUE'  -- not in oracle cloud
             else 'FALSE'
       end as CONDITION_MET,
       -- check if the extra conditions are based on FEATURE_INFO counters. They indicate current or past usage.
       case
             when CONDITION = 'C001' and     regexp_like(to_char(FEATURE_INFO), 'compression used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                  then 'TRUE'  -- compression counter > 0
             when CONDITION = 'C002' and     regexp_like(to_char(FEATURE_INFO), 'encryption used:[ 0-9]*[1-9][ 0-9]*time', 'i')
                  then 'TRUE'  -- encryption counter > 0
             else 'FALSE'
       end as CONDITION_COUNTER,
       case when CONDITION = 'C001'
                 then   regexp_substr(to_char(FEATURE_INFO), 'compression used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
            when CONDITION = 'C002'
                 then   regexp_substr(to_char(FEATURE_INFO), 'encryption used:(.*?)(times|TRUE|FALSE)', 1, 1, 'i')
            when CONDITION = 'C003'
                 then   'AUX_COUNT=' || AUX_COUNT
            when CONDITION = 'C004' and 'N'= 'Y'
                 then   'feature included in Oracle Cloud Services Package'
            else ''
       end as EXTRA_FEATURE_INFO,
       f.CON_ID          ,
       f.CON_NAME        ,
       f.CURRENT_ENTRY   ,
       f.NAME            ,
       f.LAST_SAMPLE_DATE,
       f.DBID            ,
       f.VERSION         ,
       f.DETECTED_USAGES ,
       f.TOTAL_SAMPLES   ,
       f.CURRENTLY_USED  ,
       f.FIRST_USAGE_DATE,
       f.LAST_USAGE_DATE ,
       f.AUX_COUNT       ,
       f.FEATURE_INFO
  from MAP m
  join FUS f on m.FEATURE = f.NAME and regexp_like(f.VERSION, m.MVERSION)
  where nvl(f.TOTAL_SAMPLES, 0) > 0                        -- ignore features that have never been sampled
)
  where nvl(CONDITION, '-') != 'INVALID'                   -- ignore features for which licensing is not required without further conditions
    and not (CONDITION = 'C003' and CON_ID not in (0, 1))  -- multiple PDBs are visible only in CDB$ROOT; PDB level view is not relevant
)
select
    CON_ID            ,
    CON_NAME          ,
    PRODUCT           ,
    FEATURE_BEING_USED,
    decode(USAGE,
          '1.NO_PAST_USAGE'        , 'NO_PAST_USAGE'        ,
          '2.NO_CURRENT_USAGE'     , 'NO_CURRENT_USAGE'     ,
          '3.SUPPRESSED_DUE_TO_BUG', 'SUPPRESSED_DUE_TO_BUG',
          '4.PAST_USAGE'           , 'PAST_USAGE'           ,
          '5.PAST_OR_CURRENT_USAGE', 'PAST_OR_CURRENT_USAGE',
          '6.CURRENT_USAGE'        , 'CURRENT_USAGE'        ,
          'UNKNOWN') as USAGE,
    LAST_SAMPLE_DATE  ,
    DBID              ,
    VERSION           ,
    DETECTED_USAGES   ,
    TOTAL_SAMPLES     ,
    CURRENTLY_USED    ,
    FIRST_USAGE_DATE  ,
    LAST_USAGE_DATE   ,
    EXTRA_FEATURE_INFO
  from PFUS
  where USAGE in ('2.NO_CURRENT_USAGE', '3.SUPPRESSED_DUE_TO_BUG', '4.PAST_USAGE', '5.PAST_OR_CURRENT_USAGE', '6.CURRENT_USAGE')  -- ignore '1.NO_PAST_USAGE'
order by CON_ID, decode(substr(PRODUCT, 1, 1), '.', 2, 1), PRODUCT, FEATURE_BEING_USED, LAST_SAMPLE_DATE desc, PFUS.USAGE );

The post Better track the Usage of Database Options and Management Packs, or it will cost you appeared first on AMIS Oracle and Java Blog.

R: Utilizing multiple CPUs


R is a great piece of software for performing statistical analyses. Computing power can, however, be a limitation. By default R uses only a single CPU, while almost every machine has multiple CPUs available, so why not utilize them? In this blog post I'll give a minimal example and some code snippets to help make more complex examples work.

Utilizing multiple CPUs

Luckily, using multiple CPUs in R is relatively simple. There is a deprecated multicore library available which you shouldn't use; the newer parallel library is recommended instead. That library does provide mclapply, but this function only works on Linux systems, so we're not going to use it. The examples below work on both Windows and Linux and do not use deprecated libraries.

A very simple example

library(parallel)

no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
arr <- c("business","done","differently")

#Work on the future together
result <- parLapply(cl, arr, function(x) toupper(x))

#Conclusion: BUSINESS DONE DIFFERENTLY
paste (c('Conclusion:',result),collapse = ' ')

stopCluster(cl)

This is a minimal example of how you can use clustering in R. The code spawns multiple worker processes and processes the entries from the array c("business","done","differently") in those separate processes. Processing in this case is just putting them in uppercase. When it is done, the results from the different processes are combined into Conclusion: BUSINESS DONE DIFFERENTLY.

If you remove the stopCluster command, you can see that multiple R processes remain open on my Windows machine:

After having called the stopCluster command, the number of processes is much reduced:

You can imagine that for an operation as simple as putting strings in uppercase, you might as well use the regular apply functions, which saves you the overhead of spawning processes. If, however, you have more complex operations like the example below, you will benefit greatly from being able to utilize more computing power!

A more elaborate example

You can download the code of this example from: https://github.com/MaartenSmeets/R/blob/master/htmlcrawling.R

The sample does not work anymore, however, since it parses Yahoo pages whose structure has recently changed. It does still illustrate how to do the parallel processing.

Because separate R processes are running, you need to make libraries and functions available to those processes. For example, you can make libraries available like this:

#make libraries available in other nodes
clusterEvalQ(cl, {
  library(XML)
  library(RCurl)
  library(parallel)
  }
)

And you can make functions available like this:

clusterExport(cl, "htmlParseFunc")

Considerations

There are several considerations (and probably more than mentioned below) when using this way of clustering:

  • Work packages are distributed equally over the CPUs. If, however, the work packages differ greatly in the amount of work they require, you can encounter situations where parLapply is waiting for one process to complete while the other processes are already done. Try to use work packages of roughly equal size to avoid this.
  • If a process runs too long, it will time out. You can set the timeout when creating the cluster, for example: cl <- makeCluster(no_cores, timeout=50)
  • Every process takes memory. If you process large variables in parallel, you might encounter memory limitations.
  • Debugging the different processes can be difficult. I will not go into detail here.
  • GPUs can also be utilized to do calculations. See for example: https://www.r-bloggers.com/r-gpu-programming-for-all-with-gpur/. I have not tried this but the performance graphs online indicate a much better performance can be achieved than when using CPUs.

The post R: Utilizing multiple CPUs appeared first on AMIS Oracle and Java Blog.

Smooth, easy, lightweight – Node.js and Express style REST API with Java SE


It is easy to be seduced by some of the attractive qualities of Node (aka Node.js) – the JavaScript technology that makes server side development fun again. Developing lightweight applications that handle HTTP requests in a rapid, straightforward way with little overhead and no bloated infrastructure is easy as pie – and feels a long way from traditional Java development. I like Node. I feel the attraction. I have used Node for simple and more complex applications. It's cool.

I have realized that what is so nice about Node is also largely available with Java. Of course, there are many ways of doing Java development that are not lightweight, rapid or low overhead at all – just as I am sure we can find ways to spoil Node development. More importantly, there are ways to make Java development comparably breezy to Node development. In this article I take a brief look at the development of a REST API using nothing but the [Oracle] Java Runtime and Maven as the package manager (Java's equivalent to Node's npm). Using the Java 8 JDK and Maven I am able to program and run a REST API from my command line, running locally on my laptop, in under two dozen lines of code – in a way that is very similar to what I would do with Node and the Express library. The steps described below can be executed in less than 15 minutes – similar to what Node-based development of this type of REST API foundation would require.

The source code accompanying this article is in GitHub: https://github.com/lucasjellema/java-express-style-rest-api – but it is not a lot of code at all.

image

The final result of this article is simple: a REST API running locally that handles simple GET and POST requests. The actual logic of the API still has to be implemented (and some JSON processing may have to be added, which, granted, is more complex in Java than in Node) – but that is fairly straightforward to do.

Here is a screenshot of Postman where the REST API is invoked:

image

and here is the command line for the running REST API:

image

The application is started with a single command line (compare to npm start) and listens on port 8765 on localhost to process incoming requests.

The steps for implementing this REST API and running it locally are described below.

Implementation of REST API

Again, the only two prerequisites for these steps are a locally installed Oracle JDK 8 (http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and a Maven 3 environment (https://maven.apache.org/download.cgi).

1. Create scaffold for new application using Maven (compare npm init)

mvn -B archetype:generate -DarchetypeGroupId=org.apache.maven.archetypes -DgroupId=nl.amis.rest -DartifactId=my-rest

image
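
In case the screenshot is not visible: the quickstart archetype generates a small project skeleton that should look roughly like this (the generated test class can be ignored or removed for this example):

my-rest/
  pom.xml
  src/main/java/nl/amis/rest/App.java
  src/test/java/nl/amis/rest/AppTest.java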

2. Edit Maven’s pom.xml to add dependencies for Jersey and Jersey Container (compare package.json and npm install --save)

image

Note: also add a build section in pom.xml that explicitly sets Java 1.8 as the source and target version (to ensure lambda expressions are supported)
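
Since the pom.xml content was only shown as a screenshot, here is a minimal sketch of what the dependency and build sections might look like – the Jersey version (2.25.1) and the compiler plugin version are assumptions of mine, not taken from the original article:

<dependencies>
  <dependency>
    <groupId>org.glassfish.jersey.containers</groupId>
    <artifactId>jersey-container-jdk-http</artifactId>
    <version>2.25.1</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- compile for Java 1.8 so the lambda in App.java is accepted -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.8.1</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>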

 

3. Retrieve required libraries (jar files) using Maven (compare npm install)

mvn install dependency:copy-dependencies

This will install all required JARs into the directory target\dependency – compare to node_modules in a Node application.

image

4. Edit Java class App to create the simplest and most straightforward HTTP request serving application conceivable – with imports for the required dependencies (compare the require statements in a Node application)

package nl.amis.rest;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

public class App {
    private final static int port = 8765;

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/app", (HttpExchange t) -&gt; {
            byte[] response = "Hello World from HttpServer".getBytes();
            t.sendResponseHeaders(200, response.length);
            OutputStream os = t.getResponseBody();
            os.write(response);
            os.close();
        });
        server.setExecutor(null); // creates a default executor
        server.start();
        System.out.println("HTTP Server is running and listening at " + server.getAddress() + "/app");
    }
}

 

and invoke it:

image
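
If the screenshot above does not render: invoking the plain HttpServer version with curl would look something like this (the curl call is my own illustration, but the response body comes straight from the code above):

curl http://localhost:8765/app
Hello World from HttpServer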

5. Class App2.java builds on App to add the REST API capabilities – using class Api as the REST resource (annotated with @Path("api")) with resource methods to handle GET and POST requests

package nl.amis.rest;

import java.io.IOException;

import com.sun.net.httpserver.HttpServer;

import java.net.URI;

import javax.ws.rs.core.UriBuilder;

import org.glassfish.jersey.jdkhttp.JdkHttpServerFactory;
import org.glassfish.jersey.server.ResourceConfig;

public class App2 {
    private final static int port = 8765;
    private final static String host = "http://localhost/app";

    public static void main(String[] args) throws IOException {
        URI baseUri = UriBuilder.fromUri(host).port(port).build();
        ResourceConfig config = new ResourceConfig(Api.class);
        HttpServer server = JdkHttpServerFactory.createHttpServer(baseUri, config);
        System.out.println("HTTP Server is running and listening at "+baseUri+"/api" );
    }
}

and

package nl.amis.rest;

import javax.ws.rs.Consumes;
import javax.ws.rs.GET;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.Request;

@Path("api")
public class Api {

    @POST
    @Consumes("application/json")
    @Produces("text/plain")
    public String postApiMessage(@Context Request request, String json) {
        System.out.println("received event:" + json);
        return "post message received " + json;
    }

    @GET
    @Produces("text/plain")
    public String getApiMessage(@Context Request request) {
        return "nothing to report from getApiMessage.";
    }

}
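
Once the application is running (step 7 below), the resource can be exercised with any HTTP client. A quick sketch with curl – the JSON payload is just an arbitrary example of mine – could look like this, with the responses coming straight from the resource methods above:

curl http://localhost:8765/app/api
nothing to report from getApiMessage.

curl -X POST -H "Content-Type: application/json" -d '{"animal":"dog"}' http://localhost:8765/app/api
post message received {"animal":"dog"}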

6. Build the application using Maven (this step does not really exist for Node applications; there, programming errors come out at run time)

mvn package

This creates a JAR file – my-rest-1.0-SNAPSHOT.jar, 6 KB – that can be shipped, deployed to the cloud or simply executed (as in the next step)

 

7. Run the application, which starts the REST API at http://localhost:8765

java -cp target/my-rest-1.0-SNAPSHOT.jar;target/dependency/* nl.amis.rest.App

or

java -cp target/my-rest-1.0-SNAPSHOT.jar;target/dependency/* nl.amis.rest.App2
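
Note that the ';' classpath separator in the commands above is Windows syntax. On Linux or macOS the separator is ':' and the wildcard should be quoted so the shell does not expand it, for example:

java -cp "target/my-rest-1.0-SNAPSHOT.jar:target/dependency/*" nl.amis.rest.App2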

 

Resources

Get URL parameters using JDK HTTP server  http://www.rgagnon.com/javadetails/java-get-url-parameters-using-jdk-http-server.html

Example of reading headers and of downloading (PDF) file through HTTP Server: http://www.rgagnon.com/javadetails/java-have-a-simple-http-server.html

The post Smooth, easy, lightweight – Node.js and Express style REST API with Java SE appeared first on AMIS Oracle and Java Blog.

Golden Gate 12c and DIY Sequence Replication with PL/SQL


Recently, while migrating AIX 11gR2 Databases to Oracle Linux 12cR1 on an ODA X5-2, our setup of Sequence Replication by Oracle Golden Gate appeared to be faulty. The target side sequences were not automatically incremented.

The problem came to light during the migration of the acceptance databases, and under some time pressure we decided to generate drop + create statements (with start with = DBA_SEQUENCES.LAST_NUMBER + DBA_SEQUENCES.INCREMENT_BY) for all sequences in the Source, and to run these statements on the Target. Although this eventually gave the desired result, there were 2 side effects:

  • With a total of 1270 sequences, the operation as a whole took more than an hour.
  • Packages and triggers referencing these sequences became invalid.

Further research revealed that the Golden Gate Sequence Replication in Production suffered from the same problem, and I wondered whether I could find a better solution now that I had a bit more time at hand. Well, I discovered that to set any desired sequence “currval” value, a one-time temporary adjustment of the increment and a subsequent call to the sequence “nextval” pseudocolumn is sufficient. What follows is the output of a quick test; check out what happens with “USER_SEQUENCES.LAST_NUMBER”, and what it really means in combination with the cache.

Create a test sequence

CREATE SEQUENCE TEST_SEQ_01
START WITH 10
INCREMENT BY 1000
MINVALUE 10
CACHE 20
NOCYCLE
NOORDER;

-- the sequence returns no current value yet
SELECT TEST_SEQ_01.CURRVAL from dual;
  ORA-08002: sequence TEST_SEQ_01.CURRVAL is not yet defined in this session.

-- check out last_number... it equals nextval because the cache doesn't exist yet
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE	LAST_NUMBER
  10	      1000	       20	        10

-- generate the first number and create the cache
SELECT TEST_SEQ_01.NEXTVAL from dual;
  NEXTVAL
  10

-- last_number is updated as the highest possible number of the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        20010

-- and now a current value is returned
SELECT TEST_SEQ_01.CURRVAL from dual;
  CURRVAL
  10

Set the current sequence value = 20000 without recreating the sequence

-- adjust the increment
ALTER SEQUENCE TEST_SEQ_01 INCREMENT BY 19990;

-- last_number equals the sequence next value
-- the last "alter sequence" command must have flushed the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      19990	       20	        20000

-- generate the next value and create a new cache
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  20000

-- last_number is updated as the highest possible number of the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      19990	       20	        419800

-- the sequence has the desired current value
SELECT TEST_SEQ_01.CURRVAL from dual
  CURRVAL
  20000

Reset the increment

-- set the increment_by value back to original
ALTER SEQUENCE TEST_SEQ_01 INCREMENT BY 1000;

-- again, the cache is flushed and last_number equals the next value
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        21000

-- generate the next value and create a new cache
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  21000

-- last_number is updated as the highest possible number of the cache
SELECT MIN_VALUE, INCREMENT_BY, CACHE_SIZE, LAST_NUMBER
FROM user_sequences
WHERE sequence_name = 'TEST_SEQ_01';
  MIN_VALUE	INCREMENT_BY CACHE_SIZE LAST_NUMBER
  10	      1000	       20	        41000

-- the increment is back to 1000
SELECT TEST_SEQ_01.NEXTVAL from dual
  NEXTVAL
  22000

This test shows that “USER_SEQUENCES.LAST_NUMBER”:

  • Is identical with sequence “nextval” directly after a “create sequence” or “alter sequence” command, because the cache is not there yet after first definition or gets flushed with an alter.
  • Is updated and saved to disk as the highest possible cache number after a call to “nextval”.
  • Serves as safeguard ( i.e. after a crash ) to ensure that sequence numbers do not conflict with numbers previously issued.

  • I decided to use “DBA_SEQUENCES.LAST_NUMBER” instead of the “currval” pseudocolumn to compare sequences in Source and Target. The reason is that “currval” is only ( and by definition ) the value returned by my session’s last call to “nextval”. If my session has not called “nextval” yet, “currval” is undefined. So I would have to call “nextval” on 1270 sequences in Source and also in Target before I could even start the comparison, while the last_numbers are already there to compare with. Also, this activity is unwanted during the short inactive-Source and inactive-Target migration stage and would take too much time. Last but not least, an exact match of sequence “currval” values is not really necessary… a guarantee of higher sequence “currval” values in Target compared to those in Source is quite enough.

    The next short piece of code is what I eventually came up with and used in the Production migration. It took less than 3 minutes of processing time, did not render any Oracle object invalid, and contributed greatly to keeping the migration downtime very limited.

    -- Code assumes:
    --   1. "nocycle" sequences with positive "increment_by" values
    --   2. identical number of sequences and sequence DDL in Source and Target Database
    -- Grant 'alter any sequence' and 'select any sequence' to the owner
    -- Replace the database link and schema names with your own
    -- Run the code from Target
    declare
      v_ret PLS_INTEGER := 0;
      v_dummy VARCHAR2(100);
      v_ln number := 0;
      v_ib number := 0;
      v_cz number := 0;
      v_incr number := 0;
    begin
      for i in ( select sequence_owner  so
                      , sequence_name   sn
                      , last_number     ln
                      , increment_by    ib
                      , cache_size      cz
                 from dba_sequences@<DBLINK_FROM_SOURCE2TARGET>
                 where sequence_owner in ('<SCHEMA01>','<SCHEMA02>','<SCHEMA03>','<SCHEMA04>') )
      loop
          select last_number
               , increment_by
               , cache_size
            into v_ln
               , v_ib
               , v_cz
          from dba_sequences
          where sequence_owner = i.so
            and sequence_name = i.sn;
    
    -- set the difference in last_numbers as increment if target.last_number < source.last_number
          if v_ln < i.ln then
            v_incr := i.ln - v_ln;
    -- set the cache as increment if last_numbers match
          elsif v_ln = i.ln then
            v_incr := v_ib * v_cz;
          end if;
    
          if v_ln <= i.ln then
            execute immediate 'alter sequence '||i.so||'.'||i.sn||' increment by '||v_incr;
            execute immediate 'select '||i.so||'.'||i.sn||'.nextval from dual' into v_dummy;
            execute immediate 'alter sequence '||i.so||'.'||i.sn||' increment by '||v_ib;
            v_ret := v_ret +1;
          end if;
      end loop;
      dbms_output.put_line('Nr. sequences adjusted: '||v_ret);
    end;
    /
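
    Before and after running the block, a quick sanity check – a sketch that reuses the same database link and schema placeholders – lists any Target sequence whose last_number is still below its Source counterpart:

    select t.sequence_owner
         , t.sequence_name
         , s.last_number as source_last_number
         , t.last_number as target_last_number
    from dba_sequences t
    join dba_sequences@<DBLINK_FROM_SOURCE2TARGET> s
      on  s.sequence_owner = t.sequence_owner
      and s.sequence_name  = t.sequence_name
    where t.sequence_owner in ('<SCHEMA01>','<SCHEMA02>','<SCHEMA03>','<SCHEMA04>')
      and t.last_number < s.last_number;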
    

    The post Golden Gate 12c and DIY Sequence Replication with PL/SQL appeared first on AMIS Oracle and Java Blog.
