
Run Automated Step-by-Step Browser Scenarios for Instruction or Demo from NodeJS applications with Playwright


TL;DR: Run demonstrations or instructions as scripted browser actions. Allow the user to pause and skip acts, and to reset and switch scenarios. Allow the user to interact with the browser before, after and during the scenarios. The open source Playwright library is used from a custom NodeJS application in which the scenarios are defined – using a fair amount of CSS selectors and DOM manipulation.

The demonstration in this article shows three scenarios (The Netherlands, France, UK). Each country is introduced – using specific pages and sections on Wikipedia as well as through supporting sites. A callout is used to explain the scenario and each act. Balloon texts are used to further guide the user.

image

This screenshot shows the beginning of the scenario Tour of The Netherlands, act one. The user presses Play, the callout is shown, the balloon text is shown for the query field and Playwright starts typing the characters of “The Netherlands” in the search field.

The animation below shows the full scenario for The Netherlands – from starting the first act from the Play button. The scenario has four acts – opening, history, sports and culture. Each act is started from the play button. Acts can be skipped, the scenario can be reset (to the beginning) and other scenarios can be selected. The only user interaction here is pressing the play button to trigger each act. However: the user could be browsing Wikipedia between the acts – the browser session is freely available to the user.

Note: this demo was created with vanilla Wikipedia. The unchanged site was loaded in the Chromium browser instantiated through Playwright. All further manipulation was applied from the NodeJS application.

An act can use Playwright from NodeJS for manipulation of the website – and it also has full access to the browser DOM and the JavaScript context. An act can for example: fill in fields, highlight text, press buttons, scroll the page, open links, make selections, hover over elements, switch between tabs.

All sources for this example are available on GitHub: https://github.com/lucasjellema/playwright-scenarios/tree/main/step-by-step. Note: this article is intended as inspiration, to show you what is possible with Playwright. It is certainly not a ready-to-use solution or a great example of professional clean coding. Please interpret it in the spirit in which it was intended. And please share your ideas. What do you think of what I have described? Does it make sense? Can you see a way of applying it yourself? Do you have suggestions for me? Please let me know!

Introduction

In the last few weeks, I have done many interesting things with Playwright. The option of programmatically interacting with a browser running any web application or web site is powerful. It opens up many opportunities for automated browser actions, both headless (for testing, RPA, integration, screen scraping, health checks, automated reporting) and headful (prototyping, instructions, demonstrations, deeplink shortcuts, customized SaaS).

I have described some of the things I have achieved using Playwright in several earlier articles. These include: injecting a shortcut key handler into any web application running in the Playwright embedded browser (for example for taking screenshots or downloading all images), adding a toolbar into any web page, creating an API for working with WhatsApp on top of the WhatsApp Web UI, creating a translation REST API on top of Google Translate, creating Deepmark Booklinks that navigate into a fully initialized context in a SaaS web application, and retrieving JSON reports for movies from IMDb.

The next challenge I had identified – and ended up discussing in this article – is this: using Playwright, I want to create a demonstration of a web application or web site. Playwright commands are used to perform the actions in the web page. What is special is that these actions are grouped in steps (aka acts). Together, these steps form a scenario. Using the toolbar, the user can play an act, pause execution, skip an act or reset (return execution to the first act). In between the execution of the automated steps, the user can interact freely with the web page. The demo steps (aka acts) do not have to be executed, and they do not need to be the only thing that is done. Each step can have a title and a description that can be shown in a callout. The figure shows a callout that describes the current act and scene.

image

Additionally, associated with the steps in an act (aka scenes) can be a text balloon or an arrow with text – positioned on the page near the element that is manipulated in the step. An example is shown in the figure.

image

In this way, demo scripts or tutorials can be created that take a user by the hand in a live web environment. The user can choose to let the prepared scenario play out or to intervene, contribute or even take over (in part).

I have created multiple scenarios for Wikipedia (Netherlands, France, UK) and the user can choose a scenario to execute. At any point, the user can decide to run a different scenario.

Implementation

At the heart of the implementation is a very simple piece of code that leverages the Playwright library to start a browser, create a browser context and open a web page.

image

Step 1 invokes a function to add the toolbar to every page in the browser context, every time the page has navigated or reloaded and the DOM has been rebuilt. This toolbar has controls to run a scenario (as well as to switch between scenarios, pause execution and reset a scenario). The toolbar is passed the director function; this function handles toolbar events regarding scenario execution.

Step 2 is for injecting the callout object into the current page. This is done through direct manipulation of the DOM in function injectCalloutIntoPage. The title and description of the initial scenario are passed to display in the callout.
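The code in that screenshot is simple; here is a hedged sketch of what it boils down to (the addToolbarToPage helper name and the scenarioStatus properties are assumptions of mine; injectCalloutIntoPage, director and scenarioStatus are the names used in this article):

  const { chromium } = require('playwright');

  (async () => {
    // start a headful browser, create a context and open the site to demonstrate
    const browser = await chromium.launch({ headless: false });
    const context = await browser.newContext();

    // make the NodeJS director() function available in every page as window.directorFunction
    await context.exposeBinding('directorFunction', director);

    const page = await context.newPage();
    // step 1: (re)add the toolbar every time a page has navigated or reloaded
    page.on('load', () => addToolbarToPage(page));
    await page.goto('https://en.wikipedia.org');

    // step 2: inject the callout with title and description of the initial scenario
    const scenario = scenarioStatus.scenarios[scenarioStatus.currentScenario];
    await injectCalloutIntoPage(page, scenario.title, scenario.description);
  })();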

The scenarioStatus object contains all scenarios and keeps track of which scenario is the current one and how far along in that scenario the user has come. The pause state is also recorded in this object.

The scenarios are defined with a title and description and nested steps or acts. Each act also has a title and a description as well as an action. The action is a function that is invoked when the act is executed. This function manipulates the callout, bubble or balloon help and the browser UI controls, DOM elements and JavaScript context. Here is a small example of the NL scenario:

  1. define title and description for the scenario
  2. define the array with scenes (aka acts) in the scenario; each act has a title and description – these are displayed in the callout
  3. each act has an action; this action is a server-side JavaScript function (NodeJS context) that receives the current (Playwright) page object as input; frequently this function will evaluate selector expressions and JavaScript statements in the context of the web page inside the browser. The first action in this example writes a balloon help text, types the string “The Netherlands” into the search field and presses the search button. When the page has reloaded – with the details for The Netherlands – a fresh balloon help text is displayed. Note: the calls to waitForUnpause() are made to verify whether the user has paused execution; these calls block when that is the case, until the pause is ended
  4. the second action is an example of calling scrollToElement – a custom NodeJS function – to scroll the page to the element handle republic that was retrieved using the function page.$() and a selector for a link element with a specific title attribute value. This scrolling is performed in the browser, and in a smooth way, so the user sees the scrolling happen.

image
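Since the definition is only shown as a screenshot, here is a hedged reconstruction of such a scenario object (the Wikipedia selectors, the showBalloon helper and the exact texts are assumptions; waitForUnpause, scrollToElement and page.$() are the functions described above):

  const scenarioNL = {
    title: 'Tour of The Netherlands',                           // 1. scenario title and description
    description: 'Opening, history, sports and culture',
    acts: [                                                      // 2. the scenes (aka acts)
      { title: 'Opening',
        description: 'Search Wikipedia for The Netherlands',
        action: async (page) => {                                // 3. NodeJS function, receives the page
          await showBalloon(page, 'The search string is typed here');
          await page.type('#searchInput', 'The Netherlands', { delay: 150 });
          await waitForUnpause();                                // blocks while the user has paused
          await page.click('#searchButton');
          await page.waitForLoadState('domcontentloaded');
          await showBalloon(page, 'This is the article on The Netherlands');
        }
      },
      { title: 'History',
        description: 'The Dutch Republic',
        action: async (page) => {                                // 4. scroll smoothly to an element handle
          const republic = await page.$('a[title="Dutch Republic"]');
          await scrollToElement(page, republic);
        }
      }
      // ... acts for sports and culture
    ]
  };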

The function scrollToElement is simple enough:

image

This function uses the Playwright function page.evaluate() to pass the DOM element for a specific element handle into a JavaScript function that is executed in the context of the browser. The function leverages the DOM Element function scrollIntoView() to do the hard work.
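Based on that description, the function is probably no more than this (a sketch, not the literal source):

  async function scrollToElement(page, elementHandle) {
    // the element handle is passed into the browser context and arrives as a DOM element
    await page.evaluate((element) => {
      element.scrollIntoView({ behavior: 'smooth', block: 'center' });
    }, elementHandle);
  }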

Another example of what the scenarios can do is highlight text. This too turns out to be fairly simple. A link is located with a specific text content (regarding Max Verstappen, hence the element name maxText). Subsequently, a browser side snippet is executed that takes the <P> parent of this link, wraps its entire innerHTML in <mark> tags and scrolls this <P> element into view.

image

It is a little bit crude; working with Range and Selection objects would be the more precise approach. However, it does the trick for me in this example.
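A hedged sketch of that crude-but-effective approach (the text selector is an assumption):

  // locate the link about Max Verstappen through a Playwright text selector
  const maxText = await page.$('a:has-text("Max Verstappen")');

  // browser side: wrap the parent <p> contents in <mark> tags and scroll it into view
  await page.evaluate((link) => {
    const paragraph = link.closest('p');
    paragraph.innerHTML = '<mark>' + paragraph.innerHTML + '</mark>';
    paragraph.scrollIntoView({ behavior: 'smooth', block: 'center' });
  }, maxText);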

image

The director() function is at the heart of things: it handles the toolbar events regarding the scenarios (play, reset, skip, pause, switch). These events are captured in the browser context and passed to the NodeJS context – from the onclick event handlers on the toolbar links. This statement creates the bridge from the browser back to NodeJS:

image

The call to context.exposeBinding ensures that the NodeJS function director will be available anywhere in the browser on the window object as directorFunction. This function is invoked from the onclick handlers in the toolbar.
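In code, the bridge amounts to a single call; the instruction strings used in the toolbar links below are assumptions:

  // NodeJS side: expose the director() function on window as directorFunction
  await context.exposeBinding('directorFunction', director);

  // browser side, inside the injected toolbar HTML:
  //   <a href="#" onclick="window.directorFunction('play')">Play</a>
  //   <a href="#" onclick="window.directorFunction('skip')">Skip</a>
  //   <a href="#" onclick="window.directorFunction('pause')">Pause</a>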

image

And now for the director() function itself. It receives the source object – which contains the page and browser context – from Playwright and the instruction input parameter from the onclick handler, to indicate which action was triggered (next aka play, skip, reset, pause, switch).

image

Depending on the value of instruction, the function manipulates the scenarioStatus object that keeps track of the current scenario and its status (next act, paused or not). For skip, for example, the next act is simply incremented. Pause means either pause or unpause (it is used as a toggle) and does nothing but flip a flag in scenarioStatus. Perhaps I should add a visual indication as well. Reset means resetting the next act to 0, the beginning of the scenario. Switch is interpreted as: select the next scenario. The call to populateCallOut() is made to synchronize the callout with the current scenario and the act that is coming up next.

Finally, next (aka play) is the trigger for executing the action of an act, meaning its function is invoked. The director cannot stop an action once it is executing. However, the action itself may check the scenarioPaused status and honor it by waiting for the pause to be concluded.
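A condensed sketch of the director and of waitForUnpause (the property names on scenarioStatus and the populateCallOut signature are assumptions; the real implementation is in the GitHub repository):

  async function director(source, instruction) {
    switch (instruction) {
      case 'skip':   scenarioStatus.nextAct++; break;
      case 'reset':  scenarioStatus.nextAct = 0; break;
      case 'pause':  scenarioStatus.scenarioPaused = !scenarioStatus.scenarioPaused; break;
      case 'switch': scenarioStatus.currentScenario =
                       (scenarioStatus.currentScenario + 1) % scenarioStatus.scenarios.length;
                     scenarioStatus.nextAct = 0;
                     break;
      case 'next': {                          // aka play: run the action of the upcoming act
        const scenario = scenarioStatus.scenarios[scenarioStatus.currentScenario];
        await scenario.acts[scenarioStatus.nextAct].action(source.page);
        scenarioStatus.nextAct++;
        break;
      }
    }
    await populateCallOut(source.page);       // keep the callout in sync with scenario and next act
  }

  // actions call this to honor a pause; it blocks until the user unpauses
  async function waitForUnpause() {
    while (scenarioStatus.scenarioPaused) {
      await new Promise((resolve) => setTimeout(resolve, 500));
    }
  }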

Resources

All code for this article is on GitHub: https://github.com/lucasjellema/playwright-scenarios/tree/main/step-by-step

CodePen On Speech Bubbles by @RajRajeshDn – https://codepen.io/RajRajeshDn/pen/oZdRJw

Article on Advanced Position in CSS: https://www.internetingishard.com/html-and-css/advanced-positioning/

CSS – Scroll Behavior – https://developer.mozilla.org/en-US/docs/Web/CSS/scroll-behavior

StackOverflow – get X and Y coordinates of DOM element – https://stackoverflow.com/questions/442404/retrieve-the-position-x-y-of-an-html-element-relative-to-the-browser-window

W3 Schools – how create Callout Element – https://www.w3schools.com/howto/howto_js_callout.asp

Highlight Searched text on a page with just Javascript by Kapeel Kokane – https://dev.to/comscience/highlight-searched-text-on-a-page-with-just-javascript-17b3

Playwright 1.7.0 new selectors – for CSS text selectors – https://github.com/microsoft/playwright/blob/v1.7.0/docs/selectors.md#css-extension-visible

Playwright Documentation:



Java Agent: Rewrite Java code at runtime using Javassist


You might want to change the behavior of a Java application at runtime without having to alter the original sources and recompile them. This can be done by using a Java Agent. Java Agents are used by several products. Dynatrace, for example, uses a Java Agent to collect data from inside the JVM. Another example is the GraalVM tracing agent (here), which helps you create configuration for the generation of native images. Analyzing runtime behavior is one use-case, but you can also alter runtime code more dramatically to obtain completely different behavior.

This blog post is not a step-by-step introduction to creating Java Agents; for that, please take a look at the following. In this blog post I have created a Java Agent which rewrites synchronized methods to use a ReentrantLock instead (see here). The use-case for this is to allow applications to use Project Loom’s Virtual Threads more efficiently.

You can find the code of the agent here.


The Java Agent

When I created the agent, I encountered several challenges; I have listed them below, along with how I dealt with them. This might help you overcome some initial pitfalls when writing your own Java Agent.

Keep the dependencies to a minimum

First, the JAR for the agent must include its dependencies. You often do not know in which context the agent is loaded and whether required dependencies are supplied by another application. The maven-assembly-plugin with the descriptor reference jar-with-dependencies (see the pom.xml snippet further down) helps you generate a JAR file that includes its dependencies.

However, when you include dependencies in the JAR that also contains your agent, they might cause issues with other classes loaded by the JVM (because you usually run an application or even an application server). The specific challenge I encountered was that I was using slf4j-simple while my application was using Logback for logging. The solution was simple: remove slf4j-simple.

More generally: use as few external dependencies for the agent as viable. I used only Javassist, which I needed to do the bytecode manipulation easily.

Trigger class transformation for every class

There are several ways to hook into class loading and obtain the correct bytecode. Be careful, though, that your bytecode transformer is triggered for every class at the moment that class is loaded. Solutions that only look at currently loaded classes (instrumentation.getAllLoadedClasses()) or a scan based on reflection utilities (example here) won't do that. You can register a transformer class that will be triggered when a specific class is loaded, but if you want your agent to be triggered for every class, it is better to register a transformer without specifying a class: instrumentation.addTransformer(new RemsyncTransformer());. The agent will be loaded first when you use the premain method / the -javaagent JVM switch when starting your application.

Of course, if you want to attach the agent to an already running JVM (using the agentmain method), you still need to process the already loaded classes.
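For reference, a minimal sketch of such an agent entry point (the class names match the ones used elsewhere in this post; treat it as an illustration rather than the exact source):

  import java.lang.instrument.Instrumentation;

  public class RemsyncInstrumentationAgent {

      // runs before the application when started with -javaagent:agent.jar
      public static void premain(String agentArgs, Instrumentation instrumentation) {
          // no class filter: the transformer is consulted for every class that gets loaded
          instrumentation.addTransformer(new RemsyncTransformer());
      }

      // runs when the agent is attached to an already running JVM
      public static void agentmain(String agentArgs, Instrumentation instrumentation) {
          instrumentation.addTransformer(new RemsyncTransformer(), true);
          // classes that were already loaded still need to be retransformed explicitly here
      }
  }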

Obtain the bytecode

I’ve seen several examples of how to obtain the bytecode of the class to edit in the transformer class. The signature of the transform method of the transformer class is:

 public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,  
               ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException  

It is tempting to use the loader, className and classBeingRedefined to obtain the relevant CtClass instance. The CtClass instance can be edited using Javassist, for example by doing something like:

 // className uses '/' as separator; ClassPool.get() expects the dot-separated class name  
 String targetClassName = className.replace('/', '.');  
 ClassPool cp = ClassPool.getDefault();  
 CtClass cc = cp.get(targetClassName);  

However, I’m not a great fan of string parsing/editing to obtain the correct name of a class and then fetching it. The classfileBuffer already contains the bytecode of the class being loaded, and this can be edited directly. I implemented this as below (based on this, which contains a more elaborate explanation).

  private final ScopedClassPoolFactoryImpl scopedClassPoolFactory = new ScopedClassPoolFactoryImpl();  
   
  ClassPool classPool = scopedClassPoolFactory.create(loader, ClassPool.getDefault(),ScopedClassPoolRepositoryImpl.getInstance());  
  CtClass ctClass = classPool.makeClass(new ByteArrayInputStream(classfileBuffer));  

Editing the bytecode

I used the following code to add a private instance variable, remove the synchronized modifier and wrap the contents of the method in a try/finally block:

 // add the lock field only if the class does not have it yet  
 try {  
   ctField = ctClass.getDeclaredField("lockCustomAgent");  
 } catch (NotFoundException e) {  
   ctClass.addField(CtField.make("final java.util.concurrent.locks.Lock lockCustomAgent = new java.util.concurrent.locks.ReentrantLock();", ctClass));  
 }  
   
 // wrap every method call inside the (formerly synchronized) method in lock()/unlock()  
 method.instrument(new ExprEditor() {  
   public void edit(MethodCall m) throws CannotCompileException {  
     m.replace("{ lockCustomAgent.lock(); try { $_ = $proceed($$); } finally { lockCustomAgent.unlock(); } }");  
   }  
 });  
   
 // finally remove the synchronized modifier from the method  
 modifier = Modifier.clear(modifier, Modifier.SYNCHRONIZED);  
 method.setModifiers(modifier);  

What this did was change the following Java code

   synchronized void hi() {  
     System.out.println("Hi");  
   }  

to the following

   final java.util.concurrent.locks.Lock lockCustomAgent = new java.util.concurrent.locks.ReentrantLock();  
    
   void hi() {  
    lockCustomAgent.lock();  
    try {   
      System.out.println("Hi");  
    } finally {   
      lockCustomAgent.unlock();   
    }  
   }  

This is exactly as described here at ‘Mitigating limitations’.

Generate a JAR manifest

A manifest file helps the JVM to know which agent classes to use inside the JAR file. You can supply this file and package it together with the compiled sources in the JAR, or have it generated during the build. I chose the second option. The following section in a pom.xml file will generate a manifest indicating which agent class needs to be started by the JVM.

       <plugin>  
         <artifactId>maven-assembly-plugin</artifactId>  
         <version>3.3.0</version>  
         <executions>  
           <execution>  
             <phase>package</phase>  
             <goals>  
               <goal>single</goal>  
             </goals>  
           </execution>  
         </executions>  
         <configuration>  
           <descriptorRefs>  
             <descriptorRef>jar-with-dependencies</descriptorRef>  
           </descriptorRefs>  
           <archive>  
             <manifestEntries>  
               <Premain-Class>nl.amis.smeetsm.agent.RemsyncInstrumentationAgent</Premain-Class>  
               <Agent-Class>nl.amis.smeetsm.agent.RemsyncInstrumentationAgent</Agent-Class>  
               <Can-Redefine-Classes>true</Can-Redefine-Classes>  
               <Can-Retransform-Classes>true</Can-Retransform-Classes>  
             </manifestEntries>  
           </archive>  
         </configuration>  
       </plugin>  

Finally

The agent described here illustrates some of the options you have for editing bytecode. I noticed while creating this agent, though, that there are many situations where it cannot correctly rewrite code. This is probably because my rewrite code is not elaborate enough to deal with all the code constructs it may encounter. An agent is probably only suitable for simple changes and not for complex rewrites. One of the strengths of this method, though, is that you do not need to change any source code to check whether a certain change does what you expect. I was surprised at how easy it was to create a working agent that changes Java code at runtime. If you have such a use-case, do check out what you can do with this. PS: for Project Loom, just using virtual threads and rewriting synchronized modifiers did not give me better performance at high concurrency.

You can find the code of the agent here.


SQL–Only Counting Records Sufficiently Spaced apart using Analytics with Windowing Clause and Anti Join


A nice SQL challenge was presented to me by a colleague. It basically boils down to this: a table contains records that describe login events. Each record has a login timestamp and the identifier of the person logging in. The challenge is to count “unique” login events. These have been defined as unique, non-overlapping two-hour periods in which a person has logged in. Such a period starts with a login that is at least two hours after the start of the previous period for that same person.

image

Visually this can be shown like this:

image

All the records marked with a star are counted. All other records are within 2 hours from a starred record and are therefore excluded because they are considered duplicate logins.

EDIT 4th March 2021: Solution

You may have read this article before – when it read a little differently. I have made substantial changes to the article because it originally contained an incorrect solution – a solution that I did not test properly (!) and that I had not peer reviewed before publishing it for all to see. One person who saw, read and carefully checked the article was Iudith Mentzel (see the first comment) and I owe them a great debt. They gently pointed out my flawed logic and went so far as to provide a correct solution. And not just one solution – but three.

You are now reading a rewrite of the original article – based on Iudith’s solutions for which I can take no credit (apart from having been the inspiration perhaps). I want to thank Iudith for their wonderful help and we gratefully make use of the first solution provided.

The three great solutions provided by Iudith are available in this Live SQL environment to try out.

image

The solution I like best uses the SQL MATCH_RECOGNIZE clause – not very well known but worthy of your close attention. It is a gem.

A very brief explanation of the logic in this SQL statement:

At the heart of this query is the regular expression pattern (block_start same_block*). This expression can be interpreted as “collect all rows from a starting record with as many rows as possible that satisfy same_block“; a row satisfies same_block if it is within two hours of the login_time of the block_start. The first subsequent row outside the match collection is the block_start for the next match. For each match, we return only one record – with the earliest login time for that block. The overall query returns records per personid and for each personid the first login_time of a block of two hours.

image

select personid
,      to_char(login_time,'HH24:MI') block_start_login_time
from logins
match_recognize (
  partition by personid -- regard every person independently
  order by login_time
  measures 
         match_number()     mn,
         block_start.login_time    as login_time  
  one row per match -- every block is represented by a single record 
  pattern (block_start same_block*) -- collect a log in record and all subsequent rows in same two hour block; once a row is assigned to a block it cannot start or be part of another block  
  define
       same_block as (login_time <= block_start.login_time + interval '2' hour)  -- a login row is in the same block with its subsequent rows within 2 hours
)

 

Using Recursive Subquery

A second solution provided by Iudith uses Recursive Subquery Factoring. This mechanism is also not widely used. It is the ANSI-compliant and more powerful successor to Oracle's proprietary CONNECT BY.

The starting rows are the first logins for every personid and specify both the login_time and the end of the two-hour block that starts at that login_time. This is the level 1 iteration of the recursive subquery, the root block starters. The next iteration finds all rows that are outside the initial block for a person and determines their rank when ordered by login time. Sloppily put: it finds the earliest login that is not in the two-hour block started by the first round of root block starters.

Each subsequent iteration does the same thing: it finds the records that start two hours or more after the block starter (rn = 1) returned for a personid in the previous iteration.

When all recursive iterations are done, we need to make sure that only the rn = 1 records are retained – the block starters.

image
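Iudith's exact query is available in the Live SQL environment linked below; a hedged sketch along the lines described above could look like this (it assumes the LOGINS table with PERSONID and LOGIN_TIME columns used throughout this article):

with blocks (personid, login_time, rn) as
( -- root block starters: the first login per person
  select personid, login_time, 1 as rn
  from   logins
  where  (personid, login_time) in ( select personid, min(login_time)
                                     from   logins
                                     group by personid )
  union all
  -- next iteration: rank all logins that fall after the current block starter plus two hours;
  -- the rn = 1 row is the starter of the next block
  select l.personid, l.login_time
  ,      row_number() over (partition by l.personid order by l.login_time)
  from   blocks b
         join logins l
         on  l.personid = b.personid
         and l.login_time > b.login_time + interval '2' hour
  where  b.rn = 1
)
select personid
,      to_char(login_time,'HH24:MI') block_start_login_time
from   blocks
where  rn = 1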


The result in Live SQL, with the iteration that produced each record, is shown below:

image

Using the Model Clause

The third solution Iudith came up with is in my eyes the most cryptic one. I am sure though that this is a very personal opinion and for some of you this is the obvious choice. The query is shown below, with the essential cell assignment highlighted. Here the value of the cell blockstarter_yn is assigned: Y for a row that is the first login for a person at least two hours after a previous block-starting login, and N for a row that is not a block starter.

Because the value of this cell for a row depends on values assigned to its predecessors when ordered by login_time, the definition of the rn dimension and the order by rn is important.

The actual cell value is derived in the case expression, which can be read as follows:

  • if rn is 1, we are dealing with a root block starter – the first login record for a personid – and therefore the value returned is Y
  • if the login_time is more than two hours later than the latest login_time of any earlier row that was identified as a block starter, then this row starts a new block and the value Y is produced
  • else: this row is not a block starter and the value N is assigned to the cell

image

The model clause uses partition by (personid) to determine block starters per person. The measures (results) produced from the model clause are the login_time and the new cell blockstarter_yn (note that the initial value for the cells is N). The outer query filters the results on blockstarter_yn = 'Y' – for obvious reasons.

select personid
,      to_char(login_time,'HH24:MI')  block_start_login_time
from   ( select *
         from   logins
         model
         partition by (personid)
         dimension by (row_number() over ( partition by personid
                                           order by login_time
                                         ) as rn   
                      )
         measures ( login_time
                  , 'N' blockstarter_yn
                  )
         rules ( blockstarter_yn[any] -- assign a value to the cell count_yn for all login_times 
                 order by rn   -- go through the cells ordered by login time, starting with the earliest
                             = case
                               when cv(rn) = 1 
                               then 'Y'  -- first login by a person
                               when login_time[cv()] > -- when the login time in a cell is more than two hours later later than the last (or max) 
                                                       -- login time in an earlier cell (for a row that was identified as a block starter  
                                    max( case when blockstarter_yn = 'Y'
                                              then login_time
                                         end
                                       ) [rn < cv()] 
                                       + interval '2' hour     -- when the login_time 
                               then 'Y'
                               else 'N'
                               end 
               )
       )
where blockstarter_yn = 'Y'

My original, faulty approach

I followed this logic while constructing a solution for the challenge:

  • For each login record, find the first login record of the same person in the period of two hours prior to the record; that can be the record itself, but could also be an earlier login record
  • All records that are the first in a two-hour period are “block starters”, records that start a two-hour period in which we can ignore all subsequent logins for that same person (these block starters are definitely to be included in the final result)
  • Find all records (we call them duplicate_logins) that fall into a two hour login block for a person (a two-hour period from the login time of a block starter); these are logins that we should not count
  • Select all records that are not a duplicate_login – these are records that may have a preceding login in the two hours before their login timestamp but that login is within a two hour block and they are not. Note: since the block_starters are also counted as duplicate_login, they must be added separately to the result
    EDIT: This logic is faulty: two (or more) records that are both outside the two hour range from a block starter can be within a two hour range of each other; my solution incorrectly would include both records instead of only the first of these two

Each bullet is implemented in the query with an inline view:

earliest_logins: For each login record, find the first login record of the same person in the period of two hours prior to the record; that can be the record itself, but could also be an earlier login record

image

block_starters: All records that are the first in a two-hour period are “block starters”, records that start a two-hour period in which we can ignore all subsequent logins for that same person (these block starters are definitely to be included in the final result)

image

duplicate_logins: Find all records (we call them duplicate_logins) that fall into a two hour login block for a person (a two-hour period from the login time of a block starter); these are logins that we should not count

image

and finally:

Select all records that are not a duplicate_login – these are records that may have a preceding login in the two hours before their login timestamp but that login is within a two hour block and they are not. Note: since the block_starters are also counted as duplicate_login, they must be added separately to the result

EDIT: This logic is faulty: two (or more) records that are both outside the two hour range from a block starter can be within a two hour range of each other; my solution incorrectly would include both records instead of only the first of these two

image

The last step that should be added is of course the count operation (select l.personid, count(login_time) from final group by personid).

Overall query and (faulty) result (Note: Person 1, Login Time 03:00 should not have been included because of the Person 1, Login Time 02:39 result)

image

This result is for this set of records:

image

You can follow along with this challenge in the live database environment offered by LiveSQL at this link.

Resources

The three great solutions provided by Iudith are available in this Live SQL environment to try out.

Anti Join in SQL – my earlier anti search pattern exploration https://technology.amis.nl/it/anti-search-patterns-sql-to-look-for-what-is-not-there-part-one/

 

 



Java Security: Open Source tools for use in CI/CD pipelines


It is often expected of a DevOps team to also take security into consideration when delivering software. Often, however, this does not get the attention it deserves. In this blog post I'll describe some easy-to-use, CI/CD-pipeline-friendly, open source tools you can use to perform several checks during your Java software delivery process, which will help you identify and fix issues with minimal effort.

You can view my sample project here which implements all these tools. There is also a docker-compose.yml file supplied. SonarQube and Jenkins, however, do not come preconfigured in this setup. You can look at the Jenkinsfile (pipeline definition) to see what Jenkins configuration is required (installing plugins and creating credentials).

This is provided as an overview and small example only. It is not meant as a walkthrough for setting up a CI/CD environment or a manual on how to use the different tools. Also, the mentioned tools are not all the tools that are available, but a set that is freely available, popular and that I managed to get working without too much effort. Using them together allows you to improve several aspects of the security of your Java applications during your software delivery process and to provide quick feedback to developers. This can also help increase security awareness. Once you have some experience with these tools, you can implement stricter policies (which can let a build fail) and quality gates.


Static code analysis

What does it help you solve?

Static application security testing (SAST) helps you identify several issues in your source code by looking at the code itself, before compilation. Some issues which static analysis can help prevent are:

  • Exposed resources
    Static analysis tools can help detect exposed resources. These might be vulnerable to attacks.
  • Hardcoded keys or credentials
    If keys or passwords are hardcoded inside your application, attackers might be able to extract them, for example by decompiling the Java code.
  • Stability issues
    This can be related to memory leaks such as not closing resources which will no longer be used. This can also be related to performance. Bad performance can lead to stability issues. An attacker might purposefully try to bring your application down by abusing performance issues. 

Which tools can you use?

When talking about Java applications, the following tools are free to use to perform static analysis.

  • PMD (here)
    PMD can be used as a Maven or Gradle plugin (and probably also in other ways). It has a Jenkins plug-in available and the SonarQube Maven plugin can be supplied with parameters to send PMD reports to SonarQube. PMD can find a small number of actual security vulnerabilities. It can provide quite a lot of performance related suggestions.
  • SpotBugs (here)
    SpotBugs is the spiritual successor of FindBugs. It can (like PMD) easily be integrated into builds and CI/CD pipelines. The SonarQube Maven plugin also supports sending SpotBugs reports to SonarQube without additional plugin requirements. SpotBugs can find bugs in the area of security and performance among others (here).
  • FindSecBugs (here)
    This is an extension of SpotBugs and can be used similarly. It covers the OWASP Top 10 and detects/classifies several security-related programming bugs.

In this example I did not use a Dockerfile, but Dockerfiles can also fail to conform to best practices, making them vulnerable. If you want to also check Dockerfiles, you can take a look at hadolint.

Dynamic Application Security Testing (DAST)

Some things cannot easily be automatically identified from within the code. For example, how the application looks on the outside. For this it is not unusual to perform a penetration test. Penetration tests can luckily be automated!

What does it help you solve?

  • Scan web applications and analyse used headers
  • Find exposed endpoints and security features of those endpoints
  • Check OpenAPI / SOAP / Websocket communication/services
  • Check traffic between browser and service

Which tools can you use?

  • OWASP ZAP (Zed Attack Proxy)
    Can run in a standalone container, be started from a Maven build using a plugin, or be configured as a tool within Jenkins. You can install a plugin (here) in SonarQube to visualize results and use the HTML Publisher plugin in Jenkins to show the results there. OWASP ZAP can also be used together with Selenium to passively scan traffic between the browser and the service or actively modify requests.

Dependency analyses

What does it help you solve?

  • Java programs use different libraries to build and run. You can think of frameworks, drivers, utility code and many other re-usable assets. Those dependencies can contain vulnerabilities. Usually for older libraries, more vulnerabilities are known. Knowing about vulnerabilities in specific versions of libraries and their severity, can help you in prioritizing updating them.

Which tools can you use?

  • OWASP Dependency Check (here)
    Can run from a Maven build and supply data to Jenkins and SonarQube (by installing the Dependency-Check plugin). Also records CVSS (Common Vulnerability Scoring System) scores for risk assessments. A minimal Maven plugin configuration is sketched below.
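As an indication only, a hedged Maven configuration for the Dependency Check could look like this (the version number is an example; pick the current release for real use):

       <plugin>
         <groupId>org.owasp</groupId>
         <artifactId>dependency-check-maven</artifactId>
         <!-- example version; check the project site for the current release -->
         <version>6.1.0</version>
         <executions>
           <execution>
             <goals>
               <goal>check</goal>
             </goals>
           </execution>
         </executions>
       </plugin>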

Container image vulnerability scanning

What does it help you solve?

  • A Java program can run in a container on a container platform. A container is usually created by using a base image and putting an application inside. The base image contains operating system libraries or other software (such as a JVM) which can contain vulnerabilities.

Which tools can you use?

  • Anchore Engine (here).
    Anchore Engine is available as a container image. Once started, it will download information from various vulnerability databases. You can request a certain image to be analyzed. It has good integration with Jenkins, however not with SonarQube. Read my blog posts about using Anchore Engine here and here. You can get notifications when new vulnerabilities are discovered for already scanned containers which can be quite useful.

Putting it all together

In the below screenshot you can see the results from a sample Spring Boot REST service which you can find here. You can see results from SpotBugs/find-sec-bugs, PMD, OWASP ZAP and the OWASP Dependency-Check in SonarQube grouped together in a single view.

You can also find the different reports (PMD, OWASP ZAP, OWASP Dependency-Check, SpotBugs/find-sec-bugs) and also the Anchore Engine scan in Jenkins.

And you can open them to view details, for example from the Dependency-Check, SpotBugs and Anchore Engine below.

OWASP Dependency-Check report
SpotBugs report
Anchore Engine report

I recommend defining the plugins in a parent project and letting every project inherit their configuration. This allows you to easily change it for many projects at once (when they upgrade to the latest version of the parent project). Letting the build fail when vulnerabilities of a certain severity are discovered is also a good thing to do. Do not make this too strict, though, or it will delay your development efforts significantly!

Quality gates can be defined centrally in SonarQube. See for example here. For Anchore Engine this is not easily possible, however, since it lacks SonarQube integration. Instead it provides its own policy engine (and, in the commercial version, a GUI to configure those policies). See for example here.

Finally

By using the above listed tools together for a Java application, you can quite easily report on issues in various areas related to security. They also provide options for letting a build fail if certain rules are not kept. This will force people to fix security issues before going to production instead of having to play the blame game when a security incident occurs. If you want to accept certain vulnerabilities, policies can also be configured for the various tools and, on a higher level, quality gates can be defined.
These kinds of automated tests take away a lot of manual work and help increase security awareness. They are, however, not smart enough to detect every issue. They focus on technical security aspects and not so much on functional ones. Peer reviews and, for example, privacy acceptance testing can help cover those.

These tools can also report too much and cause information overload. Try, for example, running them for the first time against a large application. It might take significant effort to evaluate, prioritize and solve every issue. You should probably focus first on the most important ones, such as those reported as blockers or with high severity. After that you can prevent new high severity issues from arising and slowly work on the technical debt that is left.

There is always a balance between focusing on non-functionals like security and functionals like business logic. Security is a topic you can easily get lost in, and it is important to always think about the actual risk involved, which is specific to your application, your customer and the data involved. Does the investment in solving the issue outweigh the risk and the possible damages? If the balance tips towards functionals, you might be in danger, often due to poor life-cycle management; it is therefore important to explain in business terms why certain things deserve attention. This can be a challenge, since the issues are usually described in terms which can be difficult for business people to understand. If the balance tips towards non-functionals, you might get lost in detail, trying to solve things which are only dangerous in theory and can be ignored in practice.


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”, I gave you an introduction.

This time I will elaborate on the database structure.

Project folder layout

The following top level directories may exist for every database application project:

Directory – Description
apex – APEX directory containing the exported APEX application and other source files like Javascript and so on.
conf – Configuration scripts for tools like Maven or Flyway.
db – Database scripts.
ddl – DDL scripts generated by SQL Developer Data Modeler.
doc – Documentation.

Directory apex

When you invoke on Windows:

tree /A apex

you will get something like this (items between ‘${‘ and ‘}’ are variables):

\---f${application id}
    \---application
        +---pages
        +---shared_components
        |   +---files
        |   +---globalization
        |   +---logic
        |   +---navigation
        |   |   +---breadcrumbs
        |   |   +---lists
        |   |   \---tabs
        |   +---plugins
        |   |   +---dynamic_action
        |   |   \---item_type
        |   +---security
        |   |   +---app_access_control
        |   |   +---authentications
        |   |   \---authorizations
        |   \---user_interface
        |       +---lovs
        |       +---shortcuts
        |       \---templates
        |           +---breadcrumb
        |           +---button
        |           +---calendar
        |           +---label
        |           +---list
        |           +---page
        |           +---region
        |           \---report
        \---user_interfaces

Directory db

tree /A db

returns:

+---${support schema}
|   \---src
|       +---dml
|       +---full
|       \---incr
+---${API schema}
|   \---src
|       +---admin
|       +---dml
|       +---full
|       +---incr
|       \---util
+---${DATA schema}
|   \---src
|       +---admin
|       +---dml
|       +---full
|       \---incr
+---${EXT schema}
|   +---src
|   |   +---admin
|   |   +---dml
|   |   +---full
|   |   \---incr
|   \---test
\---${UI schema}
    \---src
        +---admin
        +---dml
        +---full
        \---incr

Some explanation:

  • the support schema directory contains scripts that let the application work with support software. So you can think of another schema that contains generic error packages or packages to manipulate APEX text messages and in this directory you will add the necessary grant scripts. The support software itself is maintained by another project.
  • the admin directories contain scripts to setup a schema by a DBA.
  • dml directories contain scripts to change reference data, for instance APEX text messages or a list of countries.
  • full directories contain (repeatable) Flyway database migration scripts that are run every time they change (for a database). They are meant for database objects that can be replaced (CREATE OR REPLACE ...).
  • incr directories contain (incremental) Flyway database migration scripts that run only once (for a database), so for objects that cannot be replaced, like tables and constraints, or for dropping objects; a small naming example follows right after this list.
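As a teaser, Flyway's standard naming convention already reflects the full/incr distinction; the file names below are made-up examples, not files from the actual project:

  db/${DATA schema}/src/incr/V20210301.1__create_tables.sql   (versioned: V…__, runs only once)
  db/${DATA schema}/src/full/R__create_packages.sql           (repeatable: R__, reruns when the file changes)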

Later on in the Flyway post, I will explain in more detail the naming conventions for the Flyway migration scripts.

DATA schema

This is the schema that contains the data: the tables and all objects needed to maintain the data logic. You may decide to put data logic packages in the API layer but that is up to you.

It is up to you how to create the database migration scripts for this layer, but I suggest that you do NOT maintain them in SQL Developer Data Modeler, since that tool is best used to initially set up the table structure and that is all. You may use it later on to generate incremental scripts, but it is much easier to just use your favorite editor to modify these scripts.

Later on in the SQL Developer Data Modeler post, I will explain in more detail what scripts to create and where to save them in your project.

API schema

This is the schema that contains the business logic. It may contain data logic packages if you do not want to have packages in the data layer.

Usually these scripts are not generated by SQL Developer Data Modeler.

UI schema

All User Interface logic. This means that this schema will be the parsing schema for APEX. Please note that you can have more than one parsing schema per APEX workspace so there is no problem having several applications with different parsing schemas in a workspace.

Usually these scripts are not generated by SQL Developer Data Modeler.

EXT schema

This is an EXTernal layer I have added to the structure. It is meant for external logic: interfaces or data conversions. Please note that setting up a new system almost always requires you to import data from another source. This layer can take care of that. If your application has to support several customers you may even have a layer for each customer. The level of this layer is the same as the API layer. It can interact with the API layer in a bidirectional way. After all, the external layer may use business logic and business logic may use an interface from this layer. The UI layer may use objects from this layer too.

Usually these scripts are not generated by SQL Developer Data Modeler.

Conclusion

In this post you have seen the folder layout for your project. Later on in the posts for SQL Developer Data Modeler, Flyway and Maven, I will add more detail.

Stay tuned!

All articles in this series

Subject – Link
Introduction – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps – “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”, I showed you the database structure.

This time I will elaborate on the base tools, the Oracle Database and Oracle APEX.

Oracle Database

How to use it?

I can tell a lot about the database but I will focus on the essentials needed to work with a front-end like APEX.

Quoting from Build your APEX application better – do less in APEX, Jeff Kemp on Oracle, 13 February 2014:

I recently saw this approach used in a complex APEX application built for my current client, and I liked what I saw – so I used a similar one in another project of mine, with good results.

  1. Pages load and process faster
  2. Less PL/SQL compilation at runtime
  3. Code is more maintainable and reusable
  4. Database object dependency analysis is much more reliable
  5. APEX application export files are smaller – faster to deploy
  6. APEX pages can be copied and adapted (e.g. for different interfaces) easier

I couldn’t agree more. This was quite some time ago but it is still valid – for any front-end actually, not just for APEX. I will repeat this again and again: you have a really powerful database for which you (or your company/client) have paid a lot. So use it well and put all the data and business logic where it belongs: in the database. It will make your application faster, more secure and easier to develop and debug. It is also more maintainable, since there is a lot more Oracle Database expertise around than, for instance, some obscure front-end expertise. This has been a best practice for a long time; just do it.

And for those people who like database independence: I just like to get things done well, so I use PL/SQL since that is simply the best language to work with the Oracle Database. I do not want to use JDBC (in combination with Java) or ODBC (in combination with .NET/Python) when I can use PL/SQL. And every other well designed database has some kind of language like PL/SQL. So if you want to be independent, why not write a language layer for each database offering the same functionality?

One last thing. I have seen a lot of projects with Java developers using JDBC and Oracle, and what has surprised me very often is the ignorance of the database they work with. The Java code just issues statements against the tables; there is no invocation of PL/SQL (package) procedures or functions. All functionality sits in the middle tier, not in the database. The funny thing is that Oracle even has an object-oriented layer: Oracle Object Types. It is true that Object Types are more limited than Java classes, but I have created some nice applications based on the Oracle OO concept. You can also use an Object Type as a kind of glue between Oracle and another language like Java. And as a Java programmer you can also invoke REST APIs powered by PL/SQL. What a pity that a large part of the Oracle functionality is not used by those Java programmers.

What version?

My advice is to use the latest major version or the one before. So for an Oracle Database nowadays this means version 21c or the long-term support (LTS) release 19c (equivalent to 12.2.0.3). This is simple advice for any software and it assures you that you keep up with enhancements, (security) patches and so on. Do not wait too long. Again I will use the analogy with a house: you’d better paint and maintain it regularly than wait till the wood has rotted.

What platform?

I have talked about it briefly in the first post, but for a development environment I would just download the prebuilt virtual machine Database App Development VM from Oracle. It comes with the database and APEX integrated on an Unbreakable Linux operating system. Simple to use and back up, and you can be the DBA without bothering anyone. And free of charge for a development environment.

Database App Development VM contents

Do not forget to make snapshots (backups) regularly. It has saved my life quite a few times.

Virtual machine settings

You may have more than one virtual machine (VM), and thus more than one database and APEX instance, and you would like to have them all running and accessible at the same time. You will need port forwarding to accomplish this.

Please note that the network configuration for the database is the same inside each virtual machine: IP address 127.0.0.1, port 1521 and instance name ORCL. APEX can be accessed through port 8080 on the virtual machine.

I have a Windows 10 laptop with two virtual machines, DEV (APEX 18.2) and VM19 (APEX 19.2).

These are the port forwarding rules for VM DEV:

And these are the port forwarding rules for VM VM19:

So the DEV database can be accessed via port 1526 on my Windows laptop and the DEV APEX instance through the standard port 8080. The VM19 database can be accessed via port 1527 on my Windows laptop and the VM19 APEX instance through port 8082.

And this is the SQL*Net TNSNAMES configuration:
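Reconstructed from the forwarded ports mentioned above, the tnsnames.ora entries look roughly like this (the alias names and the SERVICE_NAME form are my own assumptions):

  DEV =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1526))
      (CONNECT_DATA = (SERVICE_NAME = ORCL))
    )

  VM19 =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1527))
      (CONNECT_DATA = (SERVICE_NAME = ORCL))
    )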

I always use the environment variable TNS_ADMIN on any platform to point to the directory of the SQL*Net tnsnames.ora file. This allows me to have one point of truth for SQL*Net even if I have several Oracle product installations.

Oracle APEX

As said before, this series of articles is not advice on how to use the tools; it is just about how to build a database application with a plan, an architecture. This book, Oracle APEX Best Practices, may help you with that.

Anyhow if you have followed my advice to use a virtual machine for development, you have an APEX instance now.

For me the way to go from development to production is to export the development application and import it in every next stage till it reaches production. I do not consider it a very good idea to manually apply the changes in later stages. It is certainly not the DevOps way.

Collaborating

If you need to collaborate while developing an application you need of course a shared database and APEX instance. It will be a little bit more difficult since you need to be more careful but thanks to the ability to lock APEX pages and the Build Options you can manage.

Parallel development

The problem with APEX is that you cannot really install parts of it: it is all or nothing. So even if you split the APEX application export file using the SQLcl client, you cannot just use some of the files. You have to use them all.

This influences also parallel development (branching if you prefer).

My current client has a large number of APEX applications, one of which is a doozy. It is a mission-critical and complex application in APEX 4.0.2 used throughout the business, with an impressively long list of features, with an equally impressively long list of enhancement requests in the queue.

They always have a number of projects on the go with it, and they wanted us to develop two major revisions to it in parallel. In other words, we’d have v1.0 (so to speak) in Production, which still needed support and urgent defect fixing, v1.1 in Dev1 for project A, and v1.2 in Dev2 for project B. Oh, and we don’t know if Project A will go live before Project B, or vice versa. We have source control, so we should be able to branch the application and have separate teams working on each branch, right?

We said, “no way”. Trying to merge changes from a branch of an APEX app into an existing APEX app is not going to work, practically speaking. The merged script would most likely fail to run at all, or if it somehow magically runs, it’d probably break something.

Parallel Development in APEX, Jeff Kemp on Oracle, 23 January 2014

Things have not really changed since 2014.

Keep APEX versions aligned

Keep in mind that you cannot import an APEX application into another APEX instance if the exported version is higher than the version of the APEX instance you import into. So an application exported from APEX 19.2 will not import into APEX 18.2. Therefore, align all your APEX versions from development through production.

Conclusion

In this post you have seen how to set up an Oracle Database and APEX development environment, together with some best practices.

Stay tuned!

All articles in this series

Subject - Link
Introduction - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)

Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”, I told you about the Oracle Database and Oracle APEX.

This time I will discuss Oracle SQL Developer Data Modeler.

Oracle SQL Developer Data Modeler

A book I can recommend is Oracle SQL Developer Data Modeler for Database Design Mastery by Heli Helskyaho.

Just modeling

I use Data Modeler mainly for modeling, documentation and generating DDL scripts for the initial setup and incremental migration scripts later on. For other activities I use tools that suit me better: the Unix approach.

Data Modeler also allows you to define views, but I do not use that feature since it gave me a lot of problems. A simple SQL script to create the view is good enough.

Logical Model

This is the Entity Relationship Model area where you can construct Entity Relationship Diagrams (ERD) like this:

Logical Model

You should really take your time to design your model and to verify it using the Design Rules described later on. This is the foundation of your application.

And do not forget to use domains whenever appropriate. You can even have one corporate domains XML file if you prefer.

Relational Models

Each Logical Model may be transformed into one or more Relational Models, for example one for Oracle Database 12c, one for Oracle Database 12cR2 and so on. This allows you to use the features of those versions.

My preference is to have just one Relational Model per Logical Model, to keep it simple.

Again you should really take your time to design your relational model and to verify it using the Design Rules described later on. This is the foundation of your application.

Business rules

I am old enough to remember the Business Rules Classification:

Business Rules Classification

Quite a few business rules can be defined easily using Data Modeler; here are some examples:

Business rule - How
Department code must be numeric. - Column datatype
Employee job must be ‘CLERK’, ‘SALES REP’ or ‘MANAGER’. - Use a domain
Employee salary must be a multiple of 1000. - Domain (which lets you define a constraint)
Employee exit date must be later than hire date. - Table level constraint
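To make the last two rows concrete, here is a minimal sketch of the check constraints such rules typically end up as in the generated DDL (the EMP table and its columns are hypothetical):

-- salary must be a multiple of 1000
alter table emp add constraint emp_salary_chk
  check (mod(salary, 1000) = 0);

-- exit date (when present) must be later than hire date
alter table emp add constraint emp_exit_after_hire_chk
  check (exit_date is null or exit_date > hire_date);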

Other constraints may not fit into Data Modeler and may need to be implemented in another way. For more inspiration I will refer to implementing business rules by Rob van Wijk.

I have had difficulties with constraints implemented by materialized views with refresh fast on commit in an APEX environment. Maybe I did it wrong, maybe the database version (Oracle Database 12) was a little buggy, or maybe it only works nicely in theory. I resorted to triggers and PL/SQL.

Incremental migration scripts

You can define a connection via:

Import Data Dictionary menu option

Then you can use that connection to execute the Synchronize Data Dictionary functionality. This will create an incremental migration script you can use with Flyway. Sometimes you may need to tweak the generated script.

Design Rules and Transformations

One of the features I can really recommend is Design Rules and Transformations: Design Rules And Transformations menu

Design Rules

This is the lint-like tool of Data Modeler: an analysis tool that flags errors, bugs, stylistic problems and suspicious constructs. It is applicable to both the Logical Model and the Relational Models.

Custom Transformation Scripts

This allows you to use predefined scripts to do transformations and to define your own.

Here is an example that sets table names to plural. You usually define the entity name in singular and the table name in plural. This custom transformation script (Table Names Plural - custom) does it automatically:

var tables = model.getTableSet().toArray();
for (var t = 0; t < tables.length; t++) {
    var table = tables[t];
    var tableName = table.getName();
    if (tableName.endsWith("Y")) {
        // Y becomes IES (e.g. CATEGORY -> CATEGORIES)
        table.setName(tableName.slice(0, -1) + "IES");
        table.setDirty(true);
    } else if (!tableName.endsWith("S")) {
        // otherwise simply append an S (e.g. EMPLOYEE -> EMPLOYEES)
        table.setName(tableName + "S");
        table.setDirty(true);
    }
}

Configuration

When you use SQL Developer Data Modeler, it is better to use one modeling project for all your applications, so you can share your configuration more easily between projects and developers.

From my GitHub datamodeler project, here is the README:

A project to share Oracle SQL Datamodeler settings and scripts. Oracle SQL Developer Data Modeler has several global configuration items like:

  • preferences
  • design rules and transformations
  • default domains

Besides that there are also design preferences and glossaries, but those can be stored in a version control system easily, unlike the global configuration.

The official way to share the global configuration between computers is to use the various import and export utilities of the Data Modeler. However, this is quite time-consuming and thus error-prone.

An easier approach is to just backup these settings to a directory you specify as a command line option (ideally under version control). Then you can restore them when needed. This project tries to accomplish just that: KISS.

It is just a simpler and more friendly approach than using manual export and import actions between developers.

If you collaborate with others, you had better keep all folder and file names the same, since the configuration contains those names.

Conclusion

Here I have shared some ideas about using SQL Developer Data Modeler, a tool that helps you lay the foundation of your application very well.

Stay tuned!

All articles in this series

Subject - Link
Introduction - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”, I told you about the Oracle SQL Developer Data Modeler.

This time I will discuss the following tools: Git, Subversion, Maven and Flyway.

Flyway

The first tool I would like to discuss is one of the cornerstones of the build architecture.

With Flyway all changes to the database are called migrations. Migrations can be either versioned or repeatable.

Why migrations?

For the non-database code side of projects we are now in control.

From the Flyway documentation, Why database migrations?:

  • Version control is now universal with better tools everyday.
  • We have reproducible builds and continuous integration.
  • We have well defined release and deployment processes.

But what about the database? Well unfortunately we have not been doing so well there. Many projects still rely on manually applied sql scripts. And sometimes not even that (a quick sql statement here or there to fix a problem). And soon many questions arise:

  • What state is the database in on this machine?
  • Has this script already been applied or not?
  • Has the quick fix in production been applied in test afterwards?
  • How do you set up a new database instance?
More often than not, the answer to these questions is: We don’t know.

Database migrations are a great way to regain control of this mess.

They allow you to:

  • Recreate a database from scratch
  • Make it clear at all times what state a database is in
  • Migrate in a deterministic way from your current version of the database to a newer one

How does Flyway work?

Again from the Flyway documentation:

  1. Flyway uses a schema history table (automatically created by Flyway) to track the state of the database.
  2. Flyway will scan the filesystem or the classpath of the application for migrations. They can be written in either SQL or Java.
  3. The migrations are then sorted based on their version number and applied in order.
  4. As each migration gets applied, the schema history table is updated accordingly.
  5. With the metadata and the initial state in place, we can now talk about migrating to newer versions.
  6. Flyway will once again scan the filesystem or the classpath of the application for migrations. The migrations are checked against the schema history table. If their version number is lower or equal to the one of the version marked as current, they are ignored.
  7. The remaining migrations are the pending migrations: available, but not applied.

And that’s it! Every time the need to evolve the database arises, whether structure (DDL) or reference data (DML), simply create a new migration with a version number higher than the current one. The next time Flyway starts, it will find it and upgrade the database accordingly.

Incremental migrations

Also called versioned migrations. As the name already indicates, these files are run only once against a given database. They are usually used for SQL commands that can execute only once:

  • CREATE …
  • ALTER …
  • DROP …

The default Flyway naming convention for incremental migrations is a prefix V, a version number, two underscores, and then a free-form description.

I prefer to have a timestamp as version number in the Oracle date format YYYYMMDDHH24MISS. An example is thus V20210217140700__create_table_TEST.sql.

You can put more than one SQL command in an incremental migration, but take care: if the script fails after at least one command has already executed successfully, you are in deep trouble, because the migration is recorded as failed while part of it has been applied. That is why I prefer to have just a single command in each incremental script, or I write them foolproof, guarding against unexpected situations. It depends.
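As an illustration, a minimal incremental migration following this naming convention could look like the sketch below (file name V20210217140700__create_table_TEST.sql; the table and its columns are hypothetical):

create table test
( id          number generated always as identity
, name        varchar2(100 char) not null
, created_on  date default sysdate not null
, constraint test_pk primary key (id)
);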

Repeatable migrations

Repeatable migrations are very useful for managing database objects whose definition can then simply be maintained in a single file in version control. Instead of being run just once, they are (re-)applied every time their checksum changes.

They are typically used for:

  • (Re-)creating views/procedures/functions/packages/…
  • Bulk reference data reinserts

With Flyway’s default naming convention, the filename is similar to that of the regular migrations, except that the V prefix is replaced with an R and there is no version.

Although there is no ordering by version, there is an ordering by file name. To minimize the number of errors or warnings during a migration I use the following naming convention:

R__<type order number>.<schema>.<type>.<name>.sql

This table shows the types (from DBMS_METADATA) and their type order number:

Type - Type order number
FUNCTION - 08
PACKAGE_SPEC - 09
VIEW - 10
PROCEDURE - 11
PACKAGE_BODY - 14
TYPE_BODY - 15
TRIGGER - 17
OBJECT_GRANT - 18
SYNONYM - 21
COMMENT - 22
JAVA_SOURCE - 25

The missing numbers are used for CREATE-only objects like tables, constraints and so on; those are not used for repeatable migrations, so I left them out.

When the schema is not fixed, I leave out the <schema> part, as in R__09.PACKAGE_SPEC.CFG_PKG.sql: a package specification that defines some constants (debugging on/off and testing on/off) to be used in conditional compilation.

You must be careful with views. If you create a view that depends on another object (a view for instance) that has not yet been created, Flyway will fail unless you add the FORCE keyword.

You must also be careful with views and INSTEAD OF triggers. INSTEAD OF triggers have the nasty characteristic of disappearing when you recreate the view. But there is a simple solution: create the INSTEAD OF trigger in the same script as the view (creating the view first, obviously). Then Flyway will be your savior.
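A minimal sketch of such a repeatable migration, combining a FORCE view with its INSTEAD OF trigger in one script (for example R__10.VIEW.TEST_V.sql; the view, trigger and underlying table are hypothetical):

create or replace force view test_v
as
select t.id
,      t.name
from   test t;

create or replace trigger test_v_ioiu
instead of insert or update on test_v
for each row
begin
  -- delegate DML on the view to the underlying table (sketch only)
  if inserting
  then
    insert into test(name) values (:new.name);
  else
    update test t set t.name = :new.name where t.id = :old.id;
  end if;
end;
/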

DML

As already said DML scripts can be either incremental or repeatable migrations.

Preferred order of migrations

The preferred order is:

  • incremental migrations
  • repeatable DDL migrations
  • repeatable DML migrations

You can influence that order by choosing the Flyway locations (the directories Flyway searches for migration scripts) wisely.
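For example, assuming separate folders for incremental and repeatable migrations (the folder names below are hypothetical), multiple locations can be passed to the Flyway Maven plugin through the flyway.locations property:

mvn flyway:migrate "-Dflyway.locations=filesystem:db/src/incr,filesystem:db/src/ddl,filesystem:db/src/dml"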

Why not Liquibase?

Flyway has a well-known competitor: Liquibase.

I investigated Liquibase a long time ago, and I saw recently that the Oracle SQLcl client supports Liquibase. I still prefer Flyway because it is so much easier to understand and use. And it handles PL/SQL code so much better.

I will quote this from an oracle-base.com article:

That’s Not How You Use It! When you look at examples of using Liquibase on the internet they all have a few things in common.

  • They are typically examples used to track table changes and not much else.
  • Like my examples, they are based on small simple schemas. This always makes sense, but issues arise with some methods when things grow.
  • They don’t include code objects (procedure, functions, packages, triggers, types etc.).
  • If they do include code objects, they assume each version of the code is in a new file. This means you’re going to lose the conventional commit history of a file you would normally expect for code. Instead you have to manually diff between separate files.
  • They assume people need to rollback changes to previous versions of the database through this mechanism. I think creating a rollback script for each schema change makes sense, but I think it’s a bad idea to include it in this mechanism. In my opinion all changes should move forward. So a “rollback” is really a new change applied to the database that reverts the changes. This is especially true of code related functionality.

The major issue for me is the way code objects are managed. This may not affect you if you never have code in the database, but for a PL/SQL developer, this feels like a show-stopper. As a result, I prefer to work using scripts, which are kept in source control, and use Liquibase as the deployment and sequencing mechanism. I’m sure many Liquibase users will not like this, and will think I’m using it incorrectly. That’s fine. There’s more discussion about script management here.

I can only add: if you prefer to have your PL/SQL code in a script, why not your tables and so on (the incremental scripts) as well?

I rest my case.

Maven

The reason I have chosen Maven as the build integration tool is its excellent support for Flyway and Jenkins. It enables you to do Continuous Integration. And yes, it is used mainly by Java projects, but it can perfectly well be used in a project integrating several technologies.

There is a lot of documentation about Maven, but for building Oracle projects you can just begin with the basics.

Version control

As stated before, version control tools are necessary for a mature project.

Tools

The tools used nowadays are Git and Subversion. Git is now the standard and I would have liked to use it throughout, but there were two areas where I had to resort to Subversion.

Git

Git is nowadays the standard version control tool and also the native tool of GitHub.com, the standard open source site. So not really a choice.

Subversion

Subversion is used because Oracle SQL Developer Data Modeler only supports this version control tool. Luckily, there is a Git Subversion bridge that allows you to treat a Git repository as a Subversion repository.

Another advantage of Subversion is that it allows you to use the Maven SCM plugin with the scm:update command. This comes in handy when you are on a Citrix server and have no way to use the command line to clone (check out) the repository. Then it is very useful to scm:checkout your repository once and update it later. For one reason or another the scm:update command does not seem to work with the pure Java Git implementation of the Maven SCM plugin (remember: no command-line Git allowed, so a Java implementation is needed).

So that’s why I add this code in the project parent POM:

<scm>
  <developerConnection>scm:svn:https://github.com/<user>/<project>.git/trunk</developerConnection>
</scm>

For more information, see Support for Subversion clients on GitHub.com.

Branching or not?

I am not a big fan of branching, especially not in a database environment. I prefer a development process where you develop your changes as feature toggles (feature on/off). The changes initially do not impact production code, thanks to constructs like conditional compilation (available since Oracle Database 10) and APEX Build Options (or, if nothing else is available, if/then/else), based on a configuration (for instance a package specification defining some boolean constants). Those constructions allow you to enable or disable parts of the code. In the database you could even go further using Edition Based Redefinition (EBR), but that seems only necessary for applications running 24×7. EBR allows you to run multiple versions of packages and views in parallel: depending on the application or the user, the desired version of each database object is selected.

Conclusion

In this article I have shown you the reasons for choosing Flyway, Maven, Git and Subversion. The integration between them is (very) good, and thus they allow you to do Continuous Integration.

Stay tuned!

All articles in this series

Subject - Link
Introduction - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”



How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”, I told you about Git, Subversion, Maven and Flyway.

In this final article, I will discuss the following tools & methods: Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps.

Oracle SQL Developer

Oracle SQL Developer is a free, integrated development environment that simplifies the development and management of Oracle Database in both traditional and Cloud deployments. SQL Developer offers complete end-to-end development of your PL/SQL applications, a worksheet for running queries and scripts, a DBA console for managing the database, a reports interface, a complete data modeling solution, and a migration platform for moving your 3rd party databases to Oracle.

I use Oracle SQL Developer as my main database development environment but I will only discuss its External Tools section here because I needed that to launch Maven builds on a Citrix server. There the only way to launch Maven was via Oracle SQL Developer and the Java executable. The command line was forbidden, a security measure I can understand and approve of but as a developer I needed a way to launch the installation of my applications.

In the examples I assume that an environment variable MAVEN_HOME is defined that points to the Maven home directory.

External tools

The External Tools section allows you to define your own tool (command) that can be invoked either contextually (right-click in an editor) or from a menu.

Go to the Tools menu and click External Tools…: External Tools menu

The tool Maven scm:update install

This command is used to update a project from a source repository and install the package contents created by Maven in the local Maven repository. I have a Maven Oracle build project stored in a (private) GitHub source repository that is used by all other Maven Oracle application projects. This Maven Oracle build project needs to be installed (locally) before it can be referenced as a dependency by an application project.

I intend to create a public open-source Maven Oracle build project later on. This project contains:

  • a Maven POM file for installing database migrations via Flyway
  • a Maven POM file for exporting and importing APEX applications
  • Ant and Perl scripts to support the above functionality

To create this external tool use executable java and these arguments:

-classpath "${env:var=MAVEN_HOME}\boot\plexus-classworlds-2.6.0.jar" "-Dclassworlds.conf=${env:var=MAVEN_HOME}\bin\m2.conf" "-Dmaven.home=${env:var=MAVEN_HOME}" "-Dlibrary.jansi.path=${env:var=MAVEN_HOME}\lib\jansi-native" "-Dmaven.multiModuleProjectDirectory=${file.dir}" org.codehaus.plexus.classworlds.launcher.Launcher clean scm:update scm:status scm:check-local-modification install

The Run directory is:

${file.dir}

This shows it all: Maven scm:update install

The next window allows you to define a meaningful name in the Display window: Display

And finally you have to use these settings in Integration: Integration

This will allow you to invoke a tool (Maven command) from a POM file using the right-click menu: Right Click menu

The tool Maven db

This command installs the database migration scripts using Flyway.

As you may remember from “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”, the project layout has a db folder below the project root folder. Hence Maven is invoked one level below the root, as you can see from -Dmaven.multiModuleProjectDirectory (set to ${file.dir}/..) in the following arguments:

-classpath "${env:var=MAVEN_HOME}\boot\plexus-classworlds-2.6.0.jar" "-Dclassworlds.conf=${env:var=MAVEN_HOME}\bin\m2.conf" "-Dmaven.home=${env:var=MAVEN_HOME}" "-Dlibrary.jansi.path=${env:var=MAVEN_HOME}\lib\jansi-native" "-Dmaven.multiModuleProjectDirectory=${file.dir}/.." org.codehaus.plexus.classworlds.launcher.Launcher compile flyway:repair flyway:migrate ${promptl:=option1} ${promptl:=option2}

The tool Maven apex

This command allows you to export or import an APEX application.

The apex folder is at the same level as the db folder, so -Dmaven.multiModuleProjectDirectory is the same, but the goals differ:

-classpath "${env:var=MAVEN_HOME}\boot\plexus-classworlds-2.6.0.jar" "-Dclassworlds.conf=${env:var=MAVEN_HOME}\bin\m2.conf" "-Dmaven.home=${env:var=MAVEN_HOME}" "-Dlibrary.jansi.path=${env:var=MAVEN_HOME}\lib\jansi-native" "-Dmaven.multiModuleProjectDirectory=${file.dir}/.." org.codehaus.plexus.classworlds.launcher.Launcher compile ${promptl:=option1} ${promptl:=option2}

External Tools defined

You will see this after you have defined the External Tools: External Tools

utPLSQL

In the Java developer community it is best practice to have unit tests. Unfortunately that is not yet the case in the Oracle developer community. Originally created some 20 years ago by Steven Feuerstein, utPLSQL is now in its third version and is really very useful: easy to use, and the documentation is excellent. It even integrates with Maven thanks to the utPLSQL-maven-plugin. And there is a plugin for SQL Developer too; see Running utPLSQL Tests in SQL Developer.

utPLSQL also has a code coverage report.

Do not wait any longer, just unit test it!

I add unit test code to my packages and the test code is conditional, guarded by a global configuration package specification with some constants defined for testing and debugging. This way it is easy to deactivate testing code in production (where utPLSQL should not be installed).

The following sections show an example.

Conditional compilation

As I said above the test code is conditional (only active when utPLSQL is installed in a development environment) and I use Oracle conditional compilation for that.

I will quote from this article from oracle-base.com about conditional compilation:

Conditional compilation allows PL/SQL code to be tailored to specific environments by selectively altering the source code based on compiler directives. It is considered a new feature of Oracle 10g Release 2, but is available in Oracle 10g Release 1 (10.1.0.4.0).

Compiler flags are identified by the “$$” prefix, while conditional control is provided by the $IF-$THEN-$ELSE syntax.

$IF boolean_static_expression $THEN text
  [ $ELSIF boolean_static_expression $THEN text ]
  [ $ELSE text ]
$END

Configuration package

This package specification defines the boolean constants that can be used by conditional compilation as you will see in the two following sections:

create or replace package cfg_pkg
is

c_debugging constant boolean := false;
  
c_testing constant boolean := true;

end cfg_pkg;

A package specification with a function/procedure to unit test

A unit test procedure must be preceded by the annotation:

--%test

as you can see below:

create or replace package data_api_pkg authid current_user is

/**
 * Raise a generic application error.
 *
 * @param p_error_code  For instance a business rule
 * @param p_p1          Parameter 1
 * @param p_p2          Parameter 2
 * @param p_p3          Parameter 3
 * @param p_p4          Parameter 4
 * @param p_p5          Parameter 5
 * @param p_p6          Parameter 6
 * @param p_p7          Parameter 7
 * @param p_p8          Parameter 8
 * @param p_p9          Parameter 9
 */
procedure raise_error
( p_error_code in varchar2
, p_p1 in varchar2 default null
, p_p2 in varchar2 default null
, p_p3 in varchar2 default null
, p_p4 in varchar2 default null
, p_p5 in varchar2 default null
, p_p6 in varchar2 default null
, p_p7 in varchar2 default null
, p_p8 in varchar2 default null
, p_p9 in varchar2 default null
);

$if cfg_pkg.c_testing $then

--%suite

--%test
procedure ut_raise_error;

$end

end data_api_pkg;

You can use any name you like for the unit test procedure, but adding the ut_ prefix was the convention of previous versions of utPLSQL.

Please note that the unit test procedure ut_raise_error is only defined when cfg_pkg.c_testing is true due to the conditional compilation construction.

A package body with a function/procedure to unit test

create or replace package body data_api_pkg
is

procedure raise_error
( p_error_code in varchar2
, p_p1 in varchar2 default null
, p_p2 in varchar2 default null
, p_p3 in varchar2 default null
, p_p4 in varchar2 default null
, p_p5 in varchar2 default null
, p_p6 in varchar2 default null
, p_p7 in varchar2 default null
, p_p8 in varchar2 default null
, p_p9 in varchar2 default null
)
is
  l_p varchar2(32767);
  l_error_message varchar2(2000) := '#' || p_error_code;
begin
$if cfg_pkg.c_debugging $then
  dbug.enter($$PLSQL_UNIT || '.RAISE_ERROR');
  dbug.print
  ( dbug."input"
  , 'p_error_code: %s; p_p1: %s; p_p2: %s; p_p3: %s; p_p4: %s'
  , p_error_code
  , p_p1
  , p_p2
  , p_p3
  , p_p4
  );
  dbug.print
  ( dbug."input"
  , 'p_p5: %s; p_p6: %s; p_p7: %s; p_p8: %s; p_p9: %s'
  , p_p5
  , p_p6
  , p_p7
  , p_p8
  , p_p9
  );
$end

  if p_error_code is null
  then
    raise value_error;
  end if;

  <<append_loop>>
  for i_idx in 1..9
  loop
    l_p :=
      case i_idx
        when 1 then p_p1
        when 2 then p_p2
        when 3 then p_p3
        when 4 then p_p4
        when 5 then p_p5
        when 6 then p_p6
        when 7 then p_p7
        when 8 then p_p8
        when 9 then p_p9
      end;

    l_error_message := l_error_message || '#' || l_p;
  end loop append_loop;

  -- strip empty parameters from the end
  l_error_message := rtrim(l_error_message, '#');

  raise_application_error(c_exception, l_error_message);
  
$if cfg_pkg.c_debugging $then
  dbug.leave;
exception
  when others
  then
    dbug.leave_on_error;
    raise;
$end  
end raise_error;

$if cfg_pkg.c_testing $then

procedure ut_raise_error
is
  l_msg_exp varchar2(4000 char);
begin
  for i_idx in 1..6
  loop
    begin
      case i_idx
        when 1
        then l_msg_exp := '#'; raise_error(null);             
        when 2
        then l_msg_exp := '#abc'; raise_error('abc');
        when 3
        then l_msg_exp := '#def#p1'; raise_error('def', p_p1 => 'p1');
        when 4
        then l_msg_exp := '#ghi##p2'; raise_error('ghi', p_p2 => 'p2');
        when 5
        then l_msg_exp := '#jkl#########p9'; raise_error('jkl', p_p9 => 'p9');
        when 6
        then l_msg_exp := '#MNO#a#b#c#d#e#f#g#h#i'; raise_error('MNO', p_p1 => 'a', p_p2 => 'b', p_p3 => 'c', p_p4 => 'd', p_p5 => 'e', p_p6 => 'f', p_p7 => 'g', p_p8 => 'h', p_p9 => 'i');
      end case;
      raise program_error;
    exception
      when others
      then
        case i_idx
          when 1
          then
            ut.expect(sqlcode, 'sqlcode '|| i_idx).to_equal(-6502);
            
          else
            ut.expect(sqlcode, 'sqlcode '|| i_idx).to_equal(c_exception);
            ut.expect(sqlerrm, 'sqlerrm '|| i_idx).to_be_like('%'||l_msg_exp||'%');
        end case;
    end;
  end loop;
end ut_raise_error;

$end

end data_api_pkg;

Please note that the unit test procedure ut_raise_error is only defined when cfg_pkg.c_testing is true due to the conditional compilation construction. And the debugging sections with package dbug are also only active when the cfg_pkg.c_debugging constant is true.
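Assuming utPLSQL is installed in the development database, a quick way to run this suite from SQL Developer or SQLcl is the following minimal sketch:

set serveroutput on size unlimited
begin
  ut.run('data_api_pkg');
end;
/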

SonarQube

A small introduction:

SonarQube empowers all developers to write cleaner and safer code. Join an open community of more than 200k dev teams. Enhance your workflow with Continuous Code Quality & Code Security: thousands of automated Static Code Analysis rules, protecting your app on multiple fronts, and guiding your team.

Your teammate for Code Quality and Code Security

So where does SonarQube fit in the DevOps picture? I consider it a complement to utPLSQL: it covers the code quality and security part, where utPLSQL covers the functionality part of your application.

So SonarQube allows you to make code quality and security part of your development and test cycle and it will warn you when there is a defect of any kind. If you use it with a build tool like Maven it will stop the build when there is a defect since that is treated as an error. So you have immediate feedback.

As shown by the SonarQube PL/SQL rules, this static code analysis tool improves the quality of your code. It is a lint-like tool; other PL/SQL lint checkers are described in this article by Steven Feuerstein.

Here are some rules:

  • Inserts should include values for non-null columns
  • Predefined exceptions should not be overridden

But I really question this rule:

  • Quoted identifiers should not be used

SonarQube allows you to define your own rules and can be run by Maven using SonarScanner for Maven.

Perl

A long-time personal favorite since I started developing. Maybe not as sexy as Python, but it works for me. I use it as soon as I need to perform operating system tasks like working with files and so on.

Some people may prefer Bash (Unix/Linux) or PowerShell (Windows) scripting, but I like my development tools to be platform-independent so I can use them in any project.

Ant

Another long-time favorite that I use for executing simple tasks on the operating system where dependencies between tasks are necessary. I use Ant to export and import APEX applications: it invokes the Oracle SQLcl client with the appropriate switches (a sketch follows after the quote below). Ant integrates well with Maven thanks to the Apache Maven AntRun Plugin.

SQLcl is a new Java-based command-line interface for Oracle Database.

By Jeff Smith, September/October 2015
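As a rough, hedged illustration of the Ant-plus-SQLcl combination (the target name, properties and script file are all hypothetical; the referenced apex_export.sql would contain an apex export -applicationid command followed by an exit):

<target name="apex-export">
  <!-- start SQLcl silently and run the export script -->
  <exec executable="sql" failonerror="true">
    <arg value="-S"/>
    <arg value="${apex.schema}/${apex.password}@${db.connect}"/>
    <arg value="@apex_export.sql"/>
  </exec>
</target>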

DevOps

DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality. DevOps is complementary with Agile software development; several DevOps aspects came from the Agile methodology.

DevOps, Wikipedia

From the same Wikipedia page I repeat this list of DevOps processes: coding, building, testing, packaging, releasing, configuring and monitoring.

I have shown you all the tools and techniques you may need plus all the pitfalls you may encounter while setting up DevOps (or Continuous Integration / Delivery / Deployment). I have used Jenkins in the past.

The following table shows the match between DevOps processes and tools:

DevOps process - Tools
Coding - PL/SQL, front-end like APEX, SQL Developer (Data Modeler), Ant, Perl
Building - Maven (using Flyway, Ant)
Testing - Maven (using utPLSQL, SonarQube)
Packaging - Maven (using Artifactory, Nexus or GitHub)
Releasing - Not covered by any of the tools I have described
Configuring - Not covered by any of the tools I have described
Monitoring - Not covered by any of the tools I have described
DevOps processes and tools

Please note that I have not mentioned Jenkins as a tool above. Jenkins is the tool to invoke Maven on a remote integration server. Locally you do not need Jenkins: you just run the appropriate Maven commands from the command line.

An important aspect that I may not have mentioned before is that a Maven build fails as soon as there is an error. The same is true for a Jenkins build, which fails if one of its build steps fails.

Conclusion

This was the final article on “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end)”. I hope to have the inspiration, time and support to write a book about it some day…

I hope you have enjoyed it!

All articles in this series

Subject - Link
Introduction - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

What’s in a name?

A title is important and I hope that it describes well what I do want to share with you in this series of articles. It is not so much about how to use the back-end part (Oracle Database) or the front-end (Oracle APEX, Java, Node, React or ADF). It is much more about the tools, techniques and best practices around them in order to build, deploy and maintain an Oracle PL/SQL application efficiently and correctly.

I have used the word build on purpose, and not something like develop, because I see an analogy with building a house. You don’t build a house by just buying parts like a door and some tools. No, you need a plan, an architecture if you prefer. And how often do I see people begin by creating a table and a UI screen and then think they are doing well. Maybe their boss/client is happy because s/he sees something visible, but IMO they just started without a plan. You just DON’T start with a door and some tools when you need to build a house, so do not make the same mistake when you build an Oracle application.

And there are other build analogies:

  • the tool Ant needs a build file to execute tasks;
  • Unix programs are usually installed after being built from source.

Why this subject and why now?

Some time ago a former manager of mine complimented me on LinkedIn for a blog called About Oracle apex and translations (2). His message was that it is good to share knowledge, as he is now doing himself too. And yeah, I whole-heartedly agree. And when my current boss wanted me to write about how I build Oracle APEX applications from beginning to end, I thought: let’s do it before I leave the company. The funny thing is that I was introduced to my current boss by another former manager who shares the same first name as the first one. Thanks Harm I and II, for your gentle words. Thank you boss, for pushing me to write about “How to build an Oracle Database application”.

Early in my career I already invented solutions to avoid installing applications manually. Sometimes the boss/manager/team did not see the added value right away, but after some time they got convinced (well, almost all of them). In the Oracle Designer era, I repeated this while working on an assignment for the ING bank in Amsterdam. And in 2015, when I was working for pension fund MN in The Hague, The Netherlands, I joined the Continuous Integration team and started to assemble the ideas I am going to present to you here.

Actually I think I have enough material for a book, or maybe even more. But let’s just start with a blog and see what comes of it. I won’t dive too much into details, but I assure you that with the help of my ideas you will be better prepared to build a serious Oracle PL/SQL application. And you can always contact or hire me if you need more explanation :).

Philosophy

I believe very much in the Unix approach to handling tasks. In Unix there are a lot of simple tools that each, on their own, perform their task very well. I like that approach so much because it enables you to use the same tools over and over again. A win-win for you and your boss or client. Of course you may decide to replace a tool, but the idea is clear. And to be honest, I do not like having to learn another methodology or tool every year. I advance, but not too fast and not too late either. I assume that you do not rebuild your house every time there are new techniques.

An example of this Unix approach is deploying your Oracle application software. I use Flyway because it is a tool based on a simple idea: it executes database migration scripts automatically and stores the result of each execution in a history table, so Flyway knows what has been installed and can determine what to install the next time. It is not even necessary to install version N+1 first if the latest version installed was N; you can immediately continue with N+2 (or N+3 or …). Simple and predictable, so I see no reason to use another tool. I hope this convinces you to never again execute database migration scripts manually.

Of course I have looked at the Supporting Objects feature of APEX, but I think it is only suitable for (demo) applications with a small number of database objects that do not change. As soon as you build a real application you will create a lot more database objects like packages and views, and it becomes too difficult to use APEX Supporting Objects, at least in my opinion. And do not forget that APEX is just the front-end, so if you decide to replace it with another front-end you also have to find another tool to run the migration scripts.

So embrace the Unix philosophy and use Flyway to run database migration scripts. I will describe Flyway in more detail in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”.

Another important point is to use the power of the Oracle Database. It is an expensive product but very powerful, so use it thoroughly and get used to it. Take lessons and courses, read books, read blogs: invest in it. It will really help you to build better.

The last point is that we should be very vigilant regarding security, so just apply all the best practices there are.

Standing on the shoulders of giants

If I have seen further it is by standing on the shoulders of Giants.

Isaac Newton in 1675

I have to mention Oracle gurus like Tom Kyte and Steven Feuerstein, but I should surely mention a fellow Dutchman, Rob van Wijk, who has written blogs about implementing business rules in 2008 and Professional Software Development using Oracle Application Express in 2013. We are living in 2021 now and things have advanced, but I have used his ideas to lay the foundation.

Architecture

Database structure

As always a picture is worth a thousand words so I will show the picture first:

Schema structure by Rob van Wijk

These layers are schemas in the database and folders in our application project environment.

Quoting Rob van Wijk:

This layered approach is a choice we’ve made to enhance security and flexibility in our applications. The three schemas only have a minimal set of system privileges, just enough to create the object types needed for that layer.

DATA

This is the schema that contains the data: the tables and all objects needed to maintain the data logic. You may decide to put data logic packages in the API layer but that is up to you.

The schema structure allows the UI layer to use objects from the DATA layer, but I think that should only be allowed for read access to simple tables, think of Lists Of Values. All business logic should go via the API layer. It is simple to define a view or package in the API layer that can be used for DML purposes, as sketched below.
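A minimal sketch of this idea (the table and view names are hypothetical): the DATA schema exposes a simple lookup table read-only to UI, while all DML goes through a view owned by API.

-- executed in (or on behalf of) the DATA schema:
-- read-only access for the UI layer, full access (grantable) for the API layer
grant select on data.countries to ui;
grant select, insert, update, delete on data.employees to api with grant option;

-- executed in (or on behalf of) the API schema:
-- a view through which the UI layer performs its DML
create or replace view api.employees_v
as
select e.id
,      e.name
,      e.salary
from   data.employees e;

grant select, insert, update, delete on api.employees_v to ui;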

API

This is the schema that contains the business logic. It may contain data logic packages if you do not want to have packages in the data layer.

UI

All User Interface logic. When the front-end is APEX this means that this schema will be the parsing schema. Please note that you can have more than one parsing schema per APEX workspace so there is no problem having several applications with different parsing schemas in a workspace.

EXT

This is an EXTernal layer I have added to the structure. It is meant for external logic: interfaces or data conversions. Please note that setting up a new system almost always requires you to import data from another source. This layer can take care of that. If your application has to support several customers you may even have a layer for each customer. The level of this layer is the same as the API layer. It can interact with the API layer in a bidirectional way. After all, the external layer may use business logic and business logic may use an interface from this layer. The UI layer may use objects from this layer too.

Tools, techniques and best practices

Oracle Database and Oracle APEX

I have used VirtualBox and the prebuilt virtual machine Database App Development VM from Oracle for my development environment. I strongly believe in having a separate database for each developer while developing, since I do not want to interfere with others and I do not want others to interfere with me while I work. At a later stage you can always use an integration or test database to see if everything works well together.

Keep in mind that you cannot import an APEX application into another APEX instance if the exported version is higher than the version of the APEX instance you import into. So an application exported from APEX 19.2 will not import into APEX 18.2. Therefore, align all your APEX versions from development through production.

Oracle SQL Developer Data Modeler

Maybe less well known than its big brother Oracle SQL Developer, but a tool that allows you to build a great model (plan) of your database application. You can use various modeling techniques like Entity Relationship Modeling and a lot more. It even allows you to create database scripts or migration scripts that you can use with Flyway.

A book I can recommend is Oracle SQL Developer Data Modeler for Database Design Mastery by Heli Helskyaho.

When you use SQL Developer Data Modeler, it is better to use one modeling project for all your applications, so you can share your configuration more easily between projects and developers.

Version control

This is absolutely necessary IMHO. Use whatever tool you like, Git or Subversion for instance, but use it. I cannot tell you how often I needed to compare a script with an older version, but it was often. And sometimes I just had to throw away a concept and start over. And that is just a small part of the advantages of a version control tool. When you work in a team it is a sine qua non.

SQL Data Modeler only supports Subversion but sites like GitHub support both Git and Subversion.

Maven

Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project’s build, reporting and documentation from a central piece of information.

So Maven will be the tool to automate several tasks like running Flyway, exporting and importing APEX applications or running unit tests.

Flyway

Already described above; it integrates very well with the tools mentioned here.

Oracle SQL Developer

Oracle SQL Developer is a free, integrated development environment that simplifies the development and management of Oracle Database in both traditional and Cloud deployments. SQL Developer offers complete end-to-end development of your PL/SQL applications, a worksheet for running queries and scripts, a DBA console for managing the database, a reports interface, a complete data modeling solution, and a migration platform for moving your 3rd party databases to Oracle.

Oracle SQL Developer

So this tool is already a great asset for a database developer but it is absolutely necessary when your DBA only allows you to access this tool and Java in a Citrix environment where the command line or Maven is forbidden. After all, Maven is just launching Java with some command line options. And you can launch a program from the SQL Developer External Tools.

utPLSQL

A PL/SQL unit testing framework originally developed by Steven Feuerstein, now at version 3. An impressive piece of work and easy to use. In the Java community it is normal to write unit tests, but not so in the Oracle community. This tool may convince you!

SonarQube

A tool that might help with PL/SQL static code analysis is SonarQube. Used in combination with utPLSQL this tool will improve the quality of your application code. Please note that this tool is not open-source.

Perl

I learned Perl, the Practical Extraction and Report Language, a long time ago and it still helps me with scripting tasks. So there is no reason for me to switch to Python or something else.

Ant

Ant was already mentioned above; it interacts well with Maven and is sometimes simpler to use than Maven.

DevOps

DevOps is a set of practices that works to automate and integrate the processes between software development and IT teams, so they can build, test, and release software faster and more reliably.

Atlassian

The tools and techniques I have described can be used in setups ranging from simple to complex: from a single person running Maven from the command line to a team using Jenkins and the free artifact repository Nexus to build a Continuous Deployment pipeline based on Maven.

Conclusion

I hope I have given you enough appetite to continue reading this series of articles about building Oracle applications. Apart from the Oracle Database almost all tools are open source (and mature) so you can use that argument to convince your boss. And some tools also have a (paid) support option if that is needed.

Stay tuned!

All articles in this series

Subject - Link
Introduction - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps - “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”


GitHub Actions and SonarCloud


GitHub Actions allow you to do most CI/CD tasks for free, directly from your GitHub repository. One of the challenges, however, is that there is no built-in facility like, for example, SonarQube to manage code quality. Luckily, SonarSource provides SonarCloud: a SonarQube SaaS offering which is free for public projects! It is also easy to feed SonarCloud from GitHub Actions. In this blog post I’ll describe how you can do this.

GitHub repository

I used the following GitHub repository with some Java code to generate code quality information. In order to do that I had the following entries in my pom.xml file:
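The embedded snippet is not reproduced here; a minimal sketch of the kind of entries involved (the organization key is a placeholder and the plugin version is omitted) would be:

<properties>
  <sonar.organization>my-github-org</sonar.organization>
  <sonar.host.url>https://sonarcloud.io</sonar.host.url>
</properties>
<build>
  <plugins>
    <plugin>
      <!-- SonarScanner for Maven; version omitted in this sketch -->
      <groupId>org.sonarsource.scanner.maven</groupId>
      <artifactId>sonar-maven-plugin</artifactId>
    </plugin>
  </plugins>
</build>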

SonarCloud configuration

In order to feed data to SonarCloud, some configuration needs to be done on the SonarCloud and GitHub side. First login to SonarCloud using your GitHub account.

Next you have to authorize SonarCloud:

You can now add a GitHub organization you are using to SonarCloud by clicking + next to your account.

I chose my personal organisation. SonarCloud will be installed as a GitHub App for that organization.

You can grant SonarCloud access to your repository.

In SonarCloud you can now create an organization.

And analyse a new project.

When you click Set Up, SonarCloud suggests doing analysis with GitHub Actions (which is of course fine by me).

In your GitHub repository, you need to create a token so GitHub can access SonarCloud:

GitHub Actions

Conveniently, SonarCloud provides instructions on what you need to do to allow GitHub Actions to feed SonarCloud. These include updating your pom.xml file to specify the target for the SonarSource plugin and creating a workflow (or adding some analysis-specific actions to an existing one). Specific to the analysis are the shallow clone option, the SonarCloud artifact cache and of course the build-and-analyse step.

In the example workflow given by SonarCloud, the build is triggered on every commit. I changed this to a manual trigger. You can browse my workflow definition here.

After the results have been fed to SonarCloud, you can browse them there:

Limitations

There are of course some limitations on usage for the free GitHub and SonarCloud accounts. Besides that, SonarCloud does not allow 3rd party plugins. It is a SaaS offering, and allowing 3rd party plugins would cause an additional burden in managing the environment, as well as possible licensing issues. For some code quality aspects, however, using 3rd party plugins is currently the only option. Examples of these are the OWASP Dependency-Check and OWASP ZAP. Processing output of those tests is currently not supported in SonarCloud. You can however feed it with SpotBugs (the spiritual successor of FindBugs), PMD and code coverage data. To work around the 3rd party plugin limitation, you could possibly convert the Dependency-Check and ZAP data, merge it with the SpotBugs/PMD output and feed that to SonarCloud. I haven’t tried that yet, however.


SonarCloud: OWASP Dependency-Check reports


SonarCloud is a hosted SonarQube SaaS solution which helps you with code quality management. It is free to use for open source projects. You cannot install 3rd party plugins in SonarCloud, however. This puts some limitations on the kind of data you can put into SonarCloud; for Java this is limited to Checkstyle, PMD and SpotBugs reports. OWASP provides a Dependency-Check plugin to identify vulnerable dependencies in, for example, your pom.xml file. In this blog post I’ll show how to get OWASP Dependency-Check data into SonarCloud without using a 3rd party plugin! Disclaimer: this solution has been created in ~2 hours and has not been seriously tested, optimized or used in production environments. Use at your own risk!


Method used

SonarCloud can import CheckStyle, PMD and SpotBugs result data. The output XML files generated by those plugins conform to a specific format. SpotBugs and PMD provide an XSD for that; CheckStyle doesn’t have one (read here). The Dependency-Check results also have an XSD (here).

I checked out the different XSDs and decided the PMD XSD was easiest to use. I created an XSLT transformation to transform the Dependency-Check result to a PMD result file and send that to SonarCloud. Although SonarCloud displayed the ‘Vulnerabilities’ as ‘Code Smells’ without tags, the results are definitely usable!

Build process

In my pom.xml, the Dependency-Check report first needed to be generated before I could perform a transformation. When performing the transformation, I needed XSLT 2.0 support to easily get the current date/time for a timestamp, which required an additional dependency. You can take a look at my pom.xml file here. I executed

“mvn -B verify org.sonarsource.scanner.maven:sonar-maven-plugin:sonar -Dsonar.java.pmd.reportPaths=target/pmd.xml,target/dependency-check-report-pmd.xml”

to generate the report and send it to SonarCloud. Notice that you can specify a comma-separated list of PMD files to send. Check out my GitHub workflow for more details on the exact build process and, if you’re interested, this blog post on how I set up the GitHub Actions and SonarCloud interaction.

Relevant code from my pom.xml file

Transformation

I created the following transformation (dependencycheck_to_pmd.xsl) which you can download here:

Transforming a Dependency-Check report to a PMD report
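The downloadable stylesheet is not reproduced here. As an illustration only, a much simplified sketch of such a Dependency-Check to PMD transformation could look like the following; the matched element names (vulnerability, name, baseScore, description), the CVSS-to-priority thresholds and the NVD link pattern are assumptions, not necessarily what the actual dependencycheck_to_pmd.xsl does:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Simplified sketch only: element names in the Dependency-Check report, the severity
     thresholds and the NVD URL pattern are assumptions, not the actual stylesheet. -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <pmd version="dependency-check" timestamp="{current-dateTime()}">
      <!-- every issue is reported against pom.xml, since the offending JAR itself
           is not part of the analysed sources -->
      <file name="pom.xml">
        <xsl:for-each select="//*[local-name()='vulnerability']">
          <xsl:variable name="score" select="number((.//*[local-name()='baseScore'])[1])"/>
          <violation beginline="1" endline="1" begincolumn="1" endcolumn="1"
                     ruleset="dependency-check"
                     rule="{(*[local-name()='name'])[1]}"
                     externalInfoUrl="https://nvd.nist.gov/vuln/detail/{(*[local-name()='name'])[1]}"
                     priority="{if ($score ge 9) then 1
                                else if ($score ge 7) then 2
                                else if ($score ge 4) then 3
                                else 4}">
            <xsl:value-of select="concat((*[local-name()='name'])[1],
                                         ' (CVSS ', $score, '): ',
                                         (*[local-name()='description'])[1])"/>
          </violation>
        </xsl:for-each>
      </file>
    </pmd>
  </xsl:template>
</xsl:stylesheet>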

I did encounter some challenges:

  • The current-dateTime() function requires XSLT 2.0, which required an additional dependency (Saxon) in my pom.xml file.
  • Transforming the CVSS3 rating to a PMD severity rating. PMD uses 1 for the highest severity and 5 for the lowest; CVSS3 uses 10 for the highest and 0 for the lowest.
  • The file an issue refers to is required to exist in your code. Supplying the JAR file which causes the issue did not work, so I set it to my pom.xml file.
  • Required fields like the line number: 0 is not allowed, so I set them to 1. Determining the exact line in the pom.xml that caused a specific dependency to be included did not seem easy to do.
  • I did not find the externalInfoUrl in SonarCloud in a location I could click on; you have to go to the NVD site yourself and look up the issue if you want more information.

Result

The result of feeding the resulting PMD results file to SonarCloud was that I could see the issues with correct severity in SonarCloud with several interesting fields in the description like the CVSS score and CVE code.

SonarCloud displaying Dependency-Check results (as a transformed PMD report)

This does look a bit worse than using a ‘native’ Dependency-Check report and a 3rd party plugin in SonarQube. For example, tags are missing and the issues are reported as “Code Smell” instead of “Vulnerability”. Also, more vulnerabilities are reported when using this method compared to the SonarQube setup. I have not looked into this in more detail, but since they refer to the same file, fixing the vulnerable dependency will probably get rid of all related issues.

SonarQube with a 3rd party Dependency-Check plugin

The post SonarCloud: OWASP Dependency-Check reports appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

Preparation for migrating data to Oracle Virtual Private Database


 


Introduction

Recently I was part of a team involved in the preparation of migrating data belonging to multiple business units into a single Oracle 19c database with Virtual Private Database (VPD). The VPD solution is used for the virtual separation of the data per business unit.

In this article I will be describing some of the issues that we encountered when preparing for the migration of ‘similar’ business data into one target Oracle database that has Virtual Private Database enabled. Possible solutions for these issues are described as well. Similar in this case means that the source databases contain identical schemas and only differ in the data that is stored in the various tables. Each single database contains the data of one specific business unit.

After migration, all the data retrieved from the source databases will be housed in one target database. The data will be virtually separated per business unit, which is done by means of the Oracle Virtual Private Database functionality.

The aim of this migration is a huge cost saving for the business: think of the decrease in Oracle license costs, but also of the reduced effort needed from the support department, which now only has to manage one database instead of several. Another gain is that software only needs to be rolled out once. Instead of provisioning software to multiple separate databases that are in operation for the different business units, this now only needs to be done once. As a result, the business units will all be using the same (latest) software versions, unless decided otherwise for business reasons.

 


Quick links to the parts of this article:

Virtual Private Database

How this VPD example works

Example

Enabling the virtual private database mechanism

Loading the data of a business unit into the target VPD database

Cleanup after loading the data

Some of the issues to address

Data structures

Primary and foreign key constraint definitions

Sequences

Unique indexes

Function based indexes

Unique key constraints

Unique key constraints and indexes that are based on nullable columns

Case vs Decode

Columns that have the Long type

Loading the data

Virtual columns

Result and wrap up


Virtual Private Database

Oracle Virtual Private Database creates security policies to control database access at the row and column level. It was introduced with Oracle Database version 8i and is still available in current versions today (at the time of writing Oracle Database 19c). A simple example is a VPD that only restricts access to data during business hours. A more complex example is a VPD that reads an application context via a logon trigger and enforces row level security on the different tables in the database schemas. The latter is what will be addressed in this article.

How this VPD example works

So, the data belonging to the different business units will be loaded into one VPD enabled database. Before loading the data, each table in this scenario is enhanced once with an additional column that indicates the business unit that owns the data. This column is named BUSINESS_UNIT_VPD_ID in the examples below.

A unique service is defined once for each business unit that has its data loaded into the VPD enabled database. Connections to the database for a specific business unit are made via this dedicated service. This makes it possible to have a logon trigger fired upon each connection, which sets a value unique to this business unit in the session context.

The value for the BUSINESS_UNIT_VPD_ID column (which is added once to the tables that are part of the VPD solution) is determined automatically through logic set as a default column value. This logic retrieves the value for the business unit from the session context mentioned above.

Example

In this example there are three services, each for a specific business unit:

  • Service1 is bound to business unit 100;

  • Service2 is bound to business unit 200;

  • Service3 is bound to business unit 300.

Suppose the VPD enabled database contains a table called VPD_EXAMPLE that contains the following data (as seen by the SYS user, who sees all data rows).

VPD_EXAMPLE

SOME_COLUMN    BUSINESS_UNIT_VPD_ID
-----------    --------------------
Foo            100
Bar            200
Baz            300

Now, if a connection is made via Service1 with a user other than SYS, only the Foo row will be listed as a query result. The same applies to the other two services, which will only list the Bar and Baz rows respectively.
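For example (a sketch, assuming a session connected through Service1 and the policy shown further below in place):

-- connected via Service1 (business unit 100), as a non-exempt user
SELECT some_column FROM vpd_example;
-- returns only the row with SOME_COLUMN = 'Foo'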

 


Enabling the virtual private database mechanism

To enable the VPD mechanism on a database, a user that has the permissions to do so is needed. In the example below a user with the name VPD_ADMIN will be created and used for this.

For the creation of the VPD_ADMIN user connect to the target database as SYS AS SYSDBA.

CONN / AS SYSDBA
DEFINE vpdadminpw = 'SomeHardToGuessSecret'

Create a user that will be the admin for the VPD.

CREATE USER vpd_admin IDENTIFIED BY &vpdadminpw ACCOUNT UNLOCK
QUOTA UNLIMITED ON users
DEFAULT TABLESPACE users;
--
GRANT CONNECT, RESOURCE TO vpd_admin;

Grant the user the appropriate rights for creating VPD objects.

-- requirements for creating vpd objects
GRANT CREATE ANY CONTEXT, ADMINISTER DATABASE TRIGGER TO vpd_admin;
GRANT EXECUTE ON DBMS_SESSION TO vpd_admin;
GRANT EXECUTE ON DBMS_RLS TO vpd_admin;

-- requirements for managing services
GRANT EXECUTE ON dbms_service TO vpd_admin;
GRANT READ ON dba_services TO vpd_admin;
GRANT READ ON v_$active_services TO vpd_admin;

Now switch your connection to the vpd_admin user you just created.

CONNECT vpd_admin/&VPDADMINPW

Create a table for keeping track of the connections to the VPD enabled database.

CREATE TABLE connection_log (
  username varchar2(32)
, os_user varchar2(32)
, remote_ip varchar2(16)
, service varchar2(32)
, found_business_unit_service_name varchar2(20)
, found_business_unit_vpd_id number(6)
, reason varchar2(80)
, log_datetime date
);

Also create a table that binds service names to the VPD identifiers.

CREATE TABLE business_units (
  business_unit_service_name varchar2(20)
, business_unit_vpd_id number(6)
, primary key(business_unit_service_name, business_unit_vpd_id)
);

With the business_units table in place, now add the service names and the bound business unit identifiers into it.

INSERT INTO vpd_admin.business_units (business_unit_service_name, business_unit_vpd_id) values ('Service1', 100);
INSERT INTO vpd_admin.business_units (business_unit_service_name, business_unit_vpd_id) values ('Service2', 200);
INSERT INTO vpd_admin.business_units (business_unit_service_name, business_unit_vpd_id) values ('Service3', 300);
COMMIT;

Then create and start the services, so these can be used to make connections with them.

EXEC DBMS_SERVICE.CREATE_SERVICE('Service1', 'Service1');
EXEC DBMS_SERVICE.CREATE_SERVICE('Service2', 'Service2');
EXEC DBMS_SERVICE.CREATE_SERVICE('Service3', 'Service3');
EXEC DBMS_SERVICE.START_SERVICE('Service1');
EXEC DBMS_SERVICE.START_SERVICE('Service2');
EXEC DBMS_SERVICE.START_SERVICE('Service3');
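The policy function created in the next step, as well as the default value of the BUSINESS_UNIT_VPD_ID column, relies on the application context business_unit_ctx being set at logon. That part is not shown in the original; a minimal sketch could look like this (the names business_unit_ctx_pkg and trg_set_business_unit are assumptions, and logging to the connection_log table is left out for brevity):

CREATE OR REPLACE CONTEXT business_unit_ctx USING vpd_admin.business_unit_ctx_pkg;

CREATE OR REPLACE PACKAGE vpd_admin.business_unit_ctx_pkg AS
  PROCEDURE set_business_unit;
END business_unit_ctx_pkg;
/

CREATE OR REPLACE PACKAGE BODY vpd_admin.business_unit_ctx_pkg AS
  PROCEDURE set_business_unit
  IS
    l_service varchar2(64);
    l_vpd_id  business_units.business_unit_vpd_id%TYPE;
  BEGIN
    -- the dedicated service the session connected through determines the business unit
    l_service := SYS_CONTEXT('USERENV', 'SERVICE_NAME');

    SELECT bus.business_unit_vpd_id
      INTO l_vpd_id
      FROM business_units bus
     WHERE bus.business_unit_service_name = l_service;

    DBMS_SESSION.SET_CONTEXT('business_unit_ctx', 'business_unit_vpd_id', l_vpd_id);
  EXCEPTION
    WHEN NO_DATA_FOUND THEN
      -- no business unit bound to this service: leave the context empty
      -- (this would also be the place to write a row to the connection_log table)
      NULL;
  END set_business_unit;
END business_unit_ctx_pkg;
/

CREATE OR REPLACE TRIGGER vpd_admin.trg_set_business_unit
AFTER LOGON ON DATABASE
BEGIN
  vpd_admin.business_unit_ctx_pkg.set_business_unit;
END;
/

Depending on how the service names are stored, the comparison on business_unit_service_name may need to be made case insensitive.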

Also create the following function, which will be called from a policy that will be attached to each table that will be managed by the VPD mechanism.

CREATE OR REPLACE FUNCTION business_unit_vpd_policy_by_id
(p_schema IN VARCHAR2, p_table IN VARCHAR2)
   RETURN VARCHAR2
AS
BEGIN
   RETURN 'business_unit_vpd_id = SYS_CONTEXT(''business_unit_ctx'', ''business_unit_vpd_id'')';
END;

Attach a policy to each table that must have the VPD mechanism attached to it (so replace schema_name and table_name with actual values).

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'schema_name',
    object_name     => 'table_name',
    policy_name     => 'business_unit_vpd',
    function_schema => 'vpd_admin',
    policy_function => 'business_unit_vpd_policy_by_id',
    policy_type     => DBMS_RLS.CONTEXT_SENSITIVE,
    namespace       => 'business_unit_ctx',
    attribute       => 'business_unit_vpd_id'
  );
END;
/

Extend each of the tables that have the mentioned policy attached with an additional column BUSINESS_UNIT_VPD_ID, which may be invisible, and have it assigned the value of the business unit from the session context as a default.

ALTER TABLE table_owner.table_name
   ADD (business_unit_vpd_id NUMBER INVISIBLE
      DEFAULT sys_context('business_unit_ctx','business_unit_vpd_id')
      NOT NULL
   );

 


Loading the data of a business unit into the target VPD database

For loading the business unit data into the target VPD database, this data had to be exported first. In the project that I was part of, the schemas that contained the data to be migrated were exported with the EXPDP utility. I will not go into further detail on this utility; it is well documented on the internet.

The exported data of a business unit was then imported with the IMPDP utility, with the schema names altered: they were prefixed with a label (e.g. STG_). By doing so, the imported schemas do not interfere with the target schemas that hold the original names. The renaming of the schemas during import was done with the REMAP_SCHEMA parameter of the IMPDP utility. Again, see the internet for further details on its usage.

Then custom made scripts were run to generate the statements used for loading the data into the target schemas from the ‘prefixed’ source schemas. The generated script consists of the following:

  • statements to disable all the active constraints on the tables to be loaded
  • INSERT INTO .. SELECT FROM statements that query the data from the source schemas (the ones that were ‘prefixed’ through the REMAP_SCHEMA option) and insert it into the target schemas (the ones that have been adjusted with the VPD mechanism)
  • statements to re-enable all the active constraints on the tables to be loaded

After making a connection through the service of the business unit whose data must be loaded into the VPD enabled target database, the generated statements are executed. This connection is necessary because it sets the right value for BUSINESS_UNIT_VPD_ID in the session context. As explained before, this value is then used to set the value of this column in the tables.
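As an illustration only (the table, schema and constraint names below are placeholders, not the project’s actual generated code), the generated script for a single table could contain statements along these lines:

ALTER TABLE data_owner.some_table DISABLE CONSTRAINT some_table_fk1;

-- BUSINESS_UNIT_VPD_ID is not listed: it gets its value from the column default,
-- which reads the session context set at logon
INSERT INTO data_owner.some_table (column1, column2)
SELECT column1, column2
  FROM stg_data_owner.some_table;

COMMIT;

ALTER TABLE data_owner.some_table ENABLE CONSTRAINT some_table_fk1;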

Cleanup after loading the data

After the data for a business unit has been loaded successfully into the VPD enabled target schemas, the ‘prefixed’ schemas can be removed from the database.


Some of the issues to address

With the mentioned definitions above in place, the Virtual Private Database mechanism is enabled. But to be able to load the data of the different business units into the VPD enabled tables, some issues must still be addressed first. These issues are looked into next.

Data structure differences

One of the most difficult things to address is solving the differences in the data structures for the business units of which the data is to be migrated. In the project I was part of, the data structures that were in scope appeared to be identical, making this part a relatively easy one. 

Quite an amount of analysis will be needed though to find out the differences that may exist between the data sources of the business units. When differences are found, a strategy to cope with them is needed. Think, for starters, of bringing the structures of the business units to the same version first, if this is possible, and then see if differences still exist. Because applications may depend on these data structures and specific versions, this may be a huge project in itself.

Primary and foreign key constraint definitions

Loading the data of the different business units into a single VPD enabled database can result in duplicate data errors, because of failing constraint definitions. To overcome this issue, the primary key constraint and unique constraints defined upon the tables must be enhanced with the business unit identifier (the BUSINESS_UNIT_VPD_ID column, which was added to the tables in the last step of the previous section). In this section the focus is on the handling of primary key constraints. The handling of unique constraints is described later on.

As the definitions of the primary keys are enhanced with the business unit identifier (the BUSINESS_UNIT_VPD_ID column), the foreign key constraints that reference these primary keys consequently need to be extended as well.

To query all the primary keys constraints for the tables that have been altered previously, and thus now contain the BUSINESS_UNIT_VPD_ID column, the following query can be used:

SELECT cns.owner AS owner
     , cns.table_name AS table_name
     , cns.constraint_name AS constraint_name
     , LISTAGG( ccs.column_name, ', ') within group (order by ccs.position) AS columns_in_pk
  FROM dba_constraints cns
     , dba_cons_columns ccs
     , dba_tab_columns dtc
 WHERE cns.constraint_name = ccs.constraint_name
   AND cns.owner= ccs.owner
   AND cns.table_name = ccs.table_name
   AND cns.owner = dtc.owner
   AND cns.table_name = dtc.table_name
   AND dtc.column_name = 'BUSINESS_UNIT_VPD_ID'
   AND cns.owner = :b_owner
   AND cns.constraint_type = 'P'
 GROUP BY cns.owner
        , cns.table_name
        , cns.constraint_name;

To be able to extend a primary key constraint with an additional column (BUSINESS_UNIT_VPD_ID), the foreign key constraints that reference it must be dropped first. After extending the primary key, the foreign key constraints can be recreated from its current definition with the additional column added.

The next query can be used to find out the details that can/must be used for recreating the foreign key constraints that reference the primary key constraint which is passed in as a parameter:

SELECT cns.owner AS fkey_table_owner
     , cns.table_name AS fkey_table_name
     , cns.constraint_name AS fkey_constraint_name
     , cns.delete_rule AS delete_rule
     , LISTAGG( distinct ccs.column_name, ', ') WITHIN GROUP (order by ccs.position) AS columns_in_fkey
     , ccs_pk.owner AS pkey_table_owner
     , ccs_pk.table_name AS pkey_table_name
     , ccs_pk.constraint_name AS pkey_constraint_name
     , LISTAGG( distinct ccs_pk.column_name, ', ') within group (order by ccs_pk.position) columns_in_pkey
  FROM dba_constraints cns
     , dba_cons_columns ccs
     , dba_cons_columns ccs_pk
 WHERE cns.r_constraint_name = :b_pkey_constraint_name
   AND cns.r_owner= :b_pkey_owner
   AND cns.constraint_type = 'R'
   AND cns.constraint_name = ccs.constraint_name
   AND ccs.owner = cns.owner
   AND ccs.table_name = cns.table_name
   AND ccs_pk.constraint_name = cns.r_constraint_name
   AND ccs_pk.owner = cns.r_owner
GROUP BY cns.owner
       , cns.table_name
       , cns.constraint_name
       , cns.delete_rule
       , ccs_pk.owner
       , ccs_pk.table_name
       , ccs_pk.constraint_name;

Now the details for recreating the foreign keys to a primary key are known, these objects can be dropped and recreated.

-- drop the foreign key constraints that reference the primary key
-- (note that this can be statements for multiple objects)

ALTER TABLE fkey_table_owner.fkey_table_name 
DROP CONSTRAINT fkey_constraint_name;
 

-- then, drop the primary key

ALTER TABLE pkey_table_owner.pkey_table_name 
DROP CONSTRAINT pkey_constraint_name; 

-- recreate the primary key (including the BUSINESS_UNIT_VPD_ID)

ALTER TABLE pkey_table_owner.pkey_table_name 
ADD CONSTRAINT pkey_constraint_name 
PRIMARY KEY (columns_in_pkey, BUSINESS_UNIT_VPD_ID);

-- recreate the foreign key constraints that were dropped above (including the BUSINESS_UNIT_VPD_ID).
-- If the column ‘delete_rule’ has the value ‘CASCADE’ than add the ‘ON DELETE CASCADE’ to the statement.

ALTER TABLE fkey_table_owner.fkey_table_name 
ADD CONSTRAINT fkey_constraint_name
FOREIGN KEY (columns_in_fkey, BUSINESS_UNIT_VPD_ID)
REFERENCES pkey_table_owner.pkey_table_name (columns_in_pkey, BUSINESS_UNIT_VPD_ID)
[ON DELETE CASCADE];

The optional ‘ ON DELETE CASCADE’ clause should be added only if the queried attribute ‘delete rule’ contains the value ‘CASCADE’.

Sequences

While loading the data of the different business units into a single target, the highest value that is registered in the sequences and the corresponding values in the table columns are likely to go out of sync.

To overcome this issue, the highest value registered on a sequence in the source location (the ‘prefixed’ schema mentioned earlier) should be compared to the value of the same sequence in the target location. If the value of the latter is lower than the one in the source location, then it should be set to the higher value. This makes sure that, when a sequence is used to set the value of a column, it will get a value that was not given out before.

The below query checks the values of sequences that are in the schema to be loaded and the same sequence in the target schema and determines the value that must be set in the target.  

      SELECT dss.sequence_owner AS sequence_owner
           , dss.sequence_name AS sequence_name
           , dss_stg.last_number + 1 AS new_number
        FROM dba_sequences dss
           , dba_sequences dss_stg
       WHERE dss.sequence_owner = :b_target_owner
         AND dss_stg.sequence_owner = :b_source_owner
         AND dss_stg.sequence_owner = dss.sequence_owner
         AND dss.sequence_name = dss_stg.sequence_name
         AND dss.last_number < dss_stg.last_number;
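One possible way to then bring the target sequence up to that value (a sketch, not the project’s actual script; it assumes Oracle 18c or later, where the RESTART clause is available, the target here being 19c):

-- sequence name and value are placeholders taken from the query above
ALTER SEQUENCE target_owner.some_sequence RESTART START WITH 424242;

On older versions the same effect can be achieved by temporarily changing the increment or by dropping and recreating the sequence.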

Unique indexes

In the database there may exist unique indexes that are not bound to a constraint but are defined on a table that will be managed by the VPD mechanism. To make certain these indexes remain unique and will not cause duplicate data errors, they should also be recreated with the column BUSINESS_UNIT_VPD_ID added to their definition.

The following query searches for the indexes on a per schema basis:

SELECT dis.owner AS index_owner
     , dis.index_name AS index_name
     , dis.table_name AS table_name
     , dis.table_owner AS table_owner
  FROM dba_indexes dis
     , dba_tab_columns dtc
 WHERE dis.owner = :b_owner
   AND dis.table_name = nvl(:b_table_name, dis.table_name)
   AND dis.constraint_index = 'NO'
   AND dis.uniqueness = 'UNIQUE'
   AND dis.index_type = 'NORMAL'
   AND dtc.owner = dis.owner
   AND dtc.table_name = dis.table_name
   AND dtc.column_name = 'BUSINESS_UNIT_VPD_ID';

For the found index names the current definition must be queried so these can be recreated after being dropped

SELECT aic.index_owner
     , aic.index_name
     , aic.table_owner
     , aic.table_name
     , aic.column_name
     , aic.column_position
     , aic.descend
     , ai.uniqueness
  FROM all_indexes ai
     , all_ind_columns aic
 WHERE ai.owner = :b_index_owner
   AND ai.index_name = :b_index_name
   AND aic.table_owner = ai.table_owner
   AND aic.index_name = ai.index_name
   AND aic.index_owner = ai.owner;

For recreation of the index, also find out if all of its columns are nullable or not (the BUSINESS_UNIT_VPD_ID column excluded from this)

SELECT atc.column_name AS column_name
     , atc.nullable AS nullable
  FROM all_tab_columns atc
     , dba_ind_columns dic
 WHERE dic.index_owner = :b_index_owner
   AND dic.index_name = :b_index_name
   AND atc.table_name = dic.table_name
   AND atc.owner = dic.table_owner
   AND atc.column_name = dic.column_name;

Now the index can be dropped and then recreated

DROP INDEX owner.index_name;

If the index is not based on columns that are all nullable (the BUSINESS_UNIT_VPD_ID excluded) the index can be recreated as follows:

CREATE UNIQUE INDEX index_owner.index_name
    ON table_owner.table_name (column_name..., BUSINESS_UNIT_VPD_ID);

See the section ‘unique key constraints and indexes that are based on nullable columns’ for the situation where all columns are nullable, because these need to be recreated slightly differently.

Function based indexes

Function based indexes are indexes that store the outcome of a defined operation on one or more columns; they were first introduced in Oracle 8i. Especially those that are marked as unique must be taken care of when migrating the data of multiple business units into one database.

An example of the DDL for such a function based index is:

CREATE UNIQUE INDEX some_unique_column_idx
    ON some_table(UPPER(some_table_column_name));

The following query selects the function based indexes that are defined on the tables that have been enhanced with the BUSINESS_UNIT_VPD_ID column (if not familiar, see the ‘Primary and foreign key constraint definitions’ section for details on this column)

SELECT dis.table_owner AS table_owner
     , dis.table_name AS table_name
     , dis.owner AS index_owner
     , dis.index_name AS index_name
     , dis.uniqueness AS uniqueness
     , dis.index_type AS index_type
     , LISTAGG(dic.column_name||' '||dic.descend, ', ') WITHIN GROUP (order by dic.column_position) AS columns_in_index
  FROM dba_indexes dis
     , dba_ind_columns dic
     , dba_tab_columns dtc
 WHERE dis.owner = nvl(:b_owner, dis.owner)
   AND dis.index_type like 'FUNCTION%'
   AND dis.owner = nvl(:b_owner, dis.owner)
   AND dic.index_owner = dis.owner
   AND dic.index_name = dis.index_name
   AND dtc.owner = dis.table_owner
   AND dtc.table_name = dis.table_name
   AND dtc.column_name = 'BUSINESS_UNIT_VPD_ID'
GROUP BY dis.table_owner
    , dis.table_name
    , dis.owner
    , dis.index_name
    , dis.uniqueness
    , dis.index_type;

In the migration project that I was part of only index types with a value of ‘FUNCTION-BASED NORMAL’ were present, and therefore this solution only handles these. If other function based index types than ‘FUNCTION-BASED NORMAL’ are encountered, the right solution for these will have to be decided upon.

The value in the column named columns_in_index will contain something like the following and is not usable when recreating the function based index:

SOME_COLUMN_NAME ASC, SYS_NC00009$ ASC, OTHER_COLUMN_NAME ASC

To find out what the definition is that is bound to this SYS_ value the following query can be run, passing in the values for index_owner and index_name and the position of the SYS_ value in the index (which is 2 in the example).

SELECT die.column_expression AS column_expression
  FROM dba_ind_expressions die
 WHERE die.index_owner = :b_index_owner
   AND die.index_name = :b_index_name
   AND die.column_position = :b_column_position;

The resulting answer will show the definition of the function, eg. UPPER(unique_column_name).

From the above query results now drop and recreate the indexes:

DROP INDEX index_owner.index_name;

Just as with the creation of unique indexes in the former section, find out if the index is created for columns that are nullable (the BUSINESS_UNIT_VPD_ID excluded):

SELECT atc.column_name AS column_name
     , atc.nullable AS nullable
  FROM all_tab_columns atc
     , dba_ind_columns dic
 WHERE dic.index_owner = :b_index_owner
   AND dic.index_name = :b_index_name
   AND atc.table_name = dic.table_name
   AND atc.owner = dic.table_owner
   AND atc.column_name = dic.column_name;

If not all columns are nullable, then the index can be recreated as follows:

CREATE [UNIQUE] INDEX index_owner.index_name
    ON table_owner.table_name (columns_in_index, BUSINESS_UNIT_VPD_ID);

See the section ‘unique key constraints and indexes that are based on nullable columns’ for the situation where all columns are nullable, because these need to be recreated slightly differently.

Include the UNIQUE keyword only if the column named uniqueness contains the value ‘UNIQUE’; omit it otherwise.

Unique key constraints

Very likely, loading data from multiple business units into one target database (the one with the VPD mechanism enabled) will cause duplication errors when the defined unique key constraints, and also the foreign key constraints referencing these, are not acted upon appropriately.

The unique key constraints in this case must be enhanced with the BUSINESS_UNIT_VPD_ID attribute, that was added to the table definitions for VPD enablement. This is also the case for the foreign key constraints that reference these unique key constraints.

The next query finds the unique key constraint definitions for the tables that have been enhanced with the BUSINESS_UNIT_VPD_ID attribute. The constraints that do not have this attribute in the ‘columns_in_uk’ value must be dropped and recreated with the BUSINESS_UNIT_VPD_ID added to their definition.

SELECT cns.table_name AS table_name
     , cns.constraint_name AS constraint_name
     , LISTAGG( ccs.column_name, ', ') WITHIN GROUP (order by ccs.position) AS columns_in_uk
  FROM dba_constraints cns
     , dba_cons_columns ccs
     , dba_tab_columns dtc
 WHERE cns.constraint_name = ccs.constraint_name
   AND cns.owner= ccs.owner
   AND cns.table_name = ccs.table_name
   AND cns.owner = dtc.owner
   AND cns.table_name = dtc.table_name
   AND dtc.column_name = 'BUSINESS_UNIT_VPD_ID'
   AND cns.owner = :b_owner
   AND cns.constraint_type = 'U'
GROUP BY cns.table_name
    , cns.constraint_name
ORDER BY cns.table_name
    , cns.constraint_name;

To find the foreign key constraints that references a unique constraint from the above result, the following query can be used:

SELECT cns.owner AS owner
     , cns.constraint_name AS constraint_name
     , cns.table_name AS table_name
     , cns.r_owner AS r_owner
     , cns.r_constraint_name AS r_constraint_name
     , cns.status AS status
     , cns.delete_rule AS delete_rule
     , ccs_r.table_name AS r_table_name
     , LISTAGG( distinct ccs.column_name, ', ') WITHIN GROUP (order by ccs.position) AS columns_in_fkey
     , LISTAGG( distinct ccs_r.column_name, ', ') WITHIN GROUP (order by ccs_r.position) AS columns_in_ukey
  FROM dba_constraints cns
     , dba_cons_columns ccs
     , dba_cons_columns ccs_r
 WHERE cns.owner = :b_owner
   AND cns.r_constraint_name = :b_r_constraint_name -- the unique constraint name
   AND cns.constraint_type = 'R'
   AND cns.constraint_name = ccs.constraint_name
   AND cns.owner= ccs.owner
   AND cns.table_name = ccs.table_name
   AND cns.r_constraint_name = ccs_r.constraint_name
   AND cns.r_owner= ccs_r.owner
GROUP BY cns.owner
    , cns.constraint_name
    , cns.table_name
    , cns.r_owner
    , cns.r_constraint_name
    , cns.status
    , cns.delete_rule
    , ccs_r.table_name;

If references from foreign keys are found for a unique key constraint, then to be able to recreate these constraints with the BUSINESS_UNIT_VPD_ID as an additional column, the foreign key constraints must be dropped first and the unique key constraint last.

ALTER TABLE owner.table_name 
DROP CONSTRAINT constraint_name;

Now first recreate the unique constraint

ALTER TABLE owner.table_name 
ADD CONSTRAINT constraint_name
UNIQUE( columns_in_uk, BUSINESS_UNIT_VPD_ID);

If the unique constraint is based upon columns that are nullable, then this definition can cause issues, as will be explained in the following section ‘Unique key constraints and indexes that are based on nullable columns’. In those cases the unique constraint can be replaced with a function based index, as explained in that section.

Then recreate the foreign key constraints

ALTER TABLE owner.table_name 
ADD CONSTRAINT constraint_name
FOREIGN KEY (columns_in_fkey, BUSINESS_UNIT_VPD_ID)
REFERENCES r_owner.r_table_name (columns_in_ukey, BUSINESS_UNIT_VPD_ID)
[ON DELETE CASCADE];

The optional ‘ ON DELETE CASCADE’ option should be added only if the queried attribute ‘delete rule’ contains the value ‘CASCADE’.

Unique key constraints and indexes that are based on nullable columns.

A database may contain unique key constraints and indexes that are based solely on nullable columns. These definitions are not sound, but they may exist and will cause problems in the VPD enabled database when not dealt with properly.

So what is the matter with these definitions?

In the non VPD database setup there will be no constraint violation when 2 rows are added that have the values for these nullable columns set to null.

But in the VPD enabled database these definitions are enhanced with the business unit identifier (BUSINESS_UNIT_VPD_ID in the examples), which gets its value assigned from the context that was set at database logon (via the service dedicated to a specific business unit, as explained earlier) and will therefore not be null.

As a result, when 2 rows of data are inserted with the nullable columns holding the null value and the BUSINESS_UNIT_VPD_ID holding a non null value, a constraint error saying that duplicate data is not allowed will be encountered.

To deal with this problem, the constraints and indexes involved can be rewritten to a function based index. In this index the column values are tested for being null, and the outcome determines whether the BUSINESS_UNIT_VPD_ID value is included in the index entry or not.

Such a function based index definition looks like the following:

CREATE UNIQUE INDEX index_owner.index_name 
    ON table_owner.table_name (
       "NULLABLE_COLUMN_1", 
       "NULLABLE_COLUMN_2", 
       CASE WHEN "NULLABLE_COLUMN_1" IS NULL AND "NULLABLE_COLUMN_2" IS NULL 
           THEN NULL ELSE "BUSINESS_UNIT_VPD_ID" END 
    );
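With such an index in place, rows whose key columns are all null no longer collide, because the index entry itself becomes entirely null. A small illustration (table and column names are placeholders):

-- both inserts succeed, even though both rows get the same BUSINESS_UNIT_VPD_ID
-- from the session context: the CASE expression yields NULL, so nothing is indexed
INSERT INTO table_owner.table_name (nullable_column_1, nullable_column_2) VALUES (NULL, NULL);
INSERT INTO table_owner.table_name (nullable_column_1, nullable_column_2) VALUES (NULL, NULL);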

Case vs Decode

Originally I used a decode statement to determine the value for the column BUSINESS_UNIT_VPD_ID in the index.

The syntax that used the decode was slightly more complex, as it depended on the datatypes of the columns that were in the test to determine if all of these contained the null value.

 
CREATE UNIQUE INDEX index_owner.index_name
    ON table_owner.table_name (optional_numeric_column, optional_varchar_column,
       DECODE(NVL(TO_CHAR(optional_numeric_column), '') || NVL(optional_varchar_column, ''),
              '', NULL, BUSINESS_UNIT_VPD_ID));

In the DECODE the column values are tested for null and, if null, translated to ''. If the concatenation of these results in '' as well, all columns are null, which results in a null value in the index; otherwise the BUSINESS_UNIT_VPD_ID value is used.

Then a colleague of mine suggested using a CASE expression instead of the DECODE.

After examining this possibility it became clear that the equivalent syntax is easier to write and read. Also, because CASE is part of standard SQL, it was chosen in favor of the DECODE variant.

DECODE can only perform an equality check; CASE can work with other logical operators as well, e.g. <, >, BETWEEN and LIKE.

CASE was introduced in Oracle 8.1.6 and is standard SQL; prior to that version only DECODE was available, which is an Oracle-specific function that can only be used in SQL.

Columns that have the Long type

Oracle has recommended converting existing LONG columns to LOB (CLOB, NCLOB) columns for a long time. LONG columns are supported only for backwards compatibility. As a result you may still encounter columns that have not been migrated yet, and now is a good time to take care of this.

LONG columns can contain up to 2 gigabytes of information. LOB columns can store far more: up to (4 GB - 1) times the database block size. A LOB column holds a locator that points, via an index in the LOBINDEX segment, to the data stored in the LOBSEGMENT segment. A LOB is made up of one or more pieces of data.

The below query selects the tables that still have a column with the long datatype and generates an alter table statement for each. The statements can then be run to alter the column definitions in the target database (the one that has VPD enabled).

Columns that have the long data type are modified to clob (character large object) and columns with the long raw type data type are modified to blob (binary large object).

SELECT 'alter table ' ||
   dt.owner || '.' ||
   dt.table_name ||
   ' modify ( ' || dtc.column_name || ' ' ||
              decode(dtc.data_type,'LONG','CLOB','LONG RAW','BLOB') ||
            ');'
 FROM dba_tab_columns dtc JOIN
      dba_tables dt ON (dt.owner = dtc.owner and dt.table_name = dtc.table_name) JOIN
      vpd_admin.vpd_schemas vs on dt.owner = vs.vpd_schema
WHERE dtc.data_type like 'LONG%';

After running the generated statements, the target database no longer has columns with the LONG (RAW) datatype.


Loading the data

As said, there is a prerequisite of connecting via the business unit’s dedicated service before running the statements that load its data into the VPD enabled target tables. This is essential because the logon sets the correct value of the business unit identifier (BUSINESS_UNIT_VPD_ID) in the session context.

Virtual columns

As mentioned earlier, custom made scripts were used to load the data from the tables in the ‘prefixed’ schemas into the VPD enabled target tables.

While doing so, we encountered tables that hold virtual columns. Virtual columns contain/show values that are derived from other columns in the table and do not physically exist in the table.

As a consequence these columns must not be part of the INSERT INTO statements generated for loading the data into the VPD enabled target tables. This makes generating the INSERT INTO statements slightly more complex, because they have to explicitly state the column names that are to be filled and queried, leaving the virtual column names out.

Thus instead of generating a statement like

INSERT INTO table_owner.table_name
SELECT * FROM stg_table_owner.table_name;

the alternative statement like

INSERT INTO table_owner.table_name (column1, column2, …)
SELECT column1, column2, … FROM stg_table_owner.table_name ;

must be generated.

To determine the virtual columns of a table, this statement can be used:

SELECT dtc.column_name AS column_name
  FROM dba_tab_cols dtc
 WHERE dtc.owner = :b_owner
   AND dtc.table_name = :b_table_name
   AND dtc.virtual_column = 'YES';
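Building on that, the explicit column list for the generated INSERT INTO statements could be assembled with a query along these lines (a sketch, not the project’s actual generator; it also leaves out the invisible BUSINESS_UNIT_VPD_ID column so that the default from the session context is applied):

SELECT LISTAGG(dtc.column_name, ', ') WITHIN GROUP (ORDER BY dtc.column_id) AS column_list
  FROM dba_tab_cols dtc
 WHERE dtc.owner = :b_owner
   AND dtc.table_name = :b_table_name
   AND dtc.virtual_column = 'NO'
   AND dtc.hidden_column = 'NO'
   AND dtc.column_name <> 'BUSINESS_UNIT_VPD_ID';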

Result and wrap up

After handling the issues encountered while preparing for the data migration, loading the data into one target Oracle 19c Virtual Private Database went quite smoothly. This was mainly because the source data structures were identical, so the target data structure could remain the same. If this had not been the case, a merged data structure would have been needed as a target, and all the applications that make use of it would likely have had to be adjusted as well, greatly increasing the effort needed for a successful migration.

Furthermore, I want to point out that the issues mentioned in this article are just a summary; by no means are they meant as a complete list. So when you are planning to do such a migration yourself, you will still have to do your own analysis and deal with the issues that occur. Of course, I hope you will be able to apply some of the solutions if you experience similar issues.

The post Preparation for migrating data to Oracle Virtual Private Database appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”, I gave you an introduction.

This time I will elaborate on the database structure.

Project folder layout

The following top level directories may exist for every database application project:

  • apex: APEX directory containing the exported APEX application and other source files like Javascript and so on.
  • conf: configuration scripts for tools like Maven or Flyway.
  • db: database scripts.
  • ddl: DDL scripts generated by SQL Developer Data Modeler.
  • doc: documentation.

Directory apex

When you invoke on Windows:

tree /A apex

you will get something like this (items between ‘${‘ and ‘}’ are variables):

\---f${application id}
    \---application
        +---pages
        +---shared_components
        |   +---files
        |   +---globalization
        |   +---logic
        |   +---navigation
        |   |   +---breadcrumbs
        |   |   +---lists
        |   |   \---tabs
        |   +---plugins
        |   |   +---dynamic_action
        |   |   \---item_type
        |   +---security
        |   |   +---app_access_control
        |   |   +---authentications
        |   |   \---authorizations
        |   \---user_interface
        |       +---lovs
        |       +---shortcuts
        |       \---templates
        |           +---breadcrumb
        |           +---button
        |           +---calendar
        |           +---label
        |           +---list
        |           +---page
        |           +---region
        |           \---report
        \---user_interfaces

Directory db

tree /A db

returns:

+---${support schema}
|   \---src
|       +---dml
|       +---full
|       \---incr
+---${API schema}
|   \---src
|       +---admin
|       +---dml
|       +---full
|       +---incr
|       \---util
+---${DATA schema}
|   \---src
|       +---admin
|       +---dml
|       +---full
|       \---incr
+---${EXT schema}
|   +---src
|   |   +---admin
|   |   +---dml
|   |   +---full
|   |   \---incr
|   \---test
\---${UI schema}
    \---src
        +---admin
        +---dml
        +---full
        \---incr

Some explanation:

  • the support schema directory contains scripts that let the application work with supporting software. Think of another schema that contains generic error packages or packages to manipulate APEX text messages; in this directory you add the necessary grant scripts. The support software itself is maintained by another project.
  • the admin directories contain scripts to setup a schema by a DBA.
  • dml directories contain scripts to change reference data, for instance APEX text messages or a list of countries.
  • full directories contain (repeatable) Flyway database migration scripts that are run every time they change (for a database). They are meant for database objects that can be replaced (CREATE OR REPLACE ...).
  • incr directories contain (incremental) Flyway database migration scripts that will run only once (for a database), so for objects that can not be replaced like tables and constraints or for dropping objects.

Later on in the Flyway post, I will explain in more detail the naming conventions for the Flyway migration scripts.
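As a rough preview, standard Flyway conventions (assumed here; the project-specific naming follows in that post) distinguish the two kinds of migrations by file name prefix:

incr/V1.0.0__create_tables.sql           versioned migration: runs exactly once per database
full/R__create_or_replace_packages.sql   repeatable migration: reruns whenever its content changes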

DATA schema

This is the schema that contains the data: the tables and all objects needed to maintain the data logic. You may decide to put data logic packages in the API layer but that is up to you.

It is up to you how to create the database migration scripts for this layer, but I suggest that you do NOT maintain them in SQL Developer Data Modeler, since that tool is best used to initially set up the table structure and that is all. You may use it later on to generate incremental scripts, but it is much easier to just use your favorite editor to modify these scripts.

Later on in the SQL Developer Data Modeler post, I will explain in more detail what scripts to create and where to save them in your project.

API schema

This is the schema that contains the business logic. It may contain data logic packages if you do not want to have packages in the data layer.

Usually these scripts are not generated by SQL Developer Data Modeler.

UI schema

All User Interface logic. This means that this schema will be the parsing schema for APEX. Please note that you can have more than one parsing schema per APEX workspace so there is no problem having several applications with different parsing schemas in a workspace.

Usually these scripts are not generated by SQL Developer Data Modeler.

EXT schema

This is an EXTernal layer I have added to the structure. It is meant for external logic: interfaces or data conversions. Please note that setting up a new system almost always requires you to import data from another source. This layer can take care of that. If your application has to support several customers you may even have a layer for each customer. The level of this layer is the same as the API layer. It can interact with the API layer in a bidirectional way. After all, the external layer may use business logic and business logic may use an interface from this layer. The UI layer may use objects from this layer too.

Usually these scripts are not generated by SQL Developer Data Modeler.

Conclusion

In this post you have seen the folder layout for your project. Later on in the posts for SQL Developer Data Modeler, Flyway and Maven, I will add more detail.

Stay tuned!

All articles in this series

  • Introduction: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
  • Database structure: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
  • Oracle Database and Oracle APEX: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
  • Oracle SQL Developer Data Modeler: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
  • Git, Subversion, Maven and Flyway: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
  • Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”

The post How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2) appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)

DevOps
Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”, I did show you the database structure.

This time I will elaborate on the base tools, the Oracle Database and Oracle APEX.

Oracle Database

How to use it?

I can tell a lot about the database but I will focus on the essentials needed to work with a front-end like APEX.

Quoting from Build your APEX application better – do less in APEX, Jeff Kemp on Oracle, 13 February 2014:

I recently saw this approach used in a complex APEX application built for my current client, and I liked what I saw – so I used a similar one in another project of mine, with good results.

  1. Pages load and process faster
  2. Less PL/SQL compilation at runtime
  3. Code is more maintainable and reusable
  4. Database object dependency analysis is much more reliable
  5. APEX application export files are smaller – faster to deploy
  6. APEX pages can be copied and adapted (e.g. for different interfaces) easier

I couldn’t agree more. This was written quite some time ago but it is still valid, and for any front-end actually, not just APEX. I will repeat this again and again: you have a really powerful database for which you (or your company/client) have paid a lot. So use it well and put all the data and business logic where it belongs: in the database. It will make your application faster, more secure, and easier to develop and debug. It is also more maintainable, since there is a lot more Oracle Database expertise available than expertise in some obscure front-end framework. This has been a best practice for a long time, just do it.

And for those people who like database independence: I just like to get things done well, so I use PL/SQL since that is simply the best language to work with the Oracle Database. I do not want to use JDBC (in combination with Java) or ODBC (in combination with .NET/Python) when I can use PL/SQL. Every other well designed database has some kind of language like PL/SQL. So if you really want to be independent, why not write a language layer with the same functionality for each database?

One last thing. I have seen a lot of projects with Java developers using JDBC and Oracle, and what has surprised me very often is how little they know about the database they work with. The Java code just issues statements against the tables; there is no invocation of PL/SQL (package) procedures or functions. All functionality sits in the middle tier, not in the database. The funny thing is that Oracle even has an object oriented layer: Oracle Object Types. It is true that Object Types are more limited than Java classes, but I have created some nice applications based on the Oracle OO concept. You can also use an Object Type as a kind of glue between Oracle and another language like Java. And as a Java programmer you can also invoke REST APIs powered by PL/SQL. What a pity that a large part of the Oracle functionality is not used by those Java programmers.

What version?

My advice is to use the latest major version or the one before. For an Oracle Database nowadays this means version 21c or the long-term support (LTS) release 19c (equivalent to 12.2.0.3). This is simple advice for any software and it assures you that you keep up with enhancements, (security) patches and so on. Do not wait too long. Again I will use the analogy with a house: you had better paint and maintain it regularly than wait till the wood has rotted.

What platform?

I have talked about it briefly in the first post, but for a development environment I would just download the prebuilt Database App Development VM from Oracle. It comes with the database and APEX integrated on an Oracle (Unbreakable) Linux operating system. It is simple to use and back up, you can be the DBA without bothering anyone, and it is free of charge for a development environment.

Database App Development VM contents

Do not forget to make snapshots (backups) regularly. It has saved my life quite a few times.

Virtual machine settings

You may have more than one virtual machine (VM) and thus more than one database and APEX instance, and you would like to have them all running and accessible at the same time. You will need port forwarding to accomplish this.

Please note that the virtual machine network configuration for the database is the same for each VM: IP address 127.0.0.1, port 1521 and instance name ORCL. APEX can be accessed through port 8080 on the virtual machine.

I have a Windows 10 laptop with two virtual machines, DEV (APEX 18.2) and VM19 (APEX 19.2).

These are the port forwarding rules for VM DEV:

And these are the port forwarding rules for VM VM19:

So the DEV database can be accessed through port 1526 on my Windows laptop and the DEV APEX instance through the standard port 8080. The VM19 database can be accessed through port 1527 on my Windows laptop and the VM19 APEX instance through port 8082.

And this is the SQL*Net TNSNAMES configuration:
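The screenshot is not reproduced here; as an illustration, a tnsnames.ora consistent with the ports mentioned above could look like this (the alias names DEV and VM19 are assumptions):

DEV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 127.0.0.1)(PORT = 1526))
    (CONNECT_DATA = (SERVICE_NAME = ORCL))
  )

VM19 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 127.0.0.1)(PORT = 1527))
    (CONNECT_DATA = (SERVICE_NAME = ORCL))
  )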

I always use the environment variable TNS_ADMIN on any platform to point to the directory of the SQL*Net tnsnames.ora file. This allows me to have one point of truth for SQL*Net even if I have several Oracle product installations.

Oracle APEX

As said before, this series of articles is not advice on how to use the tools; it is about how to build a database application with a plan, an architecture. This book, Oracle APEX Best Practices, may help you with that.

Anyhow if you have followed my advice to use a virtual machine for development, you have an APEX instance now.

For me the way to go from development to production is to export the development application and import it in every next stage till it reaches production. I do not consider it a very good idea to manually apply the changes in later stages. It is certainly not the DevOps way.

Collaborating

If you need to collaborate while developing an application you need of course a shared database and APEX instance. It will be a little bit more difficult since you need to be more careful but thanks to the ability to lock APEX pages and the Build Options you can manage.

Parallel development

The problem with APEX is that you can not really install parts of it: it is all or nothing. So even if you split the APEX application export file using the SQLcl client, you can not just use some files. You have to use them all.

This influences also parallel development (branching if you prefer).

My current client has a large number of APEX applications, one of which is a doozy. It is a mission-critical and complex application in APEX 4.0.2 used throughout the business, with an impressively long list of features, with an equally impressively long list of enhancement requests in the queue.

They always have a number of projects on the go with it, and they wanted us to develop two major revisions to it in parallel. In other words, we’d have v1.0 (so to speak) in Production, which still needed support and urgent defect fixing, v1.1 in Dev1 for project A, and v1.2 in Dev2 for project B. Oh, and we don’t know if Project A will go live before Project B, or vice versa. We have source control, so we should be able to branch the application and have separate teams working on each branch, right?

We said, “no way”. Trying to merge changes from a branch of an APEX app into an existing APEX app is not going to work, practically speaking. The merged script would most likely fail to run at all, or if it somehow magically runs, it’d probably break something.

Parallel Development in APEX, Jeff Kemp on Oracle, 23 January 2014

Things have not really changed since 2014.

Keep Apex versions aligned

Keep in mind that you cannot import an APEX application into another APEX instance if the exported version is higher than the version of the APEX instance you import into. So an APEX 19.2 export will not import into APEX 18.2. Align all your APEX versions from development till production.

Conclusion

In this post you have seen how to setup an Oracle APEX development environment and some best practices as well.

Stay tuned!

All articles in this series

  • Introduction: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
  • Database structure: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
  • Oracle Database and Oracle APEX: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
  • Oracle SQL Developer Data Modeler: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
  • Git, Subversion, Maven and Flyway: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
  • Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps: “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”

The post How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3) appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.


How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)

Kharnagy, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Last time in “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”, I told you about the Oracle Database and Oracle APEX.

This time I will discuss Oracle SQL Developer Data Modeler.

Oracle SQL Developer Data Modeler

A book I can recommend is Oracle SQL Developer Data Modeler for Database Design Mastery by Heli Helskyaho.

Just modeling

I use Data Modeler mainly for modeling, documentation and generating DDL scripts for the initial setup and incremental migration scripts later on. For other activities I use tools that suit me better, the Unix approach.

The tool also allows you to define views, but I do not use that feature since it gave me a lot of problems. A simple SQL script to create the view is just enough.

Logical Model

This is the Entity Relationship Model area where you can construct Entity Relationship Diagrams (ERD) like this:

Logical Model

You should really take your time to design your model and to verify it using the Design Rules described later on. This is the foundation of your application.

And do not forget to use domains whenever appropriate. You can even have one corporate domains XML file if you prefer.

Relational Models

Each Logical Model may be transformed into a Relational Model: one for Oracle Database 12c, one for Oracle Database 12cR2 and so on. This allows you to use the features of those versions.

My preference is to have just one relational model per Logical Model, to keep it simple.

Again you should really take your time to design your relational model and to verify it using the Design Rules described later on. This is the foundation of your application.

Business rules

I am old enough to remember the Business Rules Classification:

Business Rules Classification

Quite a few business rules can be defined easily using Data Modeler; here are some examples:

  • Department code must be numeric: column datatype.
  • Employee job must be ‘CLERK’, ‘SALES REP’ or ‘MANAGER’: use a domain.
  • Employee salary must be a multiple of 1000: domain (which lets you define a constraint).
  • Employee exit date must be later than hire date: table level constraints.

Other constraints may not fit into Data Modeler and may need to be implemented in another way. For more inspiration I will refer to implementing business rules by Rob van Wijk.

I have had difficulties with constraints implemented by materialized views with refresh fast on commit in an APEX environment. Maybe I did it wrong, maybe the database version (Oracle Database 12) was a little buggy, or maybe it only works nicely in theory. I resorted to triggers and PL/SQL.

Incremental migration scripts

You can define a connection via:

Import Data Dictionary menu option

Then you can use that connection to execute the Synchronize Data Dictionary functionality. This will create an incremental migration script you can use with Flyway. Sometimes you may need to tweak the generated script.

Design Rules and Transformations

One of the features I can really recommend is the Design Rules and Transformations menu.

Design Rules

This is the lint-like tool of Data Modeler: an analysis tool that flags errors, bugs, stylistic issues and suspicious constructs. It is applicable to both the Logical Model and the Relational Models.

Custom Transformation Scripts

This allows you to use predefined scripts to do transformations and to define your own.

Here is an example that sets table names to their plural form. You usually define the entity name in singular and the table name in plural. This custom utility (Table Names Plural - custom) allows you to do it automatically:

var tables = model.getTableSet().toArray();
for (var t = 0; t < tables.length; t++) {
    var table = tables[t];
    var tableName = table.getName();
    if (tableName.endsWith("Y")) {
        // trailing Y becomes IES (e.g. CATEGORY -> CATEGORIES)
        table.setName(tableName.slice(0, -1) + "IES");
        table.setDirty(true);
    } else if (!tableName.endsWith("S")) {
        // otherwise append an S, unless the name already ends in S
        table.setName(tableName + "S");
        table.setDirty(true);
    }
}

Configuration

When you use SQL Developer Data Modeler, it is better to use one modeling project for all your applications, so you can share your configuration more easily between projects and developers.

From my GitHub datamodeler project, here is the README:

A project to share Oracle SQL Datamodeler settings and scripts. Oracle SQL Developer Data Modeler has several global configuration items like:

  • preferences
  • design rules and transformations
  • default domains

Besides that there are also design preferences and glossaries, but unlike the global configuration you can easily store those in a version control system.

The official way to share the global configuration between computers is to use the various import and export utilities of Data Modeler. However, this is quite time-consuming and thus error prone.

An easier approach is to just back up these settings to a directory you specify as a command line option (ideally under version control). Then you can restore them when needed. This project tries to accomplish just that: KISS.

It is just a simpler and more friendly approach than using manual export and import actions between developers.

If you collaborate with others, you had better keep all folder and file names the same, since the configuration contains those names.

Conclusion

Here I shared some ideas about using SQL Developer Data Modeler, a tool that can construct the foundation of your application very well.

Stay tuned!

All articles in this series

Subject | Link
Introduction | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (1)”
Database structure | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (2)”
Oracle Database and Oracle APEX | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (3)”
Oracle SQL Developer Data Modeler | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4)”
Git, Subversion, Maven and Flyway | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (5)”
Oracle SQL Developer, utPLSQL, SonarQube, Perl, Ant and DevOps | “How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (6)”

The post How to build an Oracle Database application (that supports APEX, Java, React or any other front-end) (4) appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

Apache NiFi: JSON to SOAP


Apache NiFi is a powerful open source integration product. A challenge you might encounter when integrating systems is that one system can produce JSON messages and the other has a SOAP API available. In this blog post I’ll show how you can use NiFi to convert JSON input to a SOAP service call. This involves abstracting an AVRO schema for the JSON, converting it to XML and transforming the XML to a SOAP message.

In this example I’m using several publicly available websites. You should of course be careful with this. Do not copy/paste sensitive XML or JSON on these sites!

Input

I’ve used the following API to develop this example: https://api.covid19api.com/summary. It will give me a JSON like:

Abstract an AVRO schema from a JSON sample

I’ve used the following site to generate an AVRO schema from the JSON sample: http://www.dataedu.ca/avro. This site has been created by Iraj Hedayati.

This created an AVRO schema like:

If you’re going to fetch data from a Kafka topic which already uses an AVRO schema, you don’t have to create your own AVRO schema since it is already provided. The schema is required by the NiFi ConvertRecord processor (XMLRecordSetWriter controller service) to be able to generate XML from JSON.

Generate an XSD schema from XML

The ConvertRecord processor in Apache NiFi generates an XML file like:

I used the following site (here) to generate an XSD for this XML. Why would I need to create an XSD? Having an XSD makes it easier to create an XSLT file. The resulting XSD looked like:

Create a transformation from XML to SOAP

Next I manually created a target XSD based on a general SOAP definition. I took a sample from here. I generated an XSD for the SOAP message (similar to what I did before) and manually merged it with the previously generated schema definition. This target XSD looked like:

In order to create a transformation from the source to the target, I’ve used Altova MapForce. There are of course other solutions available, such as Oxygen XML Developer. When creating XSLTs, I recommend using a tool that has a GUI and is also a bit intelligent, because it will save you work.

The resulting XSLT looked like:

NiFi

When the AVRO schema and XSLT file are ready, the NiFi work is relatively straightforward.

Some things to notice in the configuration:

  • For ConvertRecord you have to specify a Record Reader and a Record Writer.
  • In the JsonTreeReader, a controller service, I have specified the previously generated AVRO schema.
  • The XMLRecordSetWriter does not require any specific configuration.
  • For TransformXML I’ve used a SimpleKeyValueLookupService to store my XSLT file.

Validating the result

In order to validate my request, in the absence of an actual SOAP service, I used the following in my docker-compose.yml file (see the base file here):

  echo:
    image: mendhak/http-https-echo:23
    environment:
      - HTTP_PORT=8888
      - HTTPS_PORT=9999
    ports:
      - "8888:8888"
      - "9999:9999"

This is an echo service that lets me view the request that was made in the Docker logs. This way I can see the request and reply in NiFi and the request in the logging. The endpoint to call from NiFi in this case is http://echo:8888.

Of course this is not truly SOAP yet. You need to add a SOAPAction HTTP header (amongst other things). This is easily done by adding a dynamic attribute on the InvokeHTTP processor.

This header will then be set when calling the SOAP service. The below screenshot shows the reply from the echo service. This indicates the header has been set.

The post Apache NiFi: JSON to SOAP appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

Quick to run, free and ephemeral RStudio instance on Gitpod


Triggered by an assignment my son had to do for one of his university courses – and given the fairly lightweight laptop we have him work with – I decided to make a cloud based RStudio environment available to him. I have worked with Gitpod over the past six months, creating environments for Oracle Database, Backstage, Open Telemetry, Oracle Cloud client tools, Go development, Apache Kafka, Dapr, Redis & NodeJS workshop, MongoDB and several other technologies. What these environments have in common is that they are very easy to run – and very easy to reset/rerun as well as to share. They do not require any local resources – apart from the browser – and are free (at least the first 50 hours of use each month). What’s not to like?

image

To run this workspace – and start working with RStudio – you need to open this link in your browser. This will run a Gitpod workspace that takes its definition from my GitHub Repository, where the file .gitpod.yml defines the steps that Gitpod goes through when preparing the workspace.

Once the Gitpod workspace is launched, you will need to wait for a few minutes while the workspace is prepared. Packages are updated, new files are downloaded and the RStudio server is installed and started. You can check the first terminal window to see what is going on – and find out when the actions are complete.

image

When the actions are done, you will see a message in the terminal that invites you to complete the creation of a new user – randomly called *hank* (feel free to create a different user account). You will connect to the RStudio client in the browser using this Linux user and password.

image

After creating the user, you can open the RStudio GUI in the browser. From the Ports tab open the URL listed for port 8787.

image

RStudio launches in a new browser tab. Log in with username *hank* and the password you just set. Once the login is successful, you will see the RStudio client in your browser like this, ready to start exploring:

image

Resources

The most important article I used: RStudio Server on Ubuntu through Windows Subsystem for Linux (WSL2) – https://www.drdataking.com/post/rstudio-server-on-ubuntu-through-windows-subsystem-for-linux-wsl2/

Other resources:

* How to install RStudio Server open source on Ubuntu 20.04 LTS – https://www.how2shout.com/linux/install-rstudio-server-open-source-on-ubuntu-20-04-lts/

* Download RStudio Server – https://posit.co/download/rstudio-server/

* Getting Started with Posit Workbench / RStudio Server – https://support.posit.co/hc/en-us/articles/234653607-Getting-Started-with-Posit-Workbench-RStudio-Server

* RStudio Server Documentation – Administration – https://docs.posit.co/ide/server-pro/server_management/server_management.html


The post Quick to run, free and ephemeral RStudio instance on Gitpod appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

Get Going with SQL Developer against Oracle Database 23c Free


In an earlier article, I showed how to quickly get going with an Oracle Database 23c Free instance in a Gitpod workspace – a cloud based, ephemeral, quick start, zero install environment. As part of that workspace, you have access to both SQL*Plus and SQLcl as command line interfaces to work with this database. However, many of us like to work with a GUI – especially the free SQL Developer tool that over the years we have come to embrace. In this article I will show how to work with a locally installed SQL Developer environment – running on your laptop – against this cloud based Oracle Database 23c Free instance in a Gitpod workspace.

image

First – make sure you have got SQL Developer up and running on your local environment (starting from the SQL Developer downloads page if you do not yet have it). You also need to have VS Code on your local environment; this will be the conduit for the SSH connection over which SQL Developer will communicate with the database in the remote Workspace.

Second, start a Gitpod Workspace with the Oracle Database 23c Free instance – by following the steps in this article or simply by opening this link (that will launch the Gitpod workspace for you).

With the Gitpod workspace open and the database up and running and accessible locally in the workspace from SQLcl, you need to bring the remote workspace to your laptop. Or rather: you create an SSH connection through your local VS Code that will forward any local communication with port 1521 to the remote workspace. The steps to make this happen:

1. Open the Command Palette in the Gitpod Workspace (on my Windows machine I do this using CTRL SHIFT P ) and select “Gitpod: Open in VS Code”.

image

Alternatively, expand the File menu and click on Gitpod: Open in VS Code

image

2. Accept the browser popup that notifies you of opening Visual Studio Code from the browser:

image

3. Click on Open when Visual Studio Code prompts you to “Allow an extension to open this URI?”

image 

4. Click on Copy to save the temporary password for the SSH connection to the clipboard:

image

5. When this popup appears:

image

paste the password from the clipboard into the field and press Enter:

image

6. VS Code now opens and shows the same files and the same terminals as you saw before in the browser based VS Code environment:

image

Open the ports tab

image

Here you can see the “magic” to connect the local port 1521 to the port 1521 exposed in the remote Gitpod workspace. Any attempt to access localhost:1521 on your laptop is now intercepted by VS Code and forwarded over the SSH connection to the Gitpod workspace. I think that is brilliant!

7. Time now to open SQL Developer

image

8. When it has started, you can define a new database connection – click on the plus icon and select New Database Connection

image

Then provide the connection details:

  • username: dev
  • password: DEV_PW
  • Hostname: localhost
  • Port: 1521
  • Service Name: FREEPDB1

For convenience’s sake, you can check the box for Save Password.

image

Click on Test to verify these connection details:

image

It is not spectacular in a visual way but the Status: Success message is reassuring. Now click Connect.
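
By the way, once the port forwarding is in place, the same connection details also work from a local SQLcl session using an EZConnect string – a quick way to double check the tunnel (credentials as listed above):

connect dev/DEV_PW@//localhost:1521/FREEPDB1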

9. At this point, SQL Developer is connected to the DEV schema in the FREEPDB1 database running in your cloud based Gitpod workspace. And you can work against it just like you work with any Oracle Database 23c Free instance running anywhere.

image

Here you see a telltale sign for 23c: select sysdate – at long last without including “FROM DUAL”.
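
In other words, a query like this one now runs as-is against 23c; in earlier database releases you would have had to append FROM DUAL:

select sysdate;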

This is the situation now created:

image

Resources

In an earlier article: how to quickly get going with an Oracle Database 23c Free instance in a Gitpod workspace – cloud based, ephemeral, quick start, zero install environment.

The SQL Developer downloads page


The post Get Going with SQL Developer against Oracle Database 23c Free appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.

Oracle Database 23c as Graph Database–SQL Property Graph for network style querying


I bet you are used to relational data structures that you query using SQL. And so do I. And there is nothing in terms of data that a good SQL query cannot answer. OK, the query can become quite long – with inline expressions and multiple joins – but it can answer almost any question without fail. While that is true, there are different perspectives on data possible. Rather than the tables and foreign keys/join condition view that we tend to take from the relational (pure SQL) world, there is a view on data that focuses on the network structure of data: the data set is defined in terms of vertices and edges. Nodes in a network and the relationships between these nodes. Some data – or: sometimes data – is better represented and analyzed from that perspective.

A quick example: tables in our database frequently reference other tables through foreign keys. Using the Data Dictionary Views, we can use SQL queries to learn about these dependencies between tables. Some of those queries are not intuitive to read or write. Some questions cannot easily be answered – at least for someone not well versed in SQL and in the structure of the Data Dictionary.

This particular aspect of a relational database can easily be sketched as a property graph: one vertex type (table) and one edge type (foreign key).

image

Using a SQL Property Graph, the question whether we have any tables that reference themselves becomes as simple as:

MATCH (a IS database_table) -[fk IS foreign_key]-> (b IS database_table) WHERE a.table_name = b.table_name

Read this as: start with all vertices of type database_table. Follow all their edges of type foreign_key to the database_table vertex at the other end of the edge. Then look for all cases where the two vertices are in fact the same table. That is a case of a self-referencing foreign key.
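
For completeness: in Oracle Database 23c such a MATCH fragment lives inside the GRAPH_TABLE operator of a regular SQL query. A sketch of what the full statement for this first question could look like (the COLUMNS clause is my own choice here; the actual queries are in the SQL script linked under Resources):

SELECT *
FROM   GRAPH_TABLE ( table_references_graph
         MATCH (a IS database_table) -[fk IS foreign_key]-> (b IS database_table)
         WHERE a.table_name = b.table_name
         COLUMNS ( a.table_name       AS self_referencing_table
                 , fk.constraint_name AS foreign_key_constraint )
       );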

And to find pairs of tables that both reference the same (lookup or parent) table, we can simply write:

MATCH (a IS database_table) -[fk IS foreign_key]-> (b IS database_table) <-[fk2 IS foreign_key]- ( c IS database_table)
WHERE a.table_name != c.table_name

This starts in the same way: start with all vertices of type database_table. Follow all their edges of type foreign_key to the database_table vertex at the other end of the edge. Then find all incoming (hence the <- symbol) edges to this vertex. And eliminate results where the two tables with the same common table reference are in fact the same table.

To find tables that have multiple foreign keys to one specific table:

MATCH (a IS database_table) -[fk IS foreign_key]-> (b IS database_table) <-[fk2 IS foreign_key]- ( c IS database_table)
WHERE a.table_name = c.table_name and fk.constraint_name != fk2.constraint_name

Start with all vertices of type database_table. Follow all their edges of type foreign_key to the database_table vertex at the other end of the edge. Then find all incoming (hence the <- symbol) edges to this vertex. And only keep results where the two tables with the same common table reference are the same table but the foreign key constraints are different.

And one more: find nested hierarchies of child, parent and grandparent table.

MATCH (a IS database_table) -[fk IS foreign_key]-> (b IS database_table) -[fk2 IS foreign_key]-> ( c IS database_table)

As before, start with all vertices of type database_table. Follow all their edges of type foreign_key to the database_table vertex at the other end of the edge. And from there (the parent table) follow again all edges of type foreign_key to the database_table vertex at the other end of the edge (bringing us to the grandparent).

I have created a database schema with Formula One data – the schema used by Alex Nuijten and Patrick Barel in their book Modern Oracle Database Programming – and executed these queries.

The database schema is visualized here:

image

Can we quickly spot self-referencing foreign keys? Tables with multiple foreign keys to the same table? Nested hierarchies? Well, perhaps with a nice diagram like this one we could. It would be less easy if we do not have the diagram, if the diagram is not correct or if the number of tables is several hundred or even more than 1000.

In those cases, the property graph approach can help us.

Self referencing foreign keys:

image

There are none. That is a little bit disappointing.

What about common tables – tables referenced by multiple tables:

image

Any cases of tables having more than one foreign key to another table:

image

And what about nested hierarchies – trios of child, parent and grandparent table:

image

You may not have a huge interest in this information. Or you may feel well equipped to write your own SQL to learn this information. However, using the property graph approach can make certain investigations and data explorations much simpler and more intuitive. Not just for you but for your colleagues who may not be such wizards in SQL. If for no other reason, give the property graph some thought.

What I have not shown yet is the actual creation of the property graph. Before you can start querying against table_references_graph you need to create that object. The DDL statement to create the property graph looks as follows:

image

The graph definition includes all vertex tables (in this case only one) as well as all edge tables (again, here only one). Multiple labels (types of edges or vertices) can be associated with the same table. Every edge and vertex needs a KEY definition – a unique column or combination of columns. An edge needs to be defined with a source (where does the edge start from) and a destination (which vertex table is the target of the edge). For each edge and vertex, properties are defined – values based on columns or column expressions.
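
The statement in the screenshot is not reproduced here, but a minimal sketch of a CREATE PROPERTY GRAPH statement of this shape – built on the two materialized views introduced below – could look as follows. The key columns are my assumptions, derived from the properties used in the queries above; the real DDL is in the SQL script linked under Resources:

CREATE PROPERTY GRAPH table_references_graph
  VERTEX TABLES (
    tables_mv
      KEY (table_name)
      LABEL database_table
      PROPERTIES (table_name)
  )
  EDGE TABLES (
    foreign_keys_mv
      KEY (constraint_name)
      SOURCE      KEY (table_name)   REFERENCES tables_mv (table_name)
      DESTINATION KEY (r_table_name) REFERENCES tables_mv (table_name)
      LABEL foreign_key
      PROPERTIES (constraint_name)
  );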

Edge and Vertex tables in a Property Graph cannot be defined on a View (unfortunately): they must have a real table or materialized view as their underpinning. I hope that limitation will be lifted in the future. I consider it a serious hindrance for using the property graph!

In this example, I have created two materialized views on data dictionary views, as follows:

image

These materialized views represent the vertices (tables_mv) and the edges (foreign_keys_mv) respectively. It is a bit of an unfortunate workaround – but it will (have to) do.
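
As the screenshot is not reproduced here: a rough sketch of what such materialized views on the data dictionary could look like (the exact column list and join are my assumption; the real definitions are in the linked SQL script):

CREATE MATERIALIZED VIEW tables_mv AS
SELECT t.table_name
FROM   user_tables t;

CREATE MATERIALIZED VIEW foreign_keys_mv AS
SELECT c.constraint_name
,      c.table_name                  -- the referencing (child) table: source of the edge
,      r.table_name AS r_table_name  -- the referenced (parent) table: destination of the edge
FROM   user_constraints c
JOIN   user_constraints r ON r.constraint_name = c.r_constraint_name
WHERE  c.constraint_type = 'R';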

Resources

SQL scripts demonstrated in this article: https://github.com/lucasjellema/gitpod-oracle-database-23c-free/blob/main/explorations/sql-property-graph.sql.

GitHub Repo with sources for this article: https://github.com/lucasjellema/gitpod-oracle-database-23c-free.

Oracle Database 23c Free environment with the Formula One data schema can be started using a Gitpod workspace definition. Read about it in this article: https://technology.amis.nl/database/live-handson-environment-for-modern-oracle-database-programming/



The post Oracle Database 23c as Graph Database–SQL Property Graph for network style querying appeared first on AMIS, Data Driven Blog - Oracle & Microsoft Azure.
