Some weeks ago we have implemented delta detection from the source systems by using Oracle functions and procedures in Oracle Warehouse Builder (OWB). The situation is as follows: every run you only want the new or changed records to be processed in your mapping (most likely a staging area mapping), and you want OWB to figure out from which point on your next delta selection should be. For this example you should already have the following operators (objects) in your mapping: a source and target table and a filter operator in between. This mapping would simply load records from one table to another depending on the WHERE clause you enter in the filter. And more importantly: this solution requires the existence of an attribute in the source system which can tell you if and when records have been changed or newly entered (mutation datetime attribute).
To add the process control as described above we have to do the following:
- Create a process control table with the process name, the start date, the end date and optionally a sequence ID.
- Create a function which looks up the next start date (eg the datetime from which the delta should be selecterd in the source system).
- Create a procedure which updates the process control table if the mapping has been succesfull. The record in this table for the next run (the one which the above mentioned function uses) will have to be inserted here.
- Add two constant operators, an expression operator and a post-session operator to your mapping.
The first constant and the expression should be placed before your filter operator. In this constant you can define the name of your mapping. Using this name, you can connect to the expression as an input there for the lookup function.
This function (of course developed in OWB) has an input paramter MAPPING_NAME defined as a varchar:
--initialize variables here
CURSOR LoadDate IS (
SELECT MAX(START_DATE) as Start_Date
FROM PROCESS_CONTROL_TABLE
WHERE Mapping_Name=MAPPING_NAME
GROUP BY Mapping_Name);
v_start_date date;
-- main body
BEGIN
v_start_date := TO_DATE('18500101', 'YYYYMMDD');
< <recloop>>
FOR RecSet IN LoadDate
LOOP
v_start_date := RecSet.start_date;
EXIT recloop;
END LOOP;
RETURN v_start_date;
END;
As you can see, it selects the highest date for the next run. In your expression operator call this function using the mapping name supplied by your constant. The result will be a date/time attribute which you can drag to your filter expression where you can set that source mutation datetime attribute should be greater than your lookup date time (start_date).
The second part of this process is the post-session operator. Enter your mappingname in your second constant operator, obviously equal to the name in the first constant. Beside that enter a datetime attribute end_date with the expression SYSDATE-1. These two values can be dragged to the post-session operator which should do two things:
first, close off the current process control record, by setting the end date to the sysdate-1
second, insert a new record in which the start_date is sysdate-1
The -1 is a safe margin because during the run of the mapping new record in the source system could be inserted or altered, and you don’t want to miss out on those. This works well, as long as your mapping keeps its run time within a day. But then again; if not you have other problems than process control. Alternatively you can query the OWB repository and select the mapping starttime here.
The procedure that updates the process control table (you will need to create the two input parameters in OWB):
v_end_date DATE;
v_mapping_name VARCHAR2(30);
BEGIN
v_end_date := END_DATE;
v_mapping_name := MAPPING_NAME;
UPDATE PROCESS_CONTROL_TABLE SET END_DATE= v_end_date WHERE END_DATE IS NULL AND MAPPING_NAME = v_mapping_name;
INSERT INTO PROCESS CONTROL_TABLE VALUES (v_mapping_name, v_end_date, NULL);
END;
The NULL value in the second statement means: the end_date for the next run is empty. Of course; this will be updated the next time you run this mapping with the datetime from the delta selection.
Lastly two things to keep in mind: you may want to set the post-session operator to run only upon succes (configuration of this operator), because if the mapping fails you probably want to rerun it with the same values. And if you want to have a technical ID in your process control table I would suggest to set it up as described above, create a sequence and a trigger (unfortunately not yet doable in OWB) which insert the NEXTVAL in the ID attribute every time the new record is inserted.
Posted November 30th, 2005 by IT-eye Alumni | 5 Comments »
As you probably know by now, Firefox 1.5 has been released. Even the mainstream media is reporting on this new version. The only thing i want to add here is that it includes a fix for the bug that was causing a problem with the menu’s on the IT-eye website. Good to see that fixed. Other improvements include: improved popup blocker, automatic download of updates and support for svg.
Posted November 30th, 2005 by Andrej Koelewijn | No Comments »
Today, Shale caught my attentention on the Web. Shale is a new framework from Apache to build web applications in a MVC fashion. Shale is a Struts sub-project and it is a framework that is build on top of the new standard in the web-tier: JSF.
Craig McClanahan initiated Shale with the main idea to let it become Struts 2.0. However, when I read several blogs, eg. David Geary’s blog and read more about the Shale framework, I am convinced that it is not really Struts 2.0., but a total new framework that is build on top of JSF. In my opinion, JSF just puts Struts in a new better ‘jacket’ (For this I agree with David Geary).
I wonder how Shale’s first release will be introduced. When you look now at the Shale homepage, you can extract from the URL that it is still a Struts sub-project.. However, it is good to see that Apache is now, including with MyFaces, heading into the direction of JSF. Because, I consider JSF as the new standard WAF for the next couple of years.
Posted November 20th, 2005 by Tom Hofte | 3 Comments »
The following is a nifty trick which is used in many datawarehouses around. In any case: a time dimension is one of the most common objects of a datawarehouse and one with which you can quickly give some new data insights to your customer. If you have some easy to collect facts (e.g. financial figures or amounts) and an easy company hierarchy you can quickly show some trends in time.
To build a time dimension, use the table function supplied by the Oracle DBMS. You can’t develop these from Oracle Warehouse Builder so you need to do that in your SQL client. You can however call these functions from your mapping (surprisingly called ‘table function operator’). What you need to do is to create a function like this one with two input fields (start date and end date) and as output however you want your time dimension to be. For a table function to run, you also need to add the uses database types. In this example its the TIME_DIMENSION_DATA_TABLE type and the TIME_DIMENSION_DATA record type. The code for this is:
CREATE OR REPLACE
TYPE TIME_DIMENSION_DATA AS OBJECT
( Periode_Key NUMBER ,
DAG_NUM VARCHAR2(15),
DAG_OMS VARCHAR2(15),
WEEK_NR NUMBER,
MAAND_NR NUMBER(2),
MAAND_DAGNR NUMBER,
MAAND_NAAM VARCHAR2(15),
JAAR_DAGNR NUMBER(3),
JAAR_JAAR NUMBER(4)
);
For the record type
And
CREATE OR REPLACE
TYPE TIME_DIMENSION_DATA_TABLE AS TABLE OF TIME_DIMENSION_DATA
For the table type.
The datatypes in the record type and the table function should obviously match and need to be present in order for the table function to run. But then again, you only need to do this once.
The code for the table function itself is as follows:
CREATE OR REPLACE FUNCTION Time_Dimension_Filler
( start_date IN NUMBER -- start date (as number format YYYYYMMDD )
, eind_date IN NUMBER -- end date (as number format YYYYYMMDD )
) RETURN TIME_DIMENSION_DATA_TABLE PIPELINED IS
out_rec TIME_DIMENSION_DATA := TIME_DIMENSION_DATA(NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL);
l_start_date date;
l_eind_date date;
l_dummy_date date;
l_dag_in_week number(1);
l_weekend varchar2(1);
l_verwerken boolean;
BEGIN
l_start_date := to_date(p_start_date, 'YYYYMMDD');
l_eind_date := to_date(p_eind_date , 'YYYYMMDD');
FOR l_datum_nr in p_start_date..p_eind_date LOOP
l_verwerken := TRUE;
BEGIN
l_dummy_date := to_date(to_char(l_datum_nr,'99999999'),'YYYYMMDD');
exception
when others then
l_verwerken := FALSE;
l_dummy_dat := null;
end;
if l_verwerken then
l_dag_in_week := to_number(to_char(l_dummy_date,'D'),'9');
if l_dag_in_week in (6,7) then
l_weekend := 'J';
else
l_weekend := 'J';
end if;
out_rec.Periode_Key := l_datum_nr;
out_rec.DAG_NUM := to_char(l_dummy_date,'DD/MM/YYYY');
out_rec.DAG_OMS := to_char(l_dummy_date,'DAY');
out_rec.WEEK_NR := to_number(to_char(l_dummy_date,'WW'),'99');
out_rec.MAAND_NR := to_number(to_char(l_dummy_date,'MM'),'99');
out_rec.MAAND_DAGNR := to_number(to_char(l_dummy_date,'DD'),'99');
out_rec.MAAND_NAAM := to_char(l_dummy_date,'MONTH');
out_rec.JAAR_DAGNR := to_number(to_char(l_dummy_date,'DDD'),'999');
out_rec.JAAR_JAAR := to_number(to_char(l_dummy_date,'YYYY'),'9999');
PIPE ROW(out_rec);
END IF;
END LOOP;
RETURN;
END;
/
All that remains is to set the input/ouput field and datatypes correctly in your Table Function Operator, this listens very closely. And you’re done!
Posted November 18th, 2005 by IT-eye Alumni | 1 Comment »
This post is meant to be upgrade every now and then with new insights. First of all when setting up your own environment for Oracle Business Intelligence, be sure to install Oracle Warehouse Builder (OWB) patch 10.1.0.4 to be able to use OWB with the Oracle 10.2 database. If you don’t you will get an error message saying that your sys as sysdba user doesn’t have enough privileges. Installing OWB on Oracle 10.1 should not cause any problems.
- Use the OWB runtime assistant for creating the to-be-used database schema’s for the the targets. The use of the repository assistant for your metadata repository is obligatory.
- Think long and hard over your OWB project name and location names, editing these later on causes untold numbers of errors.
- Use OWB for development of every PL/SQL function or procedure instead of creating it in SQLPlus or Toad and importing it into OWB. You can not edit the procedures after that so you’re obliged to keep editing them in the other client
- Try to keep your mappings (data transformation processes) limited to a single target. You can (of course) have multiple sources but try to design your proces so that only one logical target is used. You can have two instances for the same target if you need a seperate insert and update flow for instance.
- A target in your mappings means the end of your process flow. Try to resist the urge to continue from that point on but make a new mapping instead which takes your target as a source for the next mapping. Continuing data tranfsormation processes though a number of targets makes debugging unnecessarily difficult.
- Try to be as explicit as possible, it’s very tempting to make solutions in PL/SQL but try to make the most of the functions available in OWB. It may seem like it’s not very efficient but in a good mapping design one can graps the meaning more quickly if your solution is spread over a number of operators (transformations), rather than a dense PL/SQL code.
- Make datatype transformations explicit! It may work well for instance to connect a numeric attribute to a varchar attribute technically but it pays to use a (OWB standard) TO_CHAR in every case.
- If you made it to the final datamart of star-schema fact table mapping. Be sure to spend some time on your pre and post session procedures. You definitely want to drop all the constraints every update run on your datawarehouse.
- The same goes for indexing, use pre and post session commands to drop or create indexes and recalculate statistics.
- When using Oracle Workflow in combination with Oracle Warehouse Builder; its wiser to implement point 8 and 9 in the workflow so everyone can watch and know the different sequential steps in the whole workflow. You need to create a Oracle Workflow repository for this from the database installer.
Posted November 17th, 2005 by IT-eye Alumni | No Comments »

Microsoft just anounced that if you download their express tools before november 7th 2006, you can use them for free. Included are visual web developer, visual basic, visual c#, visual c++, visual j++, and sql server. More info can be found on msdn: Announcing the release of Visual Studio 2005 Express Editions. But wait, there’s more! Sun has also made their Java Studio Creator IDE free.
It’s good to see so many free tools, i’ll be easier for developers to pick the right tool for the job. I’ve tried Java Studio creator in the past, but i think JDeveloper is a better IDE for building JSF applications. I’m curious to see how Visual studio compares to JDeveloper.
Posted November 9th, 2005 by Andrej Koelewijn | No Comments »