Just a quickie to demonstrate how easy it is to use MD5 hashes instead of the normal hash function in your owb mappings:.
case when INGRP1.TO_BE_HASHED is null then null else RAWTOHEX(DBMS_OBFUSCATION_TOOLKIT.MD5(input_string=> INGRP1.TO_BE_HASHED )) end
I used this make personal information in a table non-reversable unreadable. As you might know straightforward hashing is not unique enough in large datasets.
Thats the reason why i used an MD5 hash.
Later on in the project i was faced with the question of reverting the process. The powers that be have a way of changing their needs...
So i thought of taking the source, with plain data, md5 hash it and compare that to the stored value.
That works but when you have a large dataset you'll need an index, and this :
RAWTOHEX(DBMS_OBFUSCATION_TOOLKIT.MD5(input_string=> INGRP1.TO_BE_HASHED ))
doesn't index.
You'll need an function-based index for this.
However using a function-based index with this throws a non-deterministic error. This means that the md5 function is compiled with the deterministic flag off, and as such can not be used in fbi's.
So..
I created a wrapper function that just makes the MD5 function deterministic.
See here:
CREATE OR REPLACE FUNCTION md5er (a_hash IN varchar2) RETURN varchar2 DETERMINISTIC IS BEGIN RETURN((DBMS_OBFUSCATION_TOOLKIT.MD5(input_string=> a_hash))); END;
Then you can create the fbi on the stage table that holds the source:
See here:
create index STG_EXPOR_FBIDX1 on STG_EXPORT(RAWTOHEX(md5er(a_hash=> nvl(,'empty') )));
Fbi's need their index updated with this:
exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'stg_export', estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL HIDDEN COLUMNS SIZE 1');
update DWH.dwh_person_h ph set ph.person_number = nvl(( select xt.plain_data from ( select xt1.source_field source_field from stg.stg_export xt1 group by xt1.source_field ) xt where RAWTOHEX(md5er(a_hash=> nvl(xt.source_field,'empty') ))= ph.person_number ), ph.person_number)--for persons not in the source anymore ;
And all of a sudden another page has managed to fill itself..
BTW: i promised myself to write more often, once a year is not enough..
I have more in the works, and the XML post needs another follow-up.
No comments:
Post a Comment