ºÝºÝߣ

ºÝºÝߣShare a Scribd company logo
Case Study:
PL/SQL API as data
protection mechanism
in Oracle databases
by Ziemowit Jankowski
Database Architect
Intended use
¡ñ Large databases
¡ñ Intensive updates
¡ñ Heterogeneous environment
¡ñ Many external actors with potentially
misbehaved applications or ad-hoc
queries
¡ñ Due to enterprise nature impossible to
know who is doing what and how
¡ñ Non-existent or insufficient GUI-
capabilities for users
¡ñ Consistency in updates (triggers and
constraints)
¡ñ Updating/reading in correct context
(triggers, VPD)
¡ñ Authorized access (Oracle authorization,
roles, VPD, ACL)
¡ñ Protection against misbehaved
application:
¡ñ Reading
¡ñ Updating (e.g. improper transaction
handling)
Data protection ¨C a look at the
broader meaning
¡ñ Exposing data structures to external
actors:
o Access to ¡°more than needed¡±
o Dependency on data model
o ¡°Funny¡± updates
¡ñ Misbehaved applications and/or users
¡ñ Mal-formed SQL queries
o Sub-optimal data access paths
¡ñ Possible need for multiple
implementations of business logic
A few pitfalls for ¡°classic¡±
approach
Example scenarios
Database
(triggers,
constraints,
VPD etc)
App server
Valid
ation
Automated
application
Other
database
GUI-based
application
? Updating
? Validation of data
? One place, one algorithm
? Complex validation algorithms
? Avoiding bad application / user behaviour
? Updates independent from data model
? Retrieval
? Controlled access paths (performance)
? Validation of search conditions
? Results independent from data model
? Logging
? Security through obfuscation
Some benefits of an API in data
handling
? Software platform specific:
? .NET
? Java
? Others
? Common platform
? Oracle PL/SQL
Different API implementations
? Pros
? One point, one algorithm for all data
validation
? All ¡±internal¡± data validation close to data
store (overhead elimination)
? No ¡±out of control¡± access paths
? Cons
? PL/SQL based
? Hard to understand for novice programmers
Oracle PL/SQL API
Example scenarios revisited
Container schema
(data, metadata, stored
code)
App server
Validation
Automated
application
Other
database
GUI-based
application
Access
schema
Access
schema
Access
schema
? Container schema includes:
? Data containers (tables, views, mviews)
incl. triggers, constraints etc.
? Data retrieval and data manipulation code
? Interface data model, abstracted from
underlying data model
? Access schema includes:
? Execution rights to data retrieval and
manipulation code in container schema
? Synonyms pointing at relevant code in
container schema
General principles
? Actors connect to database through
access schema
? Actors do not have direct access to data
? Actors do not have direct access to
meta-data
? Actors have to use predefined access
paths in orderly manner
? Actors cannot circumvent data
validation, on purpose or by accident
Usage
? Usually not a major problem
? Might become major performance issue:
? Data-intensive updates
? Large data volumes
? Time critical updates (e.g. quasi-realtime)
? Non-trivial validation
? No possibility to take in ¡°dirty data¡± and
validate later on
? In-database validation inside API usually
faster than external application
Data validation
? Examples:
? Long rollbacks
? Commit waiting for user interaction
? Long transactions on quasi-real-time data
? Culprits:
? Rookie application programmers (been
there, seen that)
? Interactive users
? Insufficient validation of data
Eliminated by API that enforces some basic
rules.
Bad behaviour
? Enforced usage of correct data types
? Example:
WHERE to_char(dateCol,¡¯YYYYMM¡¯) = ¡¯200101¡¯
? SQL statements entirely within API code
? No user-generated statements that cause
poor performance
Controlled access paths
? The API defines own Oracle types for
data retrieval and update
? The API types may or may not
correspond to actual data containers
? Changes in table layout can be hidden
from caller
? Backward compatibility
? No need to rewrite all applications using the
altered table/-s.
Independency from data model
(retrieval and updates)
? Procedures returning strongly typed
cursors
? At all costs avoid SYS_REFCURSOR
? Alternative: use pipelined functions that
return strongly typed data
? Might lead to undesirable side-effects with
WHERE-clauses
? In both above cases: strongly typed
parameters help avoiding full table scans
due to malformed WHERE-clauses
Some implementation details ¨C
data retrieval
? Procedures accept strongly typed
parameters
? Procedures return status code or
message
Some implementation details ¨C
data update
? SOA
? Will not deliver same throughput
? Will lock you in an infrastructure
Alternatives?

More Related Content

Case study- PL-SQL API as data protection mechanism

  • 1. Case Study: PL/SQL API as data protection mechanism in Oracle databases by Ziemowit Jankowski Database Architect
  • 2. Intended use ¡ñ Large databases ¡ñ Intensive updates ¡ñ Heterogeneous environment ¡ñ Many external actors with potentially misbehaved applications or ad-hoc queries ¡ñ Due to enterprise nature impossible to know who is doing what and how ¡ñ Non-existent or insufficient GUI- capabilities for users
  • 3. ¡ñ Consistency in updates (triggers and constraints) ¡ñ Updating/reading in correct context (triggers, VPD) ¡ñ Authorized access (Oracle authorization, roles, VPD, ACL) ¡ñ Protection against misbehaved application: ¡ñ Reading ¡ñ Updating (e.g. improper transaction handling) Data protection ¨C a look at the broader meaning
  • 4. ¡ñ Exposing data structures to external actors: o Access to ¡°more than needed¡± o Dependency on data model o ¡°Funny¡± updates ¡ñ Misbehaved applications and/or users ¡ñ Mal-formed SQL queries o Sub-optimal data access paths ¡ñ Possible need for multiple implementations of business logic A few pitfalls for ¡°classic¡± approach
  • 5. Example scenarios Database (triggers, constraints, VPD etc) App server Valid ation Automated application Other database GUI-based application
  • 6. ? Updating ? Validation of data ? One place, one algorithm ? Complex validation algorithms ? Avoiding bad application / user behaviour ? Updates independent from data model ? Retrieval ? Controlled access paths (performance) ? Validation of search conditions ? Results independent from data model ? Logging ? Security through obfuscation Some benefits of an API in data handling
  • 7. ? Software platform specific: ? .NET ? Java ? Others ? Common platform ? Oracle PL/SQL Different API implementations
  • 8. ? Pros ? One point, one algorithm for all data validation ? All ¡±internal¡± data validation close to data store (overhead elimination) ? No ¡±out of control¡± access paths ? Cons ? PL/SQL based ? Hard to understand for novice programmers Oracle PL/SQL API
  • 9. Example scenarios revisited Container schema (data, metadata, stored code) App server Validation Automated application Other database GUI-based application Access schema Access schema Access schema
  • 10. ? Container schema includes: ? Data containers (tables, views, mviews) incl. triggers, constraints etc. ? Data retrieval and data manipulation code ? Interface data model, abstracted from underlying data model ? Access schema includes: ? Execution rights to data retrieval and manipulation code in container schema ? Synonyms pointing at relevant code in container schema General principles
  • 11. ? Actors connect to database through access schema ? Actors do not have direct access to data ? Actors do not have direct access to meta-data ? Actors have to use predefined access paths in orderly manner ? Actors cannot circumvent data validation, on purpose or by accident Usage
  • 12. ? Usually not a major problem ? Might become major performance issue: ? Data-intensive updates ? Large data volumes ? Time critical updates (e.g. quasi-realtime) ? Non-trivial validation ? No possibility to take in ¡°dirty data¡± and validate later on ? In-database validation inside API usually faster than external application Data validation
  • 13. ? Examples: ? Long rollbacks ? Commit waiting for user interaction ? Long transactions on quasi-real-time data ? Culprits: ? Rookie application programmers (been there, seen that) ? Interactive users ? Insufficient validation of data Eliminated by API that enforces some basic rules. Bad behaviour
  • 14. ? Enforced usage of correct data types ? Example: WHERE to_char(dateCol,¡¯YYYYMM¡¯) = ¡¯200101¡¯ ? SQL statements entirely within API code ? No user-generated statements that cause poor performance Controlled access paths
  • 15. ? The API defines own Oracle types for data retrieval and update ? The API types may or may not correspond to actual data containers ? Changes in table layout can be hidden from caller ? Backward compatibility ? No need to rewrite all applications using the altered table/-s. Independency from data model (retrieval and updates)
  • 16. ? Procedures returning strongly typed cursors ? At all costs avoid SYS_REFCURSOR ? Alternative: use pipelined functions that return strongly typed data ? Might lead to undesirable side-effects with WHERE-clauses ? In both above cases: strongly typed parameters help avoiding full table scans due to malformed WHERE-clauses Some implementation details ¨C data retrieval
  • 17. ? Procedures accept strongly typed parameters ? Procedures return status code or message Some implementation details ¨C data update
  • 18. ? SOA ? Will not deliver same throughput ? Will lock you in an infrastructure Alternatives?