|Oracle® Enterprise Data Quality for Product Data Oracle DataLens Server Administration Guide
Part Number E23614-02
This appendix describes steps that can be taken to improve the throughput of the servers. The emphasis is on running DSA jobs as fast as possible.
The most accurate way to check the timing is to place a timer around the calls to run the DSA.
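A minimal sketch of this approach, in Python, with `run_dsa` standing in as a hypothetical placeholder for however your application actually invokes the DSA (typically a SOAP/web-service call):

```python
import time

def run_dsa(job_input):
    """Placeholder for the real DSA invocation (hypothetical name).

    A real call would submit job_input to the Oracle DataLens Server
    and block until the job completes.
    """
    time.sleep(0.05)  # simulate a 50 ms job
    return "ok"

# Place the timer immediately around the DSA call so nothing else
# (input preparation, result handling) is included in the measurement.
start = time.perf_counter()
result = run_dsa(["line 1", "line 2"])
elapsed = time.perf_counter() - start
print(f"DSA job finished in {elapsed:.3f} s, result: {result}")
```

Using a monotonic timer such as `perf_counter` avoids skew from system clock adjustments during long jobs.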
Another way is to look at the results of the job in the Administration Web pages and check the duration of the job.
This cannot be taken advantage of until there are two or more Oracle DataLens Servers in a single Server Group. The Server Group then provides automatic load balancing and failover for all servers within it.
When running the application, be certain to call one of these production servers in the Server Group and not call the Admin server.
Manual load balancing can be performed for the servers in a single Server Group by selecting which data lenses are loaded by each server. Additionally, DSAs can be loaded on a server-by-server basis. It is recommended that each server be set up with all of the data lenses and DSAs, allowing the Oracle DataLens Server to control the load balancing internally.
Tracing is turned off by default and is only turned on by Oracle Consulting Services to trace information flow in the system. It can be turned off in the Options menu of the Administration Web pages. Additionally, there is a set of scs.trace.network flags that should be omitted or set to false in the server.cfg configuration file.
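For example, such an entry in server.cfg would look like the following (a sketch; the exact set of flags under the scs.trace.network prefix may vary by release):

```
# Disable network tracing; these entries can also be omitted entirely
scs.trace.network=false
```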
Each step in a DSA incurs additional overhead, because job information is stored in the RDBMS repository for each step of a DSA. Additionally, there is overhead to package up and ship the SOAP data contents from the DSA to each step during processing. This means that simplifying the DSA structure and placing as much of the process flow as possible inside Decision Maps improves the speed of execution. Timing improvements of up to 0.2 seconds have been observed for each DSA step that is replaced with a Decision Map.
Ultra-high priority jobs are supported. These DSA jobs do not store step information in the RDBMS repository, eliminating that overhead at the expense of job information and details for completed jobs. Ultra-high priority makes sense especially for single-line jobs, because job execution is as fast as possible and job details on thousands of single-line jobs would only clog the DSA Job Status Administration Web pages.
As a rule, huge jobs should be run at a low priority, leaving processing cycles for medium and small high-priority jobs. DSA jobs with a small number of input records, and jobs where the user is waiting for a response, should be run at a high priority to get the fastest response time.
By default, when a DSA is processed by the Oracle DataLens Server, all data is held in memory unless more than 5000 records are processed in a single DSA job. The speed of execution of these large jobs can be increased by setting the number of data records that are held in memory between processing steps. This is controlled by a setting in the server.cfg configuration file.
Individual data lenses can cache parsing rules in memory for reuse, avoiding reloading the rule each time. This is most useful for data lenses that process the same data repeatedly; examples include manufacturer names, redundant data, and part numbers that are reused often. Data lenses that are not good candidates are those that process data such as descriptions, which differ each time and would require a different parse tree for each line.
The cache should be large enough that the most often repeated lines stay in memory (the cache is an LRU queue in which the least recently used rules drop out of memory). For instance, if there are 300 manufacturer names that are often reused among several thousand names, the cache should be set to 1000 or perhaps 2000, depending on the frequency of use, to ensure that the 300 most often used names remain in memory.
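The eviction behavior described above can be illustrated with a small LRU cache sketch (purely illustrative; this is not the data lens implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal least-recently-used cache: frequently repeated keys stay
    resident, while rarely used keys are evicted once capacity is reached."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used

# With capacity 2, the frequently accessed entry survives eviction.
cache = LRUCache(2)
cache.put("ACME", "parsed-rule-1")
cache.put("GLOBEX", "parsed-rule-2")
cache.get("ACME")                       # ACME becomes most recently used
cache.put("INITECH", "parsed-rule-3")   # evicts GLOBEX, not ACME
print(cache.get("GLOBEX"))              # None - evicted
print(cache.get("ACME"))                # still cached
```

This is why the cache should be sized above the number of hot entries: as long as the 300 frequent names keep being accessed, they remain near the "recently used" end and are never evicted.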
This change is required for each data lens that needs caching:
1. Check out the data lens to the client.
2. Go to the C:\Datalens\Applications\data\cbidwell\project\CablesF\config directory.
3. Edit the Project.xml file and modify the cache size setting.
4. Save and check in the project after making this change.
When running in a production environment, the data lenses loaded are determined by which lenses are deployed to Production. Do not deploy data lenses to Production if they will not be used for actual production DSA jobs.
You can fine-tune which data lenses are used by a particular server by setting the data lenses that are loaded by each production Oracle DataLens Server.
Set the number of parameterized domain instances that are loaded into memory. For a single domain with two instances, set the number of instances to three to maximize performance when using these domains:
One for the first parameterized domain
Another for the second
A third for both in memory
This is set in the server.cfg file.
See the section "Tune memory usage on the servers" for information on memory limitations of Windows servers.
Linux and UNIX running on 64-bit hardware do not have the 1.6 GB memory limitation for the Java Web Server that has been observed on 32-bit Microsoft Windows servers. Windows 64-bit servers do not have this memory limitation either.
Important: In an Enterprise DQ for Product production environment, run only on a 64-bit server with a 64-bit installation of Java. Never run a production environment on a 32-bit server.
In database-intensive DSAs, major performance improvements can be made by tuning the database DDL. Simple measures such as indexing fields that are searched on and reducing the number of tables in computationally intensive SQL joins can be very effective in improving DSA performance.
These tuning tasks are very dependent on the particular database schema and would need to be examined by a database professional or Oracle Consulting Services.
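As an illustration of the indexing point, the following sketch uses SQLite purely for portability (the table, column names, and index are invented for the example; the same principle applies to the production RDBMS):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (id INTEGER PRIMARY KEY, mfr_name TEXT, descr TEXT)"
)
conn.executemany(
    "INSERT INTO parts (mfr_name, descr) VALUES (?, ?)",
    [(f"MFR-{i % 300}", f"part {i}") for i in range(10_000)],
)

# Without an index, a lookup on mfr_name is a full table scan;
# with the index below, it becomes an index seek.
conn.execute("CREATE INDEX idx_parts_mfr ON parts (mfr_name)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM parts WHERE mfr_name = ?", ("MFR-42",)
).fetchall()
print(plan)  # the query plan should report a search USING INDEX idx_parts_mfr
conn.close()
```

Inspecting the query plan (EXPLAIN QUERY PLAN in SQLite, EXPLAIN PLAN in Oracle Database) before and after adding an index is the quickest way to confirm that a searched field is actually benefiting from it.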