1. Introduction
2. Sample agent files
3. Setting up the Sample Crawler Agent
1. Introduction
Oracle Ultra Search provides a sample implementation of a user-defined data source agent that uses the Ultra Search agent API. The purpose of this sample is to give a concrete illustration of how to use the agent APIs.
Upon invocation, this sample agent connects to a specified Oracle database and retrieves the contents of a table for the crawler to collect and index.
The sample agent is fully functional and can be customized to work with other database-based data sources. The agent performs the following tasks:
- Reading of data source parameters
- Connection to the database that contains the data source
- Initialization of fetching document URL and attributes from the data source
- Fetching of document URL and attributes from the data source
- Disconnection from the data source
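The order of these tasks can be outlined with a minimal sketch. Every name below (AgentLifecycleSketch, readParameters, connect, startFetch, fetchNext, disconnect, crawl) is hypothetical and chosen only for illustration; none of them is taken from the Ultra Search agent API, and an in-memory list stands in for the database table.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical outline of the five tasks listed above.
 * These method names are NOT the Ultra Search agent API.
 */
class AgentLifecycleSketch {
    private final List<String> rows;   // in-memory stand-in for the URL table
    private Iterator<String> cursor;   // position in the current fetch
    private boolean connected;

    AgentLifecycleSketch(List<String> rows) {
        this.rows = rows;
    }

    /** Task 1: read data source parameters (nothing to read in this sketch). */
    void readParameters() {
    }

    /** Task 2: connect to the data source. */
    void connect() {
        connected = true;
    }

    /** Task 3: initialize fetching. */
    void startFetch() {
        if (!connected) {
            throw new IllegalStateException("connect() must be called first");
        }
        cursor = rows.iterator();
    }

    /** Task 4: return the next document URL, or null when none remain. */
    String fetchNext() {
        if (cursor == null) {
            throw new IllegalStateException("startFetch() must be called first");
        }
        return cursor.hasNext() ? cursor.next() : null;
    }

    /** Task 5: disconnect from the data source. */
    void disconnect() {
        connected = false;
        cursor = null;
    }

    /** Drives the agent through the tasks in order and collects the URLs. */
    static List<String> crawl(AgentLifecycleSketch agent) {
        List<String> urls = new ArrayList<>();
        agent.readParameters();
        agent.connect();
        try {
            agent.startFetch();
            for (String url = agent.fetchNext(); url != null; url = agent.fetchNext()) {
                urls.add(url);
            }
        } finally {
            agent.disconnect();   // always release the connection
        }
        return urls;
    }
}
```

The try/finally in crawl is a design choice of the sketch: it makes sure the disconnect step runs even if a fetch fails partway through.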
Complete documentation of the agent API is provided in the following document:
- Ultra Search Crawler agent API
2. Sample agent files
The sample agent files are located in the $ORACLE_HOME/ultrasearch/sample directory. You can directly view the sample agent source code using your preferred text editor.
The following table lists and describes all sample agent files:
| File | Description |
| sample_agent_readme.html | This file |
| SampleAgent.java | Sample crawler agent implementation using agent APIs |
3. Setting up the Sample Crawler Agent
3.1 Compile and build agent jar file
The Java source code for the sample agent must first be compiled into class files and then put into a jar file under the $ORACLE_HOME/ultrasearch/lib/agent/ directory. The classes needed for compilation are the JDK classes (classes.zip), the Oracle JDBC thin driver (classes12.zip), and ultrasearch.jar. For example:
javac -J-ms16m -J-mx96m -O -classpath /jdk1.2.2_05/lib/classes.zip:/lib/classes12.zip:$ORACLE_HOME/ultrasearch/lib/ultrasearch.jar SampleAgent.java
To build the sampleAgent.jar file:
/jdk1.2.2_05/bin/jar cv0f /oracle/ultrasearch/lib/agent/sampleAgent.jar SampleAgent.class 'SampleAgent$DocNode.class'
3.2 Creating a data source type
A data source type that uses the sample agent must be created first, with the following values:
- Name - URL Table Type
- Description - Table with Rows of URLs
- Agent Name - SampleAgent
- Agent Jar File - sampleagent
3.3 Defining data source parameters
Next, the parameters of the data source type are defined:
- Database Connect String (DB connection)
- User Name (schema owner of the URL table)
- Password (schema owner password, encrypted)
- Table Name (URL table name)
- URL Column (Column holding doc URLs)
- Ignore Flag Column (1 for ignoring, 0 otherwise)
- Language Column (Document Language)
- Attribute List (List of columns for attributes)
It is in the following format: [column name/attribute name/type][column name/attribute name/type] ... where type is 0 for number, 1 for string, and 2 for date. For example, if the document has 4 attributes (Company Name, Category, Revenue, and S&P Rating), the list is specified as: [Company Name/Company/1][Category/Classification/1][Revenue/Revenue/0][Rating/Analyst Rating/1]
- Log File Name (log file)
- Log Directory (Location of log file)
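To make the Attribute List format concrete, here is a minimal parsing sketch. It assumes that each bracketed entry holds a column name, an attribute name, and a type code (0 for number, 1 for string, 2 for date) separated by slashes, as in the examples in this document. The class AttributeListParser and its members are hypothetical and are not part of the sample agent; how SampleAgent.java actually parses this parameter may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for the Attribute List parameter (hypothetical, not from SampleAgent.java). */
class AttributeListParser {

    /** One [column name/attribute name/type] entry. */
    static final class Attribute {
        final String column;
        final String name;
        final int type;   // 0 = number, 1 = string, 2 = date

        Attribute(String column, String name, int type) {
            this.column = column;
            this.name = name;
            this.type = type;
        }

        String typeName() {
            switch (type) {
                case 0: return "number";
                case 1: return "string";
                default: return "date";
            }
        }
    }

    // One bracketed entry; names may not contain '/', '[' or ']'.
    private static final Pattern ENTRY =
        Pattern.compile("\\[([^/\\[\\]]+)/([^/\\[\\]]+)/([012])\\]");

    /** Parses the whole list and rejects any text that is not a valid entry. */
    static List<Attribute> parse(String spec) {
        List<Attribute> result = new ArrayList<>();
        Matcher m = ENTRY.matcher(spec);
        int end = 0;
        while (m.find()) {
            // Only whitespace may appear between consecutive entries.
            if (!spec.substring(end, m.start()).trim().isEmpty()) {
                throw new IllegalArgumentException("Malformed attribute list near offset " + end);
            }
            result.add(new Attribute(m.group(1).trim(), m.group(2).trim(),
                                     Integer.parseInt(m.group(3))));
            end = m.end();
        }
        if (!spec.substring(end).trim().isEmpty()) {
            throw new IllegalArgumentException("Malformed attribute list near offset " + end);
        }
        return result;
    }

    public static void main(String[] args) {
        String spec = "[Company Name/Company/1][Revenue/Revenue/0]";
        for (Attribute a : parse(spec)) {
            System.out.println(a.column + " -> " + a.name + " (" + a.typeName() + ")");
        }
    }
}
```

The sketch fails loudly on anything outside the bracketed entries, so a mistyped type code or a missing bracket is reported instead of being silently dropped.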
3.4 Defining a data source of this type
A data source of this type is then defined, which supplies values for the data source parameters. As an example, the values specified here are used to access a table whose schema is:
TABLE NEWS (
  ARTICLE_NO NUMBER,
  NEWS_URL VARCHAR2(740),
  TITLE VARCHAR2(200),
  AUTHOR VARCHAR2(100),
  PUB_DATE DATE default SYSDATE,
  PUBLISHER VARCHAR2(100),
  PRICE NUMBER,
  LANG VARCHAR2(10),
  IGNORE NUMBER DEFAULT 0,
  PRIMARY KEY (NEWS_URL)
);
- Database Connect String - dlsun1710:5521:search
- User Name - SCOTT
- Password - TIGER
- Table Name - NEWS
- URL Column - NEWS_URL
- Ignore Flag Column - IGNORE
- Language Column - LANG
- Attribute List - [ARTICLE_NO/Article Number/0][TITLE/Article Title/1][AUTHOR/Author/1][PUB_DATE/Report Date/2][PUBLISHER/Newspaper/1][PRICE/Download Cost/0]
- Log File Name - testagent.log
- Log Directory - /tmp/ultrasearch/
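As a rough illustration of how parameter values like these could be turned into a database request, the sketch below builds a JDBC thin connection URL and a SELECT statement from them. It assumes the connect string has the host:port:SID form used by the Oracle JDBC thin driver, and that rows whose ignore flag is 0 are the ones to collect. QuerySketch, jdbcUrl, and buildSelect are hypothetical names, and the query that SampleAgent.java actually issues may be different.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not taken from SampleAgent.java. */
class QuerySketch {

    /** Builds a JDBC thin URL from a host:port:SID connect string. */
    static String jdbcUrl(String connectString) {
        return "jdbc:oracle:thin:@" + connectString;
    }

    /**
     * Builds a SELECT over the URL, language, and attribute columns,
     * keeping only rows whose ignore flag is 0.
     * Identifiers are concatenated as-is; real code should validate them.
     */
    static String buildSelect(String table, String urlColumn, String ignoreColumn,
                              String languageColumn, List<String> attributeColumns) {
        StringBuilder sql = new StringBuilder("SELECT ");
        sql.append(urlColumn).append(", ").append(languageColumn);
        for (String column : attributeColumns) {
            sql.append(", ").append(column);
        }
        sql.append(" FROM ").append(table);
        sql.append(" WHERE ").append(ignoreColumn).append(" = 0");
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl("dlsun1710:5521:search"));
        // prints "jdbc:oracle:thin:@dlsun1710:5521:search"
        System.out.println(buildSelect("NEWS", "NEWS_URL", "IGNORE", "LANG",
                                       Arrays.asList("ARTICLE_NO", "TITLE")));
        // prints "SELECT NEWS_URL, LANG, ARTICLE_NO, TITLE FROM NEWS WHERE IGNORE = 0"
    }
}
```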