SkyIron is an ASP service run by Perot Systems / Time0 that provides various business services over the Web. In July 2000, SkyIron's main service was SkySearch, a system that indexed some related web-sites and provided a search service specific to that area of interest.
SkySearch's offline indexing was very efficient, limited primarily by network bandwidth. But its online search had inadequate performance, with unacceptable response times.
Prior work had isolated the problem as occurring within the Oracle/Intermedia database, as called from Java. I performed a detailed analysis of that section of the system, writing benchmarking code in Java to reproduce and analyze the performance. I also executed various comparisons to determine nominal performance levels, e.g., comparing Intermedia with normal Oracle indexes. I also investigated typical usage of the search system by the actual users.
Based on my research, I made four specific recommendations:
All these recommendations were implemented by the SkyIron team. This led to much better performance of the system and allowed it to handle a larger number of users.
Example result grouping algorithm that I wrote in PL/SQL:
1 CREATE OR REPLACE PACKAGE BODY rujith AS
2 PROCEDURE get_search_urls (dm_id_t IN NUMBER,
3 Query IN VARCHAR2,
4 category IN VARCHAR2,
5 Start_rec_no IN NUMBER,
6 no_of_records IN NUMBER,
7 req_id OUT NUMBER,
8 d_cursor OUT dtlookup,
9 status OUT NUMBER,
10 total_rows OUT NUMBER,
11 industry_t OUT VARCHAR2) IS
12 total NUMBER;
13 rrr NUMBER;
14 row_no NUMBER;
15 end_rec_no NUMBER;
16 site_id NUMBER;
17 r_no NUMBER;
18 max_rows NUMBER;
19
20 TYPE req_accum IS RECORD
21 (url VARCHAR2(500),
22 summary VARCHAR2(255),
23 title VARCHAR2(255),
24 row_no NUMBER,
25 num_urls NUMBER);
26 each_accum req_accum;
27 i NUMBER;
28 j NUMBER;
29 count_status NUMBER;
30
31 TYPE t_req IS TABLE OF req_accum INDEX BY BINARY_INTEGER;
32
33 table_req t_req;
34
35 res_table doc_result_table;
36
37 BEGIN
38 DECLARE
39 CURSOR doc_cursor IS
40 SELECT /*+ FIRST_ROWS */ SCORE(1),
41 url, summary, title, site_id
42 FROM document
43 WHERE site_id IN
44 (SELECT site_id FROM dm_site WHERE dm_id = dm_id_t)
45 AND CONTAINS (filename, query, 1) > 0
46 ORDER BY score(1) DESC;
47 doc_record doc_cursor%ROWTYPE;
48
49 BEGIN
50
51 SELECT req_seq.nextval INTO rrr FROM dual;
52 req_id := rrr; /* req_id is the output parameter */
53 SELECT TO_NUMBER(value) INTO max_rows FROM sky_codes
54 WHERE code = 'PG';
55
56 SELECT industry INTO industry_t FROM dm WHERE dm_id = dm_id_t;
57
58 row_no := 1;
59 end_rec_no := start_rec_no + no_of_records;
60
61 i := 0;
62 r_no := 1;
63 FOR docrec IN doc_cursor LOOP
64 site_id := docrec.site_id;
65 IF (site_id is not null) THEN
66 IF (table_req.exists(site_id)) THEN
67 table_req(site_id).num_urls :=
68 table_req(site_id).num_urls +1;
69 ELSE
70 total := table_req.COUNT + 1;
71 table_req(site_id).url := docrec.url;
72 table_req(site_id).summary := docrec.summary;
73 table_req(site_id).title := docrec.title;
74 table_req(site_id).row_no := r_no;
75 table_req(site_id).num_urls := 1;
76 r_no := r_no+1;
77 IF (r_no >= end_rec_no) THEN EXIT; END IF;
78 END IF;
79 END IF;
80
81 IF (row_no = max_rows) THEN EXIT; END IF;
82
83 row_no := row_no + 1;
84
85 END LOOP;
86
87 total := table_req.COUNT;
88 i := table_req.FIRST;
89
90 res_table := doc_result_table();
91 j := 1;
92 count_status := 0;
93 WHILE i IS NOT NULL LOOP
94 each_accum := table_req(i);
95 IF (each_accum.row_no >= start_rec_no AND
96 each_accum.row_no < end_rec_no) THEN
97 res_table.EXTEND;
98 res_table(j) :=
99 doc_result(each_accum.url, each_accum.summary,
100 each_accum.title, i, each_accum.num_urls,
101 each_accum.row_no);
102 count_status := count_status + 1;
103 j := j + 1;
104 END IF;
105 i := table_req.next(i);
106 END LOOP;
107
108 total_rows := total;
109
110 status := count_status;
111
112 IF doc_cursor%ISOPEN THEN CLOSE doc_cursor; END IF;
113
114 OPEN d_cursor FOR
115 SELECT /*+ FIRST ROWS */ url, summary, title,
116 site_id, num_urls, row_no
117 FROM TABLE(CAST(res_table AS doc_result_table))
118 ORDER BY row_no;
119
120 END;
121 END get_search_urls;
122 END rujith;