SkyIron is an ASP service run by Perot Systems / Time0 that provides various business services over the Web. In July 2000, SkyIron's main service was SkySearch, a system that indexed some related web-sites and provided a search service specific to that area of interest.
SkySearch's offline indexing was very efficient, limited primarily by network bandwidth. But its online search had inadequate performance, with unacceptable response times.
Prior work had isolated the problem as occurring within the Oracle/Intermedia database, as called from Java. I performed a detailed analysis of that section of the system, writing benchmarking code in Java to reproduce and analyze the performance. I also executed various comparisons to determine nominal performance levels, e.g., comparing Intermedia with normal Oracle indexes. I also investigated typical usage of the search system by the actual users.
Based on my research, I made four specific recommendations:
All these recommendations were implemented by the SkyIron team. This led to much better performance of the system and allowed it to handle a larger number of users.
Example result grouping algorithm that I wrote in PL/SQL:
1 CREATE OR REPLACE PACKAGE BODY rujith AS 2 PROCEDURE get_search_urls (dm_id_t IN NUMBER, 3 Query IN VARCHAR2, 4 category IN VARCHAR2, 5 Start_rec_no IN NUMBER, 6 no_of_records IN NUMBER, 7 req_id OUT NUMBER, 8 d_cursor OUT dtlookup, 9 status OUT NUMBER, 10 total_rows OUT NUMBER, 11 industry_t OUT VARCHAR2) IS 12 total NUMBER; 13 rrr NUMBER; 14 row_no NUMBER; 15 end_rec_no NUMBER; 16 site_id NUMBER; 17 r_no NUMBER; 18 max_rows NUMBER; 19 20 TYPE req_accum IS RECORD 21 (url VARCHAR2(500), 22 summary VARCHAR2(255), 23 title VARCHAR2(255), 24 row_no NUMBER, 25 num_urls NUMBER); 26 each_accum req_accum; 27 i NUMBER; 28 j NUMBER; 29 count_status NUMBER; 30 31 TYPE t_req IS TABLE OF req_accum INDEX BY BINARY_INTEGER; 32 33 table_req t_req; 34 35 res_table doc_result_table; 36 37 BEGIN 38 DECLARE 39 CURSOR doc_cursor IS 40 SELECT /*+ FIRST_ROWS */ SCORE(1), 41 url, summary, title, site_id 42 FROM document 43 WHERE site_id IN 44 (SELECT site_id FROM dm_site WHERE dm_id = dm_id_t) 45 AND CONTAINS (filename, query, 1) > 0 46 ORDER BY score(1) DESC; 47 doc_record doc_cursor%ROWTYPE; 48 49 BEGIN 50 51 SELECT req_seq.nextval INTO rrr FROM dual; 52 req_id := rrr; /* req_id is the output parameter */ 53 SELECT TO_NUMBER(value) INTO max_rows FROM sky_codes 54 WHERE code = 'PG'; 55 56 SELECT industry INTO industry_t FROM dm WHERE dm_id = dm_id_t; 57 58 row_no := 1; 59 end_rec_no := start_rec_no + no_of_records; 60 61 i := 0; 62 r_no := 1; 63 FOR docrec IN doc_cursor LOOP 64 site_id := docrec.site_id; 65 IF (site_id is not null) THEN 66 IF (table_req.exists(site_id)) THEN 67 table_req(site_id).num_urls := 68 table_req(site_id).num_urls +1; 69 ELSE 70 total := table_req.COUNT + 1; 71 table_req(site_id).url := docrec.url; 72 table_req(site_id).summary := docrec.summary; 73 table_req(site_id).title := docrec.title; 74 table_req(site_id).row_no := r_no; 75 table_req(site_id).num_urls := 1; 76 r_no := r_no+1; 77 IF (r_no >= end_rec_no) THEN EXIT; END IF; 78 END IF; 79 END IF; 80 81 IF (row_no = max_rows) THEN EXIT; END IF; 82 83 row_no := row_no + 1; 84 85 END LOOP; 86 87 total := table_req.COUNT; 88 i := table_req.FIRST; 89 90 res_table := doc_result_table(); 91 j := 1; 92 count_status := 0; 93 WHILE i IS NOT NULL LOOP 94 each_accum := table_req(i); 95 IF (each_accum.row_no >= start_rec_no AND 96 each_accum.row_no < end_rec_no) THEN 97 res_table.EXTEND; 98 res_table(j) := 99 doc_result(each_accum.url, each_accum.summary, 100 each_accum.title, i, each_accum.num_urls, 101 each_accum.row_no); 102 count_status := count_status + 1; 103 j := j + 1; 104 END IF; 105 i := table_req.next(i); 106 END LOOP; 107 108 total_rows := total; 109 110 status := count_status; 111 112 IF doc_cursor%ISOPEN THEN CLOSE doc_cursor; END IF; 113 114 OPEN d_cursor FOR 115 SELECT /*+ FIRST ROWS */ url, summary, title, 116 site_id, num_urls, row_no 117 FROM TABLE(CAST(res_table AS doc_result_table)) 118 ORDER BY row_no; 119 120 END; 121 END get_search_urls; 122 END rujith;