SkyIron Performance Analysis

SkyIron is an ASP service run by Perot Systems / Time0 that provides various business services over the Web. In July 2000, SkyIron's main service was SkySearch, a system that indexed some related web-sites and provided a search service specific to that area of interest.

SkySearch's offline indexing was very efficient, limited primarily by network bandwidth. But its online search had inadequate performance, with unacceptable response times.

Prior work had isolated the problem as occurring within the Oracle/Intermedia database, as called from Java. I performed a detailed analysis of that section of the system, writing benchmarking code in Java to reproduce and analyze the performance. I also executed various comparisons to determine nominal performance levels, e.g., comparing Intermedia with normal Oracle indexes. I also investigated typical usage of the search system by the actual users.

Based on my research, I made four specific recommendations:

All these recommendations were implemented by the SkyIron team. This led to much better performance of the system and allowed it to handle a larger number of users.

Example result grouping algorithm that I wrote in PL/SQL:

     1  CREATE OR REPLACE PACKAGE BODY rujith AS
     2    PROCEDURE get_search_urls (dm_id_t IN NUMBER,
     3                               Query IN VARCHAR2,
     4                               category IN VARCHAR2,
     5                               Start_rec_no IN NUMBER,
     6                               no_of_records    IN NUMBER,
     7                               req_id   OUT NUMBER,
     8                               d_cursor OUT dtlookup,
     9                               status   OUT NUMBER,
    10                               total_rows   OUT NUMBER,
    11                               industry_t   OUT VARCHAR2) IS
    12    total   NUMBER;
    13    rrr     NUMBER;
    14    row_no  NUMBER;
    15    end_rec_no  NUMBER;
    16    site_id NUMBER;
    17    r_no    NUMBER;
    18    max_rows NUMBER;
    19  
    20    TYPE req_accum IS RECORD
    21     (url       VARCHAR2(500),
    22      summary   VARCHAR2(255),
    23      title     VARCHAR2(255),
    24      row_no    NUMBER,
    25      num_urls  NUMBER);
    26    each_accum  req_accum;
    27    i   NUMBER;
    28    j   NUMBER;
    29    count_status    NUMBER;
    30  
    31    TYPE t_req IS TABLE OF req_accum INDEX BY BINARY_INTEGER;
    32  
    33    table_req   t_req;
    34  
    35    res_table   doc_result_table;
    36  
    37    BEGIN
    38      DECLARE
    39        CURSOR doc_cursor IS
    40          SELECT /*+ FIRST_ROWS */  SCORE(1),
    41            url, summary, title, site_id
    42          FROM document
    43          WHERE site_id IN
    44          (SELECT site_id FROM dm_site WHERE dm_id = dm_id_t)
    45          AND CONTAINS (filename, query, 1) > 0
    46          ORDER BY score(1) DESC;
    47        doc_record doc_cursor%ROWTYPE;
    48  
    49      BEGIN
    50  
    51        SELECT req_seq.nextval INTO rrr FROM dual;
    52        req_id := rrr; /* req_id is the output parameter */
    53        SELECT TO_NUMBER(value) INTO max_rows FROM sky_codes
    54        WHERE code = 'PG';
    55  
    56        SELECT industry INTO industry_t FROM dm WHERE dm_id = dm_id_t;
    57  
    58        row_no := 1;
    59        end_rec_no := start_rec_no + no_of_records;
    60  
    61        i := 0;
    62        r_no := 1;
    63        FOR docrec IN doc_cursor LOOP
    64          site_id := docrec.site_id;
    65          IF (site_id is not null) THEN
    66            IF (table_req.exists(site_id)) THEN
    67              table_req(site_id).num_urls :=
    68                table_req(site_id).num_urls +1;
    69            ELSE
    70              total := table_req.COUNT + 1;
    71              table_req(site_id).url := docrec.url;
    72              table_req(site_id).summary := docrec.summary;
    73              table_req(site_id).title := docrec.title;
    74              table_req(site_id).row_no := r_no;
    75              table_req(site_id).num_urls := 1;
    76              r_no := r_no+1;
    77              IF (r_no >= end_rec_no) THEN EXIT; END IF;
    78            END IF;
    79          END IF;
    80  
    81          IF (row_no = max_rows) THEN EXIT; END IF;
    82  
    83          row_no := row_no + 1;
    84  
    85        END LOOP;
    86  
    87        total := table_req.COUNT;
    88        i := table_req.FIRST;
    89  
    90        res_table := doc_result_table();
    91        j := 1;
    92        count_status := 0;
    93        WHILE i IS NOT NULL LOOP
    94          each_accum := table_req(i);          
    95          IF (each_accum.row_no >= start_rec_no AND
    96            each_accum.row_no < end_rec_no) THEN
    97            res_table.EXTEND;
    98            res_table(j) :=
    99              doc_result(each_accum.url, each_accum.summary,
   100                each_accum.title, i, each_accum.num_urls,
   101                each_accum.row_no);
   102            count_status := count_status + 1;
   103            j := j + 1;
   104          END IF;
   105          i := table_req.next(i);
   106        END LOOP;
   107  
   108        total_rows := total;
   109  
   110        status := count_status;
   111  
   112        IF doc_cursor%ISOPEN THEN CLOSE doc_cursor; END IF;
   113  
   114        OPEN d_cursor FOR
   115          SELECT /*+ FIRST ROWS */ url, summary, title,
   116              site_id, num_urls, row_no
   117          FROM TABLE(CAST(res_table AS doc_result_table))
   118          ORDER BY row_no;
   119  
   120      END;
   121    END get_search_urls;
   122  END rujith;

Rujith de Silva
Created 2001-02-05; modified 2001-02-05