Journal reference: Computer Networks and ISDN Systems, Volume 28, Issues 7–11, p. 1523.
G. Jason Mathews and Barry E. Jacobs
This paper addresses the electronic peer review problem: how does one electronically manage the complex process of peer reviewing papers across a physically distributed set of participants (authors, reviewers, and administrators)? The peer review of papers submitted to the International World Wide Web (WWW) Conference Committee (IW3C2) serves as the running example of this process.
The peer review process (also called refereeing) can be thought of as a logistical problem that is relatively simple in concept. The peer review committee gets submissions, categorizes them, sends them to reviewers, collects reviews, makes final selections, and notifies submitters of the outcome. The process, however, gets complex in the implementation because of the scale involved and the geographic diversity of the program participants. There are several related activities in the literature [1, 2, 3].
Our solution to electronically managing the peer review process applies a methodology called Electronic Management Systems (EMS). In particular, we represent the entire peer review process as a virtual organization that uses the Web as its communications vehicle. Menus on the Web pages represent electronic processes, such as the peer review and user management processes. The key components of an EMS are forms, report tools, and databases.
Many EMSs have been implemented by the National Aeronautics and Space Administration (NASA) to provide a "paperless" capability that facilitates the operation of complex processes, such as the peer review process. Two such systems handled the peer review processes of the Fourth and Fifth International WWW Conferences (WWW4 and WWW5), which accepted papers worldwide and allowed conference committee members and reviewers to examine and evaluate them through a WWW forms-based interface to an Oracle Relational Database Management System (RDBMS). Related research projects have also developed WWW interfaces to Oracle [7, 8].
The approach taken in developing these systems extends electronic peer review research in several ways. First, developing the entire system requires little or no programming. Second, this is the first fully Web-based approach to modeling the complete peer review process. Third, we represent all subprocesses in terms of HTML forms and report tools over databases. Fourth, we provide a common set of report tools, dynamically generated from database schemas, that works for "all" databases and, hence, all processes.
The conference peer review EMSs (WWW4 EMS and WWW5 EMS) were designed for three types of users: 1) authors who submit their paper to the conference; 2) reviewers who submit evaluations of papers assigned to them; and 3) conference committee members who assign papers to reviewers and make the decision to accept or reject a paper.
The WWW4 EMS and WWW5 EMS are similar in design; however, the WWW5 EMS reorganized the processes into a more intuitive layout. Throughout this paper the term "EMS" generally refers to both systems except where indicated otherwise.
The EMS is a collection of hyperlinked menus, sub-menus, and forms that traverse a hierarchy of functions belonging to steps that define a process. Under this top layer is a relational database of four tables (submissions, assignments, users, and suggestions), and an assortment of Common Gateway Interface (CGI) scripts and auxiliary programs that access the database. The top-level menu structure of the EMS provides links to the following processes of the system:
An HTML document at each level of the menu hierarchy summarizes the available process steps and options.
The Bulletin Board provides quick access to and communication of information. For example, the Bulletin Board page has one menu item linking to the Reviewers Bulletin Board and another to the Administrator Bulletin Board. The Reviewers Bulletin Board provides a user interface to reviewer information and on-line forms for entering evaluations. The Administrator Bulletin Board provides a user interface to forms and information about the entire system; administrators edit it to maintain a hotlist of links into various parts of the system.
Peer Review Processes
The Peer Review Processes provide a facility for managing all of the peer review processes of the conference, which is discussed in detail later.
Users/Internal Staff Processes
The users menu allows the administrator to create new users and update/delete existing users. The users include administrators (EMS staff and conference committee members) and reviewers. Associated with every user is a user name and password through which the httpd server authenticates the user for various functions of the system. Thus, reviewers can enter only reviews of papers assigned to them; the administrator can access all the data; and people without authorization (no user name or password) can use only the public access forms to submit a paper.
The Suggestions Processes provide a facility for processing and tracking suggestions for improvement from its users. The menu provides an on-line form for users to enter suggestions and comments about the EMS, and another on-line form for administrators to respond to the suggestions.
The Report Tools provide a facility for making available tools for generating reports across all the databases corresponding to all the above processes. Each level of the EMS provides a collection of report tools that report on the associated database with that level. For example, the report tools submenu at the Users/Internal Staff Processes menu provides a report of all users of the system (including user ID, full name, and E-mail address), which is an embedded Structured Query Language (SQL) statement as input to a CGI program that queries the Oracle database and formats the output in HTML.
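Beneath these menus, as noted above, sits a relational database of four tables (submissions, assignments, users, and suggestions). A minimal sketch of that layout follows; the column names beyond those mentioned in the paper (submission ID, title, keywords, decision, user ID, full name, E-mail) are assumptions, and SQLite stands in for the Oracle RDBMS to keep the sketch self-contained.

```python
import sqlite3

# Illustrative schema for the four EMS tables; roles, score columns, and
# suggestion responses are assumed details, not the paper's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id    TEXT PRIMARY KEY,
    full_name  TEXT,
    email      TEXT,
    role       TEXT                  -- 'admin' or 'reviewer'
);
CREATE TABLE submissions (
    submission_id TEXT PRIMARY KEY,
    title         TEXT,
    topic         TEXT,
    keywords      TEXT,
    decision      TEXT               -- NULL, accept, reject, poster
);
CREATE TABLE assignments (
    submission_id TEXT REFERENCES submissions,
    reviewer_id   TEXT REFERENCES users,
    suitability   INTEGER,           -- 1 (poorest) .. 5 (best)
    PRIMARY KEY (submission_id, reviewer_id)
);
CREATE TABLE suggestions (
    suggestion_id INTEGER PRIMARY KEY,
    user_id       TEXT,
    text          TEXT,
    response      TEXT               -- administrator's reply
);
""")
conn.execute("INSERT INTO submissions VALUES ('112', 'A Paper', "
             "'Databases', 'database', NULL)")
rows = conn.execute("SELECT title FROM submissions "
                    "WHERE submission_id = '112'").fetchall()
print(rows[0][0])  # A Paper
```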
In this section we discuss the peer review processes, which represent the largest and most important part of the system. These processes involve soliciting submissions from potential authors and managing the incoming submissions as well as their evaluations. These aspects are classified as Initiation, Review, Rankings and Selections, and Announcements and Debriefings, where each process is further defined by the following subprocesses:
Call for Papers and Intent to Submit
After the conference committee announces the call for papers, authors are invited to fill in an intent-to-submit form that, when submitted to the server, executes a CGI script that assigns a unique paper ID number to the prospective author and sends mail to the author with this number as well as instructions for submitting a paper to the conference. This information is also used early in the process by the conference committee to assign papers to reviewers based on the proposed paper title and topic.
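The intent-to-submit handler described above can be sketched as follows. The paper-ID numbering scheme and the wording of the mail are assumptions; the paper does not specify them, and `send_mail` is injected so the sketch stays testable without a mail server.

```python
import itertools

# Assumed starting ID; the real numbering scheme is not specified in the paper.
_counter = itertools.count(100)

def handle_intent_to_submit(author_name, author_email, title, topic, send_mail):
    """Assign a unique paper ID and notify the prospective author.

    `send_mail` stands in for whatever mailer the CGI script used
    (e.g. piping to sendmail) so that the sketch is self-contained.
    """
    paper_id = str(next(_counter))
    body = (
        f"Dear {author_name},\n\n"
        f"Your intent to submit '{title}' (topic: {topic}) has been recorded.\n"
        f"Your paper ID is {paper_id}. Please quote it when uploading your\n"
        f"archive to the conference FTP server and registering the submission."
    )
    send_mail(to=author_email, subject=f"Paper ID {paper_id}", body=body)
    return paper_id

# Usage: collect outgoing mail instead of sending it.
outbox = []
pid = handle_intent_to_submit("A. Author", "a@example.org", "My Paper",
                              "Tools and Browsers",
                              lambda **m: outbox.append(m))
print(pid, outbox[0]["to"])  # 100 a@example.org
```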
The authors write their papers, mark them up in HTML under the guidelines of the conference, and archive them into single-file archives using tar, PKZIP, or an equivalent program on their platform. The papers are uploaded to the system as binary files via the File Transfer Protocol (FTP) and are processed when the author submits a registration form handled by a CGI script. The author connects to the conference FTP server with an FTP client and puts the submission file into the specified incoming directory. The incoming directory is write-only: authors may write files but may not read other authors' unprocessed submissions, since some submissions contain proprietary information not yet released to the general public. The author then registers the submission with an HTML form, entering information about the paper (submission ID number, title, conference topic, related keywords, uploaded file name) and the authors (name, E-mail address, mailing address, telephone, etc.). When the form is submitted to the server, the corresponding CGI script validates the information and, if valid, enters the submission information into the database and moves the files into the processed area where reviewing can commence.
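The registration step above can be sketched as follows. The specific checks, field names, and directory layout are assumptions for illustration; the paper does not reproduce the script's exact step list.

```python
import os
import tempfile

def register_submission(form, incoming_dir, processed_dir, db_insert):
    """Validate a registration form and ingest the uploaded archive.

    The required fields are those the paper says the form collects; the
    checks and directory layout are assumed details of this sketch.
    """
    required = ["submission_id", "title", "topic", "keywords",
                "upload_file", "author_name", "author_email"]
    missing = [f for f in required if not form.get(f)]
    if missing:
        return "ERROR: missing fields: " + ", ".join(missing)
    src = os.path.join(incoming_dir, form["upload_file"])
    if not os.path.exists(src):
        return f"ERROR: uploaded file {form['upload_file']} not found"
    # Record the submission and move the archive into the processed area,
    # where reviewers can access it under its submission ID.
    db_insert(form)
    dest = os.path.join(processed_dir, form["submission_id"])
    os.makedirs(dest, exist_ok=True)
    os.replace(src, os.path.join(dest, form["upload_file"]))
    return "OK"

# Usage with throwaway directories standing in for the FTP areas.
with tempfile.TemporaryDirectory() as inc, \
     tempfile.TemporaryDirectory() as proc:
    open(os.path.join(inc, "paper.tar"), "w").close()
    db_rows = []
    status = register_submission(
        {"submission_id": "112", "title": "A Paper", "topic": "Databases",
         "keywords": "database", "upload_file": "paper.tar",
         "author_name": "A. Author", "author_email": "a@example.org"},
        inc, proc, db_rows.append)
print(status)  # OK
```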
Problems with the Submissions Process
These few steps exposed the first set of problems: the instructions were not sufficiently clear, and as a result authors inadvertently uploaded incomplete or corrupted submissions. The first problem occurred during transmission of the author's submission file to the server via FTP. One author could not connect to the FTP server at all because of an extremely slow connection between the United States and the author's local network in Europe; this author had to UUENCODE the submission (i.e., convert the binary file to ASCII) and submit it by E-mail. Several other authors (probably on a Macintosh or PC) transferred their files using FTP in ASCII mode, which translated carriage-return and line-feed characters and corrupted the archive. Some authors submitted a multifile HTML document created with LaTeX2HTML, which made it difficult for a reviewer to view or print the entire paper quickly. Others submitted multiple files with no archive, so it was not apparent which files belonged together. Furthermore, some submissions did not comply with the HTML 2.0 standard, so browsers rendered them differently. Still others contained links that worked on the author's Web server (under MS-Windows, for example) but not under the case-sensitive file system of the UNIX server hosting the EMS, where the file referenced by the inline image <IMG SRC="figure1.gif"> differs from the file referenced by <IMG SRC="FIGURE1.GIF">.
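The case-sensitivity problem in particular is mechanical enough to detect automatically. A sketch of a check that flags <IMG SRC=...> references matching a file in the unpacked submission only when case is ignored; the regular expression and the flat directory layout are assumptions, not the EMS's actual code.

```python
import re

def find_case_mismatches(html_text, files_in_dir):
    """Return (reference, actual_file) pairs where an IMG SRC matches a
    file only case-insensitively.

    On the case-sensitive UNIX server hosting the EMS these links break,
    even though they worked on the author's case-insensitive system.
    """
    refs = re.findall(r'<IMG\s+SRC="([^"]+)"', html_text, flags=re.IGNORECASE)
    lower_map = {f.lower(): f for f in files_in_dir}
    mismatches = []
    for ref in refs:
        if ref not in files_in_dir and ref.lower() in lower_map:
            mismatches.append((ref, lower_map[ref.lower()]))
    return mismatches

page = '<P>See <IMG SRC="FIGURE1.GIF"> and <IMG SRC="figure2.gif">.'
print(find_case_mismatches(page, ["figure1.gif", "figure2.gif"]))
# [('FIGURE1.GIF', 'figure1.gif')]
```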
The next problem occurred when submissions were uploaded correctly but left unprocessed because the author neglected to submit the final HTML form that acknowledges the upload, triggers the CGI script to enter the submission information into the database, and moves the submission into the processed area for reviewing to commence. Without this final step, the submission does not exist in the eyes of the reviewer. A better mechanism is needed that integrates safe, anonymous FTP with the WWW so that authors can select files on their local computer systems via a file dialog and have the WWW browser upload them to the server. Netscape appears to be addressing this problem in its new Netscape Navigator 2.0 browser with the added "file" input type for HTTP file upload, but this will not be supported by all browsers, and a solution that works across all browsers must still be found.
Some of these problems resulted from the conference committee not anticipating all the possible ways to upload a file, the multiple file formats, HTML layouts, and so on, while others resulted from authors not following the instructions. With an electronic layer between the reviewers, authors, and conference committee, it is not always possible to provide all the information necessary for every situation, especially when people make assumptions about what is expected of them and of others. A simple user interface and a clear set of instructions work best, with extenuating circumstances handled case by case through correspondence between the author and the conference committee. The author's interface to the EMS changed little between the Fourth and Fifth WWW Conferences, but the author's instructions and the underlying CGI scripts changed a great deal. The original instructions for WWW4 were about two printed pages long, while the revised instructions for WWW5 amount to five printed pages outlining each step, with troubleshooting information. With each iteration of the WWW Conference Series and the growth of the underlying WWW infrastructure (HTTP, HTML, servers, and browsers), these problems are being handled quickly as we learn to exploit this ever-changing medium.
Examination of Submissions
The administrator has the ability to preview the information entered about a submission, and the paper itself, before assigning it to reviewers. For example, submissions not meeting the acceptable guidelines may be deleted from the database and removed from further discussion, although this has yet to happen. The on-line papers are accessible from this level via a link to the top-level directory containing the unpacked papers, each of which is located in a subdirectory corresponding to its submission ID number.
Assignment of Submission Responsibilities to Reviewers
Once the first set of papers is submitted, the conference committee must decide which reviewers will evaluate each paper, matching papers to the reviewers best able to evaluate them. Several HTML forms are available for entering assignments, either by managing the submissions for a particular reviewer or by managing the reviewers for a given submission. Report tools generate various reports, such as a list of submissions that have not yet been assigned to enough reviewers, since every paper must be reviewed by at least two reviewers.
Once papers are assigned, reviewers have access to them and can enter grading information. The reviewer gets a list of papers, and when a paper is selected, an evaluation form is filled in with the information about that paper. The reviewer grades the paper according to several criteria (relevance, originality, correctness, and quality) on a scale from 1 to 5, where 1 is the poorest and 5 is the best score. In addition to these numeric scores, the reviewer must also include private comments about the paper in order to recall later, for example, why relevance for a paper was a "4" and not a "5". These comments are important in the refereeing process when deciding borderline papers that may or may not be accepted. There is also a comments field for suggesting modifications needed for the paper to be accepted; these are mailed to the author along with the scores. The committee may examine a single review or all reviews for a given paper to ensure that the scores of two reviewers do not differ by more than one point; if they do, the committee must confer with the reviewers to re-evaluate their decisions and bridge the gap between the scores.
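The consistency rule above, that two reviewers' scores for the same paper must not differ by more than one point, amounts to a simple check over the assignments data. A sketch, with the data shape (a mapping from paper ID to per-reviewer scores) assumed for illustration:

```python
from itertools import combinations

def divergent_reviews(scores_by_paper, max_gap=1):
    """Return (paper_id, reviewer_a, reviewer_b) triples whose scores for
    the same paper differ by more than `max_gap`, so the committee can ask
    those reviewers to re-evaluate and bridge the gap."""
    flagged = []
    for paper_id, reviews in scores_by_paper.items():
        for (ra, sa), (rb, sb) in combinations(sorted(reviews.items()), 2):
            if abs(sa - sb) > max_gap:
                flagged.append((paper_id, ra, rb))
    return flagged

scores = {"112": {"rev1": 4, "rev2": 2},   # gap of 2: must be reconciled
          "282": {"rev1": 5, "rev3": 4}}   # gap of 1: acceptable
print(divergent_reviews(scores))  # [('112', 'rev1', 'rev2')]
```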
The private comments and the comments on modifications to authors were initially stored in the Oracle database's assignments table, but problems surfaced owing to Oracle's limitations on string data types within SQL statements. A VARCHAR2 datatype (a variable character type) cannot exceed 2,000 characters, so reviews longer than 2,000 characters were truncated. The WWW5 EMS therefore stores the comments in external ASCII files outside the Oracle database, accessible from the EMS's forms interface as well as directly readable from the file system. The database stores only the file name of the comments, and a special flag within the form triggers the CGI program to read the contents of the file and insert them into the form's TEXTAREA field for editing.
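The workaround, storing only a file name in the database and reading the comment text from disk when the form is rendered, might be sketched as follows. The file-naming convention and directory layout are assumptions of this sketch, not the EMS's actual conventions.

```python
import os
import tempfile

def save_comments(comments_dir, submission_id, reviewer_id, text):
    """Write review comments to an external file and return the file name
    to store in the database, sidestepping the VARCHAR2 2,000-character
    limit."""
    fname = f"{submission_id}_{reviewer_id}.txt"
    with open(os.path.join(comments_dir, fname), "w") as f:
        f.write(text)
    return fname

def load_comments(comments_dir, fname):
    """Read stored comments back, e.g. to fill the form's TEXTAREA field
    for editing."""
    with open(os.path.join(comments_dir, fname)) as f:
        return f.read()

with tempfile.TemporaryDirectory() as d:
    long_review = "x" * 5000            # longer than VARCHAR2(2000) allows
    name = save_comments(d, "112", "rev1", long_review)
    assert load_comments(d, name) == long_review
print(name)  # 112_rev1.txt
```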
In addition to scoring each paper, there is a further step of ranking each paper within a given topic to help compare the papers and identify the best and poorest ones. For example, 18 papers were submitted to WWW4 under the Authoring Environments topic. Of these 18 papers, only four were accepted and 14 were rejected, so had they been ordered from 1 to 18, the top four could have been selected once the ranking was defined. Ranking the papers, however, is more difficult than grading them: it requires deciding not only whether a paper is good or bad but also how it compares with all other papers within the topic. For a topic such as Charging and Payment Protocols, with only three submissions, this may not be difficult, but for the Authoring Environments topic it would take too much time. Therefore, this recommended step in the peer review process is not always completed.
After careful deliberation of the reviewers' evaluations (grades and comments) and topic rankings, the committee decides whether to accept or reject each submission.
Bulletin Board Upload
This step provides instructions for the committee to generate a final report of the decisions to accept and reject papers based on the evaluations. The report is created from a predefined SQL statement, copied into the Bulletin Board area, and linked from the main Bulletin Board. From there all reviewers and committee members can examine this report to determine whether any last minute re-evaluation is needed for any particular submission.
Posting Notifications of Selections
This step provides instructions for the conference committee to gather the submission information and evaluations, compose debriefing letters of acceptance or rejection with the reasons for rejection or the modifications needed for a final revision of accepted papers, and electronically mail the letters to the designated authors' E-mail addresses. The authors of accepted papers are notified to submit final revisions of their papers, which are uploaded in the same manner as the original paper: through FTP, followed by submitting a form to ingest them.
Each process step has a deadline, and when the selections have been mailed out and the final papers uploaded to the conference server, the peer review process is officially over. The conference committee then queries the databases and uses the report tools to generate various text or graphics reports on the submissions and evaluations, analyzing what happened in order to improve the next conference. For example, some conference topics may have been chosen by few or no authors, which suggests either that the topic is not relevant to the community or that its wording was not clear. For WWW4, the topic Dealing with Imprecise, Uncertain or Inconsistent Data was not selected by any authors, so it was dropped from the WWW5 topics.
The heart of the user interface is a "multifunction" form that interfaces with a CGI program for accessing the database. Each multifunction form provides a single interface for performing multiple database operations on a particular database table or view. The layout of the output and the internals of the database table are embedded within the HTML form, so different forms may address different database tables with different columns, while a single underlying CGI executable serves all of the basic database operations: querying the database, updating existing records, and creating new records. The form is a template that the CGI script reads to format the output, filling in the corresponding fields of the form with values from the database.
The interface provides the following seven basic database operations (visible in the operation menu of Figure 2): And-Search, Or-Search, Key-Search, Create (insert), Update, Delete, and Clear Form.
Each form offers all or a subset of these seven operations. A sample database query form is illustrated in Figure 1, with the corresponding HTML shown in Figure 2. Hidden within the HTML form are fields specifying which database is used (Oracle in this case; Sybase is also supported), the form name, the primary table, and other information identifying the database table. The column names of the database table are inserted into the form with the NAME attribute using the syntax column_name.data_type.database_tablename[.optional_flags]. For example, the submission_id column name, which follows the "Paper ID #" text in the form, is a character type designated by the "c" data type from the submissions database table, and the flags "pm" indicate that it is a primary key (p) of the table and that its value must be filled in when modifying the database (m). This form is used as a template by the CGI program, which fills in the appropriate values depending on the HTML context. In the case of an INPUT element, the VALUE="" attribute is inserted into the form, and the value from the database replaces whatever text (if any) lies between the double quotes. For a SELECT element, such as the decision column at the bottom of the form, the SELECTED attribute is inserted into the OPTION element that matches the value in the database.
Figure 1: Sample query form for submissions
<HTML>
<HEAD>
<TITLE>WWW95: Manage Submissions Form</TITLE>
<BASE HREF="http://.../manage_submissions/">
</HEAD>
<BODY>
<H1>Manage Submissions Form</H1>
<H3>(<A HREF="Overview.html">Overview</A> /
<A HREF="Relevant_Data.html">Relevant Data</A> /
<A HREF="Report_Tools.html">Report Tools</A>)</H3>
<HR>
<FORM METHOD="POST" ACTION="/cgi-bin/dbtool.cgi">
<H2>Choose An Operation</H2>
<SELECT NAME="operation">
<OPTION VALUE="AND"> And-Search - After filling in column values
<OPTION VALUE="OR"> Or-Search - After filling in column values
<OPTION VALUE="KEY"> Key-Search - After filling in paper ID #
<OPTION VALUE="INSERT"> Create Submission - After filling in column values
<OPTION VALUE="UPDATE"> Update Submission - After changing appropriate values
<OPTION VALUE="DELETE"> Delete Submission - After filling in column values
<OPTION VALUE="CLEAR"> Clear Form - Clears all values
</SELECT>
<A HREF="Instructions.html#operation"><IMG SRC="/Images/hlp_button.gif" ALT="?"></A>
<HR>
<INPUT TYPE="submit" VALUE="Submit"> <INPUT TYPE="reset" VALUE="Reset">
<HR>
<H2>Fill In Key Values:</H2>
<INPUT TYPE="hidden" NAME="dbms" VALUE="oracle">
<INPUT TYPE="hidden" NAME="dbms_table_name" VALUE="submissions">
<INPUT TYPE="hidden" NAME="primary_table" VALUE="submissions">
<INPUT TYPE="hidden" NAME="form_name" VALUE="submissions_form.html">
Paper ID #: <INPUT name="submission_id.c.submissions.pm"><BR>
Submission date: <INPUT name="submission_date.d.submissions"><BR>
...
List of keywords (as on title page): <INPUT NAME="keywords.c.submissions"><BR>
<BR>
Type of Presentation:
<INPUT TYPE="radio" NAME="type.c.submission.r" VALUE="P"> Technical Paper
<INPUT TYPE="radio" NAME="type.c.submission.r" VALUE="R"> State-of-the-Art Report
...
Decision: <SELECT NAME="decision.c.submissions">
<OPTION VALUE="NULL">Not defined
<OPTION VALUE="1">Accept
<OPTION VALUE="2">Reject
<OPTION VALUE="3">Propose as poster
</SELECT>
<A HREF="Instructions.html#decision"><IMG SRC="/Images/hlp_button.gif" ALT="?"></A>
</FORM>
</BODY>
</HTML>
Figure 2: HTML corresponding to the form in Figure 1
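The NAME-attribute convention shown in Figure 2 (column_name.data_type.table[.flags]) is straightforward to decode. A sketch of the parsing the CGI program presumably performs; the field names and flag meanings come from the text, but the parser itself is an assumption:

```python
def parse_field_name(name):
    """Decode the form's NAME syntax: column.datatype.table[.flags],
    e.g. 'submission_id.c.submissions.pm' marks the submissions table's
    primary key (p), required when modifying the database (m)."""
    parts = name.split(".")
    column, dtype, table = parts[0], parts[1], parts[2]
    flags = parts[3] if len(parts) > 3 else ""
    return {
        "column": column,
        # 'c' (character) and 'd' (date) appear in Figure 2; other codes
        # are passed through unchanged.
        "datatype": {"c": "character", "d": "date"}.get(dtype, dtype),
        "table": table,
        "primary_key": "p" in flags,
        "required_on_modify": "m" in flags,
    }

info = parse_field_name("submission_id.c.submissions.pm")
print(info["primary_key"], info["required_on_modify"])  # True True
```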
Performing an And-search operation without selecting any values in the form will return all submissions (198 for WWW4 and 211 for WWW5). Entering the keyword as "database" for a search operation will match 14 submissions from the WWW4 EMS. The SQL statement generated for this operation is the following:
select unique submission_id, title from submissions where upper(keywords) like upper('%database%');
This SQL statement extracts the submission_id and title fields of all entries in the submissions table whose keywords field contains the string "database" (ignoring case). The output of this operation, formatted within an HTML form, is displayed in Figure 3 below.
Figure 3: Search list of database related submissions
Selecting submission number 112 from the list above and pressing "submit" will query the database for that entry and fill in the form with the values of that entry as shown in Figure 4.
Figure 4: Sample query form filled in for submission #112
Going back to the blank query form in Figure 1, entering "database" as the keyword and "accepted" (numeric value 1) as the overall decision results in a list of two matches (submission numbers 112 and 282). This operation builds the following SQL statement matching both conditions:
select unique submission_id, title from submissions where upper(keywords) like upper('%database%') and decision = '1';

More advanced searches can, for example, find any submissions with either VRML or JAVA in the list of keywords: select the Or-Search operation and enter the string "VRML, JAVA" into the keywords field. The comma delimits a list of keys or substrings that are added to the search parameters. Any combination of such searches can be entered.
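The Or-Search construction described above can be sketched as a small SQL builder. The function name is an assumption, and the naive string quoting is for illustration only; a real system must guard against SQL injection.

```python
def build_or_search(table, select_cols, field, value):
    """Build an Or-Search statement: commas in the entered value delimit
    substrings that are OR-ed together, as in 'VRML, JAVA'.

    Illustrative only -- values are interpolated without escaping."""
    keys = [k.strip() for k in value.split(",") if k.strip()]
    conds = " or ".join(
        f"upper({field}) like upper('%{k}%')" for k in keys
    )
    return f"select unique {', '.join(select_cols)} from {table} where {conds};"

sql = build_or_search("submissions", ["submission_id", "title"],
                      "keywords", "VRML, JAVA")
print(sql)
```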
Security is handled using basic user authentication (httpd, htaccess, htpasswd, htgroup): authorized users are assigned user names and passwords and must authenticate themselves before they can access protected forms or CGI scripts. Most users are designated as reviewers with limited access (update reviews, examine submissions, etc.), and some users are designated as administrators with access to the entire system and the ability to create new users. The .htgroup access file defines these two user groups: www-admin for administrators and www-users for normal users (i.e., reviewers). Forms that require administrator-only access have an .htaccess file specifying that only users in the www-admin group may access them. When a user is created, not only must the users database table be updated with the user's ID, full name, E-mail address, and telephone number, but corresponding entries for the user name and password must also be maintained in the .htpasswd and .htgroup files for server authentication. The database query tool knows nothing of the password files, so the Manage Users/Internal Staff form (the form used to create users) contains a hidden field specifying an external CGI script to run after the database operation executes; the same CGI input is passed to an update-user script that updates the server authentication files according to the selected database operation (insert, update, or delete).
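The update-user hook described above keeps the authentication files in step with the users table. A sketch of that mirroring logic; in the real EMS the targets are the server's .htpasswd and .htgroup files and the password hash would come from the server's own scheme (historically crypt()), whereas plain dictionaries and a precomputed hash keep this sketch self-contained.

```python
def update_auth_files(htpasswd, htgroup, operation, username,
                      pw_hash=None, group="www-users"):
    """Mirror a users-table operation (insert, update, or delete) into the
    htpasswd/htgroup structures, as the post-operation update-user script
    does for the server's authentication files."""
    if operation == "insert":
        htpasswd[username] = pw_hash
        htgroup.setdefault(group, set()).add(username)
    elif operation == "update" and pw_hash is not None:
        htpasswd[username] = pw_hash
    elif operation == "delete":
        htpasswd.pop(username, None)
        for members in htgroup.values():
            members.discard(username)
    return htpasswd, htgroup

passwd = {}
groups = {"www-admin": set(), "www-users": set()}
update_auth_files(passwd, groups, "insert", "rev1", "xyHashxy")
assert "rev1" in groups["www-users"]
update_auth_files(passwd, groups, "delete", "rev1")
print(passwd, groups["www-users"])
```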
As stated earlier, the report tools generate textual and graphical reports across the databases corresponding to the current level in the system's menu hierarchy. For example, the report tools at the Examination of Submissions menu provide reports on the submissions, and the report tools at the Users/Internal Staff Processes menu provide reports on the users of the system. Some examples of the output generated from the report tools are discussed in the Appendix. The report tools are divided into two classifications: standard report tools and ad hoc report tools.
Standard report tools are specialized tools that generate pre-defined reports, all of which are created from embedded SQL statements and meta-information (database tables, primary keys, formatting/layout instructions) sent as input to a database report tool to access the Oracle database and format the output in dynamically generated HTML documents.
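A standard report tool of this kind reduces to running an embedded SQL statement and wrapping the rows in HTML. A minimal sketch, with SQLite standing in for Oracle and the exact HTML layout assumed:

```python
import sqlite3

def run_report(conn, sql, title):
    """Run an embedded SQL statement and format the result as a simple
    HTML document, in the manner of the EMS's pre-defined reports."""
    cur = conn.execute(sql)
    headers = [d[0] for d in cur.description]
    lines = [f"<HTML><HEAD><TITLE>{title}</TITLE></HEAD><BODY>",
             f"<H1>{title}</H1>", "<TABLE>",
             "<TR>" + "".join(f"<TH>{h}</TH>" for h in headers) + "</TR>"]
    for row in cur.fetchall():
        lines.append("<TR>" + "".join(f"<TD>{v}</TD>" for v in row) + "</TR>")
    lines += ["</TABLE>", "</BODY></HTML>"]
    return "\n".join(lines)

# Usage: the all-users report mentioned earlier (user ID, full name, E-mail).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, full_name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('rev1', 'A Reviewer', 'rev1@example.org')")
report = run_report(conn, "SELECT user_id, full_name, email FROM users",
                    "All Users")
print("<TD>rev1</TD>" in report)  # True
```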
The ad hoc report tools provide general tools that allow users to make up their own reports and queries. There are several types of ad hoc reports:
Row Reports provide tools that allow users to query databases and produce one-row-at-a-time reports. From a given query, a single row or record of the database is displayed in the filled-in form as in the example in Figure 4.
Tabular Reports provide tools that allow users to query databases and produce reports in tabular form, either as an ASCII file or as an HTML 3.0 document with table definitions. One such tabular report lets a user enter a SQL statement to query the database directly.
Graphics Reports provide tools that allow users to query databases and produce reports in both graphical (e.g., bar plots, line plots, pie charts, surface plots) and tabular formats. The underlying software that generates the graphical plots is the Interactive Data Language (IDL), from which GIF images are dynamically created from the queried data. This process is described in more detail in an earlier paper that provides a graphical interface to scientific data on the Web.
We have presented a description of the peer review process of the IW3C2 and how an electronic management system brings all the relevant information (papers, evaluations, and reports) together for quick access by the conference committee and reviewers from remote locations. This system provides a working solution to the electronic peer review problem. The EMS provides a model for automating complex processes where many people need to create and manage large amounts of information. A WWW-based interface was introduced that provides access to information stored in an Oracle database, and the hierarchy of process steps in a hypertext menu structure breaks down a complex set of processes into a manageable step-by-step list of operations.
Thanks to Michael Shillinger for his insight on the peer review process in general and Tim Berners-Lee for his insight on the peer review process of the IW3C2 in particular. Many thanks to the authors who submitted papers into the system for the WWW4 and WWW5 conferences and especially to the reviewers and committee members for using the system and evaluating all the papers. Portions of the EMS software were made available to NASA as-is by courtesy of Advanced Applications Corporation (NAS5-38060), Grafikon Ltd. (NAS5-32507), and REI Systems (NAS5-31455).
The following figures were generated as graphical reports from the WWW4 EMS after the peer review process was officially completed on October 9, 1995, when the final versions of the papers were submitted.
Figure 5: Number of papers uploaded by submission dates
The submission deadline was July 17, 1995. However, many papers were submitted late; in fact, nearly 45% of the papers were late. As more papers are submitted to each WWW Conference, a stricter policy on late papers will be needed, with an absolute deadline and late submission permitted only for exempted papers. Further analysis of the data shows that a greater percentage of the early papers were accepted (46%) than of the late papers (18%), since the reviewers had more time to review them.
Figure 6: Number of papers per conference topic
The subjects covered by the papers were diverse, with a reasonable distribution, as shown in Figure 6 and in the following table:
 1. Authoring Environments: 18
 2. Charging and Payment Protocols: 3
 3. Commercial Use: 5
 4. Computer-Based Training and Teaching: 17
 5. Consistency, Integrity and Security: 6
 6. Design Techniques for Web Applications: 11
 7. Information Representation & Modeling: 9
 8. Integrating Object-Oriented or Relational Databases with W3: 12
 9. Intelligent Search and Data Mining: 6
10. Knowledge Representation in W3: 3
11. Modeling Web Dynamics: 2
12. New Applications: 13
13. New Experimental, Commercial & Educational Systems: 20
14. Other: 12
15. Protocol Evolution and Extensions: 10
16. Resource Discovery: 7
17. Software for W3 Applications: 8
18. Time, Event Management & Monitoring: 1
19. Tools and Browsers: 14
20. Tuning, Benchmarking & Performance: 3
21. User & Application Interfaces: 13
22. Virtual Reality in W3: 4
23. Not Specified: 1
Figure 7: Number of reviews assigned to each reviewer
There were 28 reviewers for almost 200 papers, with at least two reviewers for each paper, so many reviewers had to review more than 20 papers apiece.
Figure 8: Counts of overall paper suitability score for all reviews
The grading of the papers was fair, following the expected bell-shaped curve: many papers received an average score (3), and fewer received a poor (1) or outstanding (5) score.
Figure 9: Count of relevance score for all reviews
The relevance grade asked whether a paper works toward the goals of the conference, where 1 is unrelated, 2 is somewhat related, 3 is of moderate importance, 4 is very important, and 5 is a topic of immediate importance to the Web. An interesting aspect of this graph is that most of the submitted papers were rated very important and relevant to both the WWW and the goals of the conference.
Figure 10: Number of accepted and rejected papers
An abundance of papers was submitted to the conference, and many good papers were rejected because only a select few could actually be presented. Only 57 papers (29%) were accepted, while 141 were rejected. Many rejected papers were presented instead in the Poster sessions, and for other authors the committee recommended resubmission to the next conference with the suggested changes.
Barry E. Jacobs received the M.S. and Ph.D. degrees in mathematics from the Courant Institute of Mathematical Sciences and is currently a senior research computer scientist at the Goddard Space Flight Center. His current interests include generalizing and extending database research from the relational to the heterogeneous case (relational, hierarchical, network) using database logic as a framework.