Operations, Staff and Quality Control ...

Computer Facilities. Data are stored primarily in a SQL server data bank that has been optimized for longitudinal data collection in rheumatology. The dual-processor server has 1 GB of memory and over 140 GB of storage. The data bank is backed up to tape nightly, and the backup tapes are routinely stored off site. Three other servers complete the NDB network: a Web server, an email server, and a file server. Currently 27 workstations are attached to the NDB network, and Internet access is via a dedicated full T1 line.

X-ray Data Bank. All x-ray films received at the NDB are scanned into a dedicated workstation using a Lumisys LS-75 scanner and DI-2000 software. Radiographs are digitized and stored at a pixel size no smaller than 100 microns. Images are backed up to CD-ROM and tape.

Fax server. Six dedicated rotating telephone lines are used to access a dedicated fax server (Castelle version 6.2). The Castelle fax server is integrated with the Teleform software. The fax server is also integrated with Exchange Server and Microsoft Office, adding the flexibility to send faxes from within these applications.

NDB Staff

The medical and research director of the NDB is Dr. Frederick Wolfe. The executive director is Rebecca Schumacher. They are supported by a staff of programmers (2), statistical staff (1), and 15 others, who direct or work in the areas of forms design, data acquisition, quality control, and patient-physician contact.

Data acquisition. The primary method of data acquisition is through scanned forms processed in a dedicated Teleform network. Teleform software (www.cardiff.com) replaces manual data entry with the automated processing of forms and documents. Teleform is a scalable network-based system designed for high-volume data and document capture. The Teleform network components (Designers, Readers, Verifiers and Scan Stations) are the main components used to process NDB data.

The data collection system begins with a form that is filled out by a patient and/or physician at a distant site. This form has been prepared on a Teleform forms designer and is directly linked to fields within a SQL database. The forms completed by the patient or physician arrive at the National Data Bank (NDB) in one of two ways. They may be mailed to the NDB, in which case they are processed from the paper form, or they may be faxed to the NDB, in which case they arrive as an electronic image.

Forms that are received by mail are scanned by a high-speed forms processing scanner and converted by Teleform reader software to image files. Forms received by the NDB fax servers are transformed to appropriate Teleform-readable images. We process about 1 million pages of data each year. Regardless of their source, images that are processed or produced by the Teleform software are also stored as images in an image Data Bank. Therefore, a complete record of any form sent to the NDB is stored electronically. This allows for substantial space saving, as paper forms can be discarded. In addition, images are indexed and can be accessed rapidly through NDB-developed acquisition software.

Transformation of forms to images is followed by a data verification step using Teleform verifier software. Verification is performed by 5 members of the data processing staff on individual workstations. Verification is the rate-limiting step in the data acquisition process because each field of the patient/physician forms must pass review by both the verifying software and the NDB staff verifiers. At the time of forms development, NDB staff programmers create data and logic checks associated with each field. In addition, they can require the verifier software to stop at given fields and wait for human intervention. For example, crucial fields such as ID and date fields require human verification and approval. The verifier software also stops when check box results are ambiguous. At the end of the verification process, clean data are available. In addition, several sets of data problems are identified and flagged. These include form problems that will require direct patient contact to 1) obtain additional information, 2) correct discrepancies, or 3) supply missing data. In addition, the verifying process identifies patients for whom medical records from hospitals and physicians will be required. At the end of the verifying step, data are passed to the SQL Data Bank and to control files where flags have been set to identify problem records.
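
The verification logic described above can be illustrated with a minimal sketch. This is not the Teleform verifier or the NDB's actual rule set; the field names, the rules, and the problem labels are hypothetical, chosen only to show how per-field checks, mandatory human review of crucial fields, and problem flags might fit together.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Optional

# Hypothetical field names: crucial fields always wait for a human verifier.
ALWAYS_REVIEW = {"patient_id", "form_date"}


@dataclass
class FieldRule:
    name: str
    check: Callable[[Optional[str]], bool]  # data/logic check for this field
    problem: str                            # flag raised when the check fails


@dataclass
class VerificationResult:
    clean: dict = field(default_factory=dict)        # values that passed their check
    needs_human: list = field(default_factory=list)  # fields where the verifier stops
    flags: list = field(default_factory=list)        # (field, problem) pairs for follow-up


def is_iso_date(value):
    """True only for a real calendar date written as YYYY-MM-DD."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_int_in_range(value, low, high):
    return value is not None and value.isdigit() and low <= int(value) <= high


# Hypothetical rules attached to each field at form design time.
RULES = [
    FieldRule("patient_id", lambda v: bool(v) and v.isdigit(), "correct discrepancy"),
    FieldRule("form_date", is_iso_date, "correct discrepancy"),
    FieldRule("pain_score", lambda v: is_int_in_range(v, 0, 10), "missing or invalid data"),
]


def verify(form):
    """Apply every rule to one form and sort the outcome into three bins."""
    result = VerificationResult()
    for rule in RULES:
        value = form.get(rule.name)
        if rule.name in ALWAYS_REVIEW:
            result.needs_human.append(rule.name)
        if rule.check(value):
            result.clean[rule.name] = value
        else:
            result.flags.append((rule.name, rule.problem))
    return result


result = verify({"patient_id": "1042", "form_date": "2003-02-30", "pain_score": "7"})
print(result.flags)
```

In this sketch the impossible date is flagged for follow-up while the other two fields pass, and both crucial fields are queued for human approval regardless of whether their automated checks succeed.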

A second major activity of data processing at the NDB involves follow-up with patients, both to correct the deficiencies noted above and to validate patient replies. For example, all replies that indicate hospitalizations or malignancies require that medical records be obtained so that the diagnoses can be verified and DRGs obtained. These and other validation steps are rate limiting because they require responses from patients, physicians and hospitals. On average, about 4 months are required for these data to be complete and integrated into the SQL Data Bank. A staff of 4 NDB data assistants is assigned to this process on a full-time basis.

WebQuest: Direct entry of patient data on the Web. The NDB has developed software for direct entry of patient data via the Internet. Employing program quality control checks and branching logic, the NDB WebQuest automates high-quality multi-lingual, multi-questionnaire entry. Data are published to the NDB SQL servers. The WebQuest is often preferred by patients who have difficulty in writing because of arthritis.
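
As an illustration of how quality control checks and branching logic can work together in a Web questionnaire, the following is a minimal sketch. It does not reproduce the WebQuest software; the question identifiers, wording, and valid ranges are hypothetical.

```python
# Hypothetical questionnaire: each question carries a validity check and a
# function that picks the next question from the answer (branching logic).
QUESTIONS = {
    "hospitalized": {
        "text": "Were you hospitalized in the last 6 months?",
        "check": lambda a: a in {"yes", "no"},
        "next": lambda a: "nights" if a == "yes" else "pain",
    },
    "nights": {
        "text": "How many nights did you stay?",
        "check": lambda a: a.isdigit() and 1 <= int(a) <= 183,
        "next": lambda a: "pain",
    },
    "pain": {
        "text": "Rate your pain from 0 to 10.",
        "check": lambda a: a.isdigit() and 0 <= int(a) <= 10,
        "next": lambda a: None,  # end of questionnaire
    },
}


def run(answers, first="hospitalized"):
    """Walk the questionnaire; return (accepted answers, rejected question ids)."""
    accepted, rejected = {}, []
    qid = first
    while qid is not None:
        question = QUESTIONS[qid]
        answer = answers.get(qid)
        if answer is None or not question["check"](answer):
            rejected.append(qid)  # a live form would re-prompt here
            break
        accepted[qid] = answer
        qid = question["next"](answer)
    return accepted, rejected


print(run({"hospitalized": "no", "nights": "3", "pain": "4"}))
```

Because the patient answers "no" to the first question, the follow-up question about nights is never asked and its stray answer is ignored, which is the point of the branching.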


Quality control. Quality control is a central activity of the data processing unit, and is integrated into each step of the data processing.

Statistical conversion and reporting. The SQL Data Bank has been deliberately designed and organized to allow optimum data acquisition and retrieval. It is not, however, suitable for complex data analysis because of the SQL organizational structure. To perform data analyses, a series of conversion programs is employed. First, using StatTransfer (www.stattransfer.com), data are converted to Stata statistical package format (www.stata.com). Data are then decoded (e.g., drugs, adverse events), indexes are calculated (e.g., SF-36, HAQ), and other data manipulations are performed. Variable labels and value labels are attached. A certification script is then run to confirm that no errors are present. The data are then available for use. With transformation software (e.g., DBMSCOPY) the Data Bank can be converted to SAS, SPSS, or S-Plus formats without difficulty.
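
The decode, label, and certify steps can be sketched as follows. This is only an illustration written in Python rather than Stata; the code table, variable names, and the specific certification checks are hypothetical and do not represent the NDB's actual scripts. The one domain fact relied on is that the HAQ disability index ranges from 0 to 3.

```python
# Hypothetical code table and variable labels, for illustration only.
DRUG_CODES = {1: "methotrexate", 2: "prednisone"}
VARIABLE_LABELS = {
    "drug_code": "Drug (numeric code)",
    "drug": "Drug (decoded name)",
    "haq": "HAQ disability index",
}


def decode(rows):
    """Add a decoded drug name to each row; unknown codes decode to None."""
    return [{**row, "drug": DRUG_CODES.get(row["drug_code"])} for row in rows]


def certify(rows, labels):
    """Return a list of error messages; an empty list means the data pass."""
    errors = []
    for i, row in enumerate(rows):
        if row["drug"] is None:
            errors.append(f"row {i}: unknown drug code {row['drug_code']}")
        if not 0.0 <= row["haq"] <= 3.0:
            errors.append(f"row {i}: haq {row['haq']} outside 0-3")
        for name in row:
            if name not in labels:
                errors.append(f"row {i}: variable {name!r} has no label")
    return errors


rows = decode([
    {"drug_code": 1, "haq": 1.25},
    {"drug_code": 9, "haq": 3.5},  # deliberately bad record
])
for message in certify(rows, VARIABLE_LABELS):
    print(message)
```

Returning a list of messages rather than stopping at the first failure lets a single certification run report every problem in the data set.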

Other data resources

Serum and blood Data Bank. The NDB has laboratory facilities for the processing of specimens, and maintains two –80 degree freezers for the storage of blood and serum specimens. These specimens are linked to the SQL database for easy identification and retrieval.

X-Ray Data Bank. All radiographic images are scanned and retained as images. Data are available on CD-ROMs and tape, and are directly linked to the major SQL data bank.

Additional software and facilities. Barcode tags are used to monitor the life cycle of a questionnaire. Internet data submission from physician sites is also available.

We also have an add-in Teleform module for Designer PDF+Forms, which was designed with the form filler in mind. By providing paper-to-electronic hybridization, form fillers have the ability to:

Bar-coding. Bar-coding is also available, allowing the creation of bar-code labels that can be used for acquisition of patient IDs, mailing information, specimen and x-ray labeling, etc.

Data Management Center (DMC)

The activities of the DMC include:
The Data Management Center has primary responsibility in the following major areas:
Data are verified with powerful software and human intervention.