Extract

The extract module builds phenotype extraction requests from UKB field IDs and submits them to RAP. It bridges user-facing field lists and the underlying dx extract_dataset workflow.
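As a rough illustration of that bridge, the sketch below turns a list of column names into the argument list for a dx extract_dataset call. It is not the module's implementation: build_extract_command() is a hypothetical helper, the dataset location is a placeholder, and the "participant" entity and p<field> column names are assumptions about the dataset layout. Only the --fields and -o options of the dx CLI are used.

```python
import shlex


def build_extract_command(dataset, columns, output, entity="participant"):
    """Build the argv list for a `dx extract_dataset` call (hypothetical helper)."""
    if not columns:
        raise ValueError("at least one column is required")
    # dx expects a comma-separated list of <entity>.<column> names.
    fields = ",".join(f"{entity}.{col}" for col in columns)
    return ["dx", "extract_dataset", dataset, "--fields", fields, "-o", output]


cmd = build_extract_command(
    "project-xxxx:record-yyyy",  # placeholder dataset location, not a real ID
    ["eid", "p31", "p21022"],    # assumed RAP-style column names
    "pheno.csv",
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the command safe to pass to a process launcher without shell quoting issues.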

Scope

Function         Role
extract_ls()     List available fields in the RAP dataset
extract_pheno()  Extract selected phenotype fields directly
extract_batch()  Submit a batch extraction job for larger workflows

Workflow Role

Start with extract_ls() to confirm field availability, then use extract_pheno() for small interactive pulls or extract_batch() for long-running RAP jobs. Large cohort outputs should be handled as RAP project files and retrieved through job_*() helpers when the job finishes.

Review Focus

  • strict validation of field IDs and dataset names;
  • caching of field metadata by dataset location;
  • clear reporting of matched and unmatched fields;
  • stable output file naming for downstream job monitoring.
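Three of these points can be sketched in a few lines of Python: validating field IDs, reporting matched and unmatched fields, and naming output files stably. This is only an illustration under stated assumptions. All three function names are hypothetical. The column pattern p<field>_i<instance>_a<array> is an assumed naming convention for the dataset. The hash-based file name is one possible scheme, not necessarily the one the module uses.

```python
import hashlib
import re


def validate_field_ids(field_ids):
    """Return field IDs as ints; reject anything that is not a positive integer."""
    clean = []
    for fid in field_ids:
        # Strict check: digits only, no sign, no whitespace, no leading zero.
        if not re.fullmatch(r"[1-9]\d*", str(fid)):
            raise ValueError(f"invalid field ID: {fid!r}")
        clean.append(int(fid))
    return clean


def match_fields(field_ids, available_columns):
    """Split requested IDs into matched (ID -> columns) and unmatched IDs."""
    matched, unmatched = {}, []
    for fid in validate_field_ids(field_ids):
        # Assumed column naming: p<field>, optionally _i<instance> and _a<array>.
        pattern = re.compile(rf"p{fid}(_i\d+)?(_a\d+)?")
        cols = [c for c in available_columns if pattern.fullmatch(c)]
        if cols:
            matched[fid] = cols
        else:
            unmatched.append(fid)
    return matched, unmatched


def stable_output_name(dataset, field_ids, ext="csv"):
    """Derive a file name that depends only on the dataset and the set of IDs."""
    ids = sorted(set(validate_field_ids(field_ids)))
    key = dataset + "|" + ",".join(map(str, ids))
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:10]
    return f"extract_{digest}.{ext}"


columns = ["eid", "p31", "p53_i0", "p53_i1", "p20002_i0_a0", "p310"]
matched, unmatched = match_fields([31, "53", 99999], columns)
print("matched:", matched)
print("unmatched:", unmatched)
```

Because the name is derived from the sorted, de-duplicated IDs, resubmitting the same request in a different order yields the same file name, which makes it easier for job monitoring to find the output.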