parsing_scripts.platform package

Submodules

parsing_scripts.platform.adf_platform module

parsing_scripts.platform.adf_platform.main()

Parse an ADF file and extract PLATFORM information

Looks for accession number, platform name, platform type and platform description

PARAMETERS:

param1 (string): The original probe id field. If it is composed of more than one field, list all of them separated by a |, for example X|Y

param2 (bool): If True (or 1 or a non-empty string) the probe information (sequence) will be added
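
As a rough illustration of how these two parameters could be interpreted, here is a minimal sketch. The helper names split_fields and as_flag are hypothetical and are not part of adf_platform; the flag handling assumes plain Python truthiness, which is one literal reading of "True (or 1 or a non-empty string)".

```python
def split_fields(spec):
    """Split a probe id specification such as "X|Y" into its field names."""
    return [name.strip() for name in spec.split("|") if name.strip()]


def as_flag(value):
    """Interpret the flag with plain Python truthiness (an assumption).

    True, 1 and any non-empty string count as True. Under this reading the
    strings "False" and "0" are also True, because they are non-empty.
    """
    return bool(value)


print(split_fields("X|Y"))          # ['X', 'Y']
print(as_flag("yes"), as_flag(""))  # True False
```

If the script is called with the string "False" expecting the flag to be off, this reading would still switch it on, so passing an empty string or 0 is the safer way to disable it under this assumption.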

parsing_scripts.platform.cdf_platform module

parsing_scripts.platform.cdf_platform.main()

Parse a CDF file (Affymetrix) and extract PLATFORM information

Looks for probe set name and probe id. Note that a CDF file does not contain probe sequences; for that information, refer to cdf_platform_fasta.py

PARAMETERS:
None

parsing_scripts.platform.cdf_platform_fasta module

parsing_scripts.platform.cdf_platform_fasta.main()

Parse a FASTA file containing probe sequences

This script is usually used before cdf_platform.py in order to get the probe sequence information that a CDF file doesn’t provide.

PARAMETERS:
None
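
For readers unfamiliar with FASTA, the following sketch shows one way probe sequences could be collected from such a file. It is not the module's implementation: the function name read_fasta and the probe names in the example are made up, and real probe FASTA headers usually carry more fields than shown here.

```python
import io


def read_fasta(handle):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    sequences = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:]
            sequences[header] = ""
        elif header is not None:
            # Sequences may be wrapped over several lines; join them.
            sequences[header] += line
    return sequences


example = io.StringIO(">probe_1\nACGT\nTTGA\n>probe_2\nGGCC\n")
fasta_probes = read_fasta(example)
print(fasta_probes)  # {'probe_1': 'ACGTTTGA', 'probe_2': 'GGCC'}
```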

parsing_scripts.platform.csv_platform module

parsing_scripts.platform.csv_platform.main()

Parse a CSV file containing probe sequences

A CSV file containing probe information is parsed and probes get added to the platform. This script is usually used together with other PLATFORM scripts

PARAMETERS:

param1 (string): The probe id field

param2 (string): The probe sequence
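
A minimal sketch of the idea, assuming that both parameters name column headers in a comma-separated file with a header row. The function name read_probes and the column names ProbeID and Sequence are hypothetical.

```python
import csv
import io


def read_probes(handle, id_field, sequence_field):
    """Map each probe id to its sequence, using the two named columns."""
    reader = csv.DictReader(handle)
    return {row[id_field]: row[sequence_field] for row in reader}


example = io.StringIO("ProbeID,Sequence\np1,ACGT\np2,TTGA\n")
csv_probes = read_probes(example, "ProbeID", "Sequence")
print(csv_probes)  # {'p1': 'ACGT', 'p2': 'TTGA'}
```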

parsing_scripts.platform.gpr_platform module

parsing_scripts.platform.gpr_platform.main()

Parse a GPR file containing PLATFORM information and probe sequences

A GPR file is a TAB-delimited file with headers and complete platform information (descriptions and probe sequences)

PARAMETERS:

param1 (int): Number of lines to skip

param2 (string): The column header from which to parse the original probe id field. If the id is composed of more than one field, list all of them separated by a |, for example X|Y (the actual probe ids will be concatenated with dots . in that case)

param3 (string): The column header of the probe sequence you want to parse out

param4 (string): DEPRECATED - The column header from which to parse the DB ‘gene_map_content’ field; if there are multiple, separate them with a pipe | (the actual probe ids will be concatenated with dots . in that case)

param5 (string): The column header from which to parse the probe name field. If the name is composed of more than one field, list all of them separated by a |, for example X|Y (the actual probe ids will be concatenated with dots . in that case)

param6 (string): The column header to parse out probe set name field

param7 (bool): Ensure that the original probe id in SAMPLE_OBJECT will be unique (defaults to False)
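
The sketch below illustrates how param1, param2 and param3 could work together: skip a number of leading lines, read the TAB-delimited header, and join a composite probe id with dots. It is an assumption-laden illustration, not the module's code; the function name read_gpr_probes, the preamble lines and the column names in the example are all invented.

```python
import csv
import io


def read_gpr_probes(handle, skip_lines, id_spec, sequence_header):
    """Map dot-joined probe ids to sequences from a TAB-delimited file."""
    for _ in range(skip_lines):
        next(handle)  # discard preamble lines before the column headers
    reader = csv.DictReader(handle, delimiter="\t")
    id_headers = id_spec.split("|")  # "X|Y" -> ["X", "Y"]
    probes = {}
    for row in reader:
        probe_id = ".".join(row[header] for header in id_headers)
        probes[probe_id] = row[sequence_header]
    return probes


example = io.StringIO(
    "# preamble line 1\n"
    "# preamble line 2\n"
    "Block\tRow\tID\tSequence\n"
    "1\t1\tp1\tACGT\n"
    "1\t2\tp2\tTTGA\n"
)
gpr_probes = read_gpr_probes(example, 2, "Block|Row", "Sequence")
print(gpr_probes)  # {'1.1': 'ACGT', '1.2': 'TTGA'}
```

With a single header such as "ID" the same function returns the ids unchanged, since joining one value with dots leaves it as it is.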

parsing_scripts.platform.ndf_platform module

parsing_scripts.platform.ndf_platform.main()

Parse an NDF file containing probe sequences

An NDF file is an ArrayExpress file that contains probe sequences. It has a header with the X and Y position of the probe, the SEQUENCE field and a PROBE_ID field. The combination X.Y is stored as the probe id to ensure that it is unique

PARAMETERS:
param1 (int): Number of lines to skip
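
To make the X.Y convention concrete, here is a small sketch that builds probe ids from the X and Y columns. It assumes a TAB-delimited layout with the header names mentioned above; the function name read_ndf_probes and the example rows are hypothetical.

```python
import csv
import io


def read_ndf_probes(handle, skip_lines=0):
    """Map "X.Y" probe ids to sequences, after skipping leading lines."""
    for _ in range(skip_lines):
        next(handle)
    reader = csv.DictReader(handle, delimiter="\t")
    # The X and Y positions are joined with a dot to form a unique id.
    return {f"{row['X']}.{row['Y']}": row["SEQUENCE"] for row in reader}


example = io.StringIO(
    "X\tY\tSEQUENCE\tPROBE_ID\n"
    "12\t34\tACGT\tprobe_a\n"
    "12\t35\tTTGA\tprobe_b\n"
)
ndf_probes = read_ndf_probes(example)
print(ndf_probes)  # {'12.34': 'ACGT', '12.35': 'TTGA'}
```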

parsing_scripts.platform.soft_platform module

parsing_scripts.platform.soft_platform.main()

Parse a SOFT file and extract PLATFORM information

Looks for accession number, platform name, platform type and platform description. If True is passed as the parameter, it also looks for probe sequence information in the data table part of the file

PARAMETERS:
param1 (bool): Read the data table information (default False)
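
The following sketch shows how the flag could gate reading of the data table. It assumes the usual GEO SOFT layout, in which the platform entry starts with a ^PLATFORM line, attributes appear as !Platform_... = value lines, and the table sits between !platform_table_begin and !platform_table_end. The function name read_soft_platform, the accession GPL0000 and the example contents are invented for illustration.

```python
import io


def read_soft_platform(handle, read_table=False):
    """Collect platform attributes and, optionally, the data table rows."""
    info = {}
    table = []
    in_table = False
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("^PLATFORM") and "=" in line:
            info["accession"] = line.split("=", 1)[1].strip()
        elif line.lower() == "!platform_table_begin":
            in_table = True
        elif line.lower() == "!platform_table_end":
            in_table = False
        elif line.startswith("!Platform_") and "=" in line:
            key, value = line[1:].split("=", 1)
            info[key.strip()] = value.strip()
        elif in_table and read_table:
            # Table rows are only kept when the flag is set.
            table.append(line.split("\t"))
    return info, table


SOFT_TEXT = (
    "^PLATFORM = GPL0000\n"
    "!Platform_title = Example array\n"
    "!Platform_technology = in situ oligonucleotide\n"
    "!platform_table_begin\n"
    "ID\tSEQUENCE\n"
    "p1\tACGT\n"
    "!platform_table_end\n"
)
info, table = read_soft_platform(io.StringIO(SOFT_TEXT), read_table=True)
print(info["accession"], table)  # GPL0000 [['ID', 'SEQUENCE'], ['p1', 'ACGT']]
```

With the default read_table=False the same input yields the same attributes and an empty table.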

Module contents