parsing_scripts.platform package¶
Submodules¶
parsing_scripts.platform.adf_platform module¶
-
parsing_scripts.platform.adf_platform.main()¶ Parse an ADF file and extract PLATFORM information
Looks for accession number, platform name, platform type and platform description
- PARAMETERS:
param1 (string): The original probe id field. If it is composed of more than one field, list all of them separated by a pipe (|), for example X|Y
param2 (bool): If True (or 1, or a non-empty string), the probe information (sequence) will be added
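The two parameter conventions above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper names `split_fields` and `as_flag` are hypothetical, and it assumes the boolean parameter follows ordinary Python truthiness.

```python
def split_fields(spec):
    """Split a pipe-separated field specification such as "X|Y" into field names."""
    return [name.strip() for name in spec.split("|") if name.strip()]


def as_flag(value):
    """Treat True, 1, or any non-empty string as enabled (plain Python truthiness)."""
    return bool(value)


fields = split_fields("X|Y")   # two field names, in the order given
add_sequence = as_flag("yes")  # any non-empty string enables the option
```

Note that under this assumption a string such as "False" still counts as enabled, because it is non-empty; passing an empty string or omitting the parameter is the way to disable it.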
parsing_scripts.platform.cdf_platform module¶
-
parsing_scripts.platform.cdf_platform.main()¶ Parse a CDF file (Affymetrix) and extract PLATFORM information
Looks for probe set name and probe id. Note that a CDF file does not contain probe sequences; for that information, see cdf_platform_fasta.py
- PARAMETERS:
- None
parsing_scripts.platform.cdf_platform_fasta module¶
-
parsing_scripts.platform.cdf_platform_fasta.main()¶ Parse a FASTA file containing probe sequences
This script is usually used before cdf_platform.py in order to get the probe sequence information that a CDF file doesn’t provide.
- PARAMETERS:
- None
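For orientation, reading probe sequences from a FASTA file can be sketched as below. This is an illustrative reader only, assuming the standard FASTA layout (a header line starting with `>` followed by one or more sequence lines); the function name `read_fasta` and the sample headers are hypothetical and do not reflect how cdf_platform_fasta.py is implemented.

```python
import io


def read_fasta(handle):
    """Return a dict mapping each FASTA header (without '>') to its full sequence."""
    sequences = {}
    header = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:]
            sequences[header] = ""
        elif header is not None:
            sequences[header] += line  # sequences may span several lines
    return sequences


# Hypothetical sample data, for illustration only
FASTA_SAMPLE = ">probe:1\nACGT\nTTGA\n>probe:2\nGGCC\n"
fasta_probes = read_fasta(io.StringIO(FASTA_SAMPLE))
```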
parsing_scripts.platform.csv_platform module¶
-
parsing_scripts.platform.csv_platform.main()¶ Parse a CSV file containing probe sequences
A CSV file containing probe information is parsed and probes get added to the platform. This script is usually used together with other PLATFORM scripts
- PARAMETERS:
param1 (string): The probe id field
param2 (string): The probe sequence
parsing_scripts.platform.gpr_platform module¶
-
parsing_scripts.platform.gpr_platform.main()¶ Parse a GPR file containing PLATFORM information and probe sequences
A GPR file is a TAB-delimited file with headers and complete platform information (descriptions and probe sequences)
- PARAMETERS:
param1 (int): Number of lines to skip
param2 (string): The column header to parse out the original probe id field. If it is composed of more than one field, list all of them separated by a pipe (|), for example X|Y (actual probe ids will be concatenated with dots . in that case)
param3 (string): The column header of the probe sequence you want to parse out
param4 (string): DEPRECATED - The column header to parse out the DB ‘gene_map_content’ field; if there are multiple fields, separate them with a pipe | (actual probe ids will be concatenated with dots . in that case)
param5 (string): The column header to parse out the probe name field. If it is composed of more than one field, list all of them separated by a pipe (|), for example X|Y (actual probe ids will be concatenated with dots . in that case)
param6 (string): The column header to parse out probe set name field
param7 (bool): Ensure that the original probe id in SAMPLE_OBJECT is unique (defaults to False)
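How the first three parameters interact can be sketched as below: skip a number of preamble lines, read the TAB-delimited table by column header, and join multi-field probe ids with dots. This is a simplified sketch under stated assumptions, not the script's implementation; the function name `read_gpr_probes`, the column headers, and the sample rows are all hypothetical.

```python
import csv
import io


def read_gpr_probes(handle, skip_lines, id_spec, sequence_header):
    """Yield (probe_id, sequence) pairs from a TAB-delimited table with headers."""
    for _ in range(skip_lines):
        next(handle)  # discard preamble lines before the column header row
    id_headers = id_spec.split("|")
    for row in csv.DictReader(handle, delimiter="\t"):
        # Multi-field ids such as "Column|Row" are concatenated with dots
        probe_id = ".".join(row[header] for header in id_headers)
        yield probe_id, row[sequence_header]


# Hypothetical sample: one preamble line, then the header row and two probes
GPR_SAMPLE = (
    "preamble line to skip\n"
    "Block\tColumn\tRow\tSequence\n"
    "1\t2\t3\tACGT\n"
    "1\t2\t4\tTTGA\n"
)
gpr_probes = list(read_gpr_probes(io.StringIO(GPR_SAMPLE), 1, "Column|Row", "Sequence"))
```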
parsing_scripts.platform.ndf_platform module¶
-
parsing_scripts.platform.ndf_platform.main()¶ Parse an NDF file containing probe sequences
An NDF file is an ArrayExpress file that contains probe sequences. It has a header row with the X and Y position of the probe, the SEQUENCE field and a PROBE_ID field. The combination X.Y is stored as the probe id to ensure that it is unique
- PARAMETERS:
- param1 (int): Number of lines to skip
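The X.Y naming scheme can be illustrated with a short sketch. It assumes a TAB-delimited table whose header row contains the X, Y and SEQUENCE columns described above; the delimiter, the function name `read_ndf_probes`, and the sample rows are assumptions made for illustration.

```python
import csv
import io


def read_ndf_probes(handle, skip_lines=0):
    """Map "X.Y" position ids to probe sequences, rejecting duplicate positions."""
    for _ in range(skip_lines):
        next(handle)  # skip lines that precede the header row
    probes = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        probe_id = "{}.{}".format(row["X"], row["Y"])
        if probe_id in probes:
            raise ValueError("duplicate position " + probe_id)
        probes[probe_id] = row["SEQUENCE"]
    return probes


# Hypothetical sample: PROBE_ID repeats, but each X.Y position is distinct
NDF_SAMPLE = (
    "PROBE_ID\tX\tY\tSEQUENCE\n"
    "CTRL\t1\t1\tACGT\n"
    "CTRL\t1\t2\tTTGA\n"
)
ndf_probes = read_ndf_probes(io.StringIO(NDF_SAMPLE))
```

In the sample, PROBE_ID alone would collide, while the position-based id keeps both probes apart.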
parsing_scripts.platform.soft_platform module¶
-
parsing_scripts.platform.soft_platform.main()¶ Parse a SOFT file and extract PLATFORM information
Looks for accession number, platform name, platform type and platform description. If True is passed as the parameter, it also looks for probe sequence information in the data table section of the file
- PARAMETERS:
- param1 (bool): Read the data table information (default False)
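The effect of the parameter can be sketched with a small reader for the GEO SOFT layout, in which the platform entity starts with a `^PLATFORM = ...` line, attributes appear as `!Platform_<name> = <value>` lines, and the data table sits between `!platform_table_begin` and `!platform_table_end`. This is an illustrative sketch, not the script's code: the function name `read_soft_platform`, the accession, and the sample values are hypothetical.

```python
import io


def read_soft_platform(handle, read_table=False):
    """Return (metadata dict, table rows); rows are collected only if read_table is set."""
    meta = {}
    table = []
    in_table = False
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("^PLATFORM"):
            meta["accession"] = line.split("=", 1)[1].strip()
        elif line.lower() == "!platform_table_begin":
            in_table = True
        elif line.lower() == "!platform_table_end":
            in_table = False
        elif line.startswith("!Platform_") and "=" in line:
            key, value = line[1:].split("=", 1)
            meta.setdefault(key.strip(), value.strip())  # keep the first occurrence
        elif in_table and read_table:
            table.append(line.split("\t"))
    return meta, table


# Hypothetical sample data, for illustration only
SOFT_SAMPLE = (
    "^PLATFORM = GPL0000\n"
    "!Platform_title = Example array\n"
    "!Platform_technology = spotted oligonucleotide\n"
    "!Platform_description = Hypothetical platform\n"
    "!platform_table_begin\n"
    "ID\tSEQUENCE\n"
    "p1\tACGT\n"
    "!platform_table_end\n"
)
soft_meta, soft_table = read_soft_platform(io.StringIO(SOFT_SAMPLE), read_table=True)
```

With `read_table` left at its default of False, the same call returns the metadata and an empty table.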