Wednesday, August 20, 2014

/share/apps/perl/perl-5.18.1/bin/perl /share/apps/NGS_pipeline_dev/src/mapReads_Debug.pl -i 000851 -a /Bioinformatics/Users/zfu/RNAseq_Analysis_Configs/000851_D10_K27Ac_Validation_Details/analysis.config -m /Bioinformatics/Users/zfu/RNAseq_Analysis_Configs/000851_D10_K27Ac_Validation_Details/InputMetaData.csv -s /share/apps/NGS_pipeline_dev/configs/system.config

Tuesday, August 19, 2014

PMID Classifier

1.      Open PuTTY on the desktop
2.      The host and port have been saved but they are:
a.       Host: classifier.internal.iedb.org
b.      Port: 30022
c.       Connection Type: ssh
3.      After entering 2a – 2c, click “Open.”
4.      You will be prompted for a login (rohini) and a password (classify!123).  Hit “enter” after typing each of them.  The username characters are shown as you type, but the password stays invisible.
5.      Change directory using cd /srv/www/classifier_tool and hit “enter”
6.      Open the browser and type the address as http://classifier.internal.iedb.org:8080/classifier_tool/ and hit “enter”
a.       It will tell you it cannot find the server but it needs to be open for (7) to work
7.      To run the tool, type: python manage.py runserver classifier.internal.iedb.org:8080 into the PuTTY screen and hit “enter.”
8.      You will need to select “Try Again” on the browser page to connect to the opened site.
9.      Use the options given on the opened site to run the tool.
10.  Once the classifier is run, a link will appear from which the results can be downloaded. (Emily knows all these things).
a.       Click on the link and open the zip file.
b.      Drag the files to the query folders. 
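
Steps 6 to 8 depend on whether the server is already listening: the browser cannot find the site until runserver has been started in step 7. A minimal sketch of a check that could be run from the desktop to see whether a port answers before hitting “Try Again”. The helper name is_port_open is made up for this note; the host and ports are the ones listed above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

For example, is_port_open("classifier.internal.iedb.org", 30022) checks the ssh port and is_port_open("classifier.internal.iedb.org", 8080) checks the web tool. A False on 8080 before step 7 is expected.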

PDB Classifier
The PDB query is run on Rohini’s computer.

1.      The query and other files are on Rohini’s computer under Places → Home Folder → bcell_textclassifier → iedb_files; there is also a shortcut to iedb_files on Rohini’s desktop.
a.       Open the iedb_files folder.
b.      Do not touch any of the lone text files (“ann results”). 
c.       The folder called “Initial catch up run for classifier” has the 7/31/12 files, which were the set of files generated after the PDB classifier was run for the first time.
2.      Open the Terminal, which is located on the taskbar.
a.       Type cd bcell_textclassifier/ [enter]
b.      Type ./runClassifier.sh [enter]
        i.      Once you press [enter], the script output will scroll in the Terminal.
        ii.     When the query is finished, the Terminal output says “see folder iedb_files for results.”  The next line says “rohini@...”.
        iii.    When the script is finished running you can close the Terminal.
3.      When the script has finished running, go to the iedb_files folder
a.       If there are new PMIDs, the date at the end of ann_results_pmid_2012-7-31 will be replaced with the day you ran the query, and you will also see a zip file dated with that day.  Here, “2012-7-31” is the day the query was last run; it would be replaced with 2012-8-6 if there were new PMIDs on 8/6/2012.
b.      The output will be in a zip file in the iedb_files folder, called “ann_results_pmid_2012-7-31.zip” (or whatever the run date was).
c.       Take the zip file and put it on the desktop.  Open the files from the desktop and send them to your e-mail.
d.      Check that the files you sent to yourself are correct, then delete the zip file.
4.      If new PMIDs were not found, the date at the end of the files will not be updated.
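
The dates in the result names are not zero-padded (2012-7-31, 2012-8-6), so sorting the file names alphabetically does not give date order. A rough sketch of picking the most recent zip by parsing the date instead. It assumes the names always follow the ann_results_pmid_YYYY-M-D.zip pattern described above; the function names are made up for this note.

```python
import re
from datetime import date
from pathlib import Path
from typing import Iterable, Optional

# Pattern taken from the notes, e.g. ann_results_pmid_2012-7-31.zip
RESULT_RE = re.compile(r"^ann_results_pmid_(\d{4})-(\d{1,2})-(\d{1,2})\.zip$")


def result_date(name: str) -> Optional[date]:
    """Parse the run date out of a result file name, or return None."""
    match = RESULT_RE.match(name)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)


def newest_result(names: Iterable[str]) -> Optional[str]:
    """Return the name carrying the latest run date, or None if none match."""
    dated = [(result_date(name), name) for name in names]
    dated = [(d, name) for d, name in dated if d is not None]
    return max(dated)[1] if dated else None


def newest_result_in(folder: str) -> Optional[str]:
    """Same as newest_result, applied to the files of a folder such as iedb_files."""
    return newest_result(path.name for path in Path(folder).iterdir())
```

If the newest date is still the previous run date after running the script, no new PMIDs were found (step 4).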


Friday, August 15, 2014

7.2 Export Wiggle Files
MEDIPS allows to export genome wide coverage profiles as wiggle files for visualization in common genome browsers.
> MEDIPS.exportWIG(Set = hESCs_MeDIP[[1]], file = "hESC.MeDIP.rep1.wig",
+ format = "rpkm", descr = "")
• Set: a MEDIPS or COUPLING SET. In case of a COUPLING SET, the format parameter must be set to pdensity because in this case a sequence pattern (e.g. CpG) density profile will be exported.
• file: the output file name
• format: can be either count or rpkm for a MEDIPS SET or pdensity for a COUPLING SET.
• descr: a track description for the wiggle file
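
Note to self on what format = "rpkm" means compared with count: the usual RPKM definition scales a window's read count by the window length in kilobases and by the total read count in millions. A small sketch of that standard formula; whether MEDIPS uses exactly these inputs internally is an assumption, not something taken from the excerpt above.

```python
def rpkm(read_count: int, window_bp: int, total_reads: int) -> float:
    """Reads per kilobase per million reads for one genomic window.

    Standard definition: count * 1e9 / (window length in bp * total reads).
    """
    if window_bp <= 0 or total_reads <= 0:
        raise ValueError("window_bp and total_reads must be positive")
    return read_count * 1e9 / (window_bp * total_reads)
```

With this definition, tracks from samples with different sequencing depth are on a comparable scale, which raw count tracks are not.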


Thursday, August 14, 2014

Ideas for HLA typing

1. Mapping to Genomic DNA
2. using exact match instead of soft clips
3. change trimming methods
4. add QC step
5. homozygous parameter in exon

Todo:

1. try another trimming method
2. try the homozygous parameter for each exon
3. investigate the reads that can map to both alleles

Wednesday, August 13, 2014

2014-08-13

1. commit change to branch
2. deploy to the /share/apps/NGS_pipeline_dev
3. run mapReads_Debug.pl with system.config


2014-08-13

Use "-s /share/apps/NGS_pipeline_dev/configs/system.config" when debugging pipelines
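
To avoid forgetting the -s option, the debug command could be assembled by a small helper. This is only a sketch: the perl and script paths and the -i/-a/-m/-s flags are the ones from the command in these notes, while the function name and its defaults are made up here.

```python
import shlex
from typing import List

PERL = "/share/apps/perl/perl-5.18.1/bin/perl"
DEV_ROOT = "/share/apps/NGS_pipeline_dev"
DEV_SYSTEM_CONFIG = DEV_ROOT + "/configs/system.config"


def debug_map_command(run_id: str, analysis_config: str, metadata_csv: str,
                      system_config: str = DEV_SYSTEM_CONFIG) -> List[str]:
    """Build the argument list for mapReads_Debug.pl, always passing -s."""
    return [
        PERL, DEV_ROOT + "/src/mapReads_Debug.pl",
        "-i", run_id,
        "-a", analysis_config,
        "-m", metadata_csv,
        "-s", system_config,
    ]


def as_shell_line(command: List[str]) -> str:
    """Render the argument list as a single copy-pasteable shell line."""
    return " ".join(shlex.quote(part) for part in command)
```

The list form can be handed to subprocess.run directly; as_shell_line is only for pasting the command into a terminal or into these notes.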


2014-08-13

Mapping Pipeline Error:

INFO: 2014/08/13 07:54:17 Mapping.pm (3112): Report location: /Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/report.html
INFO: 2014/08/13 07:54:17 Mapping.pm (3137): Searching /Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13 for .bw files
INFO: 2014/08/13 07:54:17 Mapping.pm (3158): Folder: /Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4.bam / Bigwig file:/Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4.bam/accepted_hits.sorted.bw
INFO: 2014/08/13 07:54:17 Mapping.pm (3171): Folder: /Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4_unique.bam / Bigwig file:/Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4_unique.bam/accepted_hits.sorted.bw
WARN: 2014/08/13 07:54:17 Pipeline.pm (1017): Creating new parameter 'ALL_TRACKS_URL' and setting its value to 'http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&position=chr19&hgt.customText=https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/allTracks.txt'
WARN: 2014/08/13 07:54:17 Pipeline.pm (1017): Creating new parameter 'ALL_TRACKS_UNIQUE_URL' and setting its value to 'http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&position=chr19&hgt.customText=https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/allTracksUnique.txt'
Use of uninitialized value $bowtie_dir in concatenation (.) or string at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Pipeline/Mapping.pm line 3323.
INFO: 2014/08/13 07:54:17 Mapping.pm (3323): premapping_dir:001_premapping_filter tophat:002_tophat bowtie:  low_complexity:003_low_complexity_filter bigwig:005_bigwig bamMetrics:004_bam_metrics HTSeq:006_HTSeq calc_rpkm:007_calc_rpkm
Use of uninitialized value $bowtie_dir in concatenation (.) or string at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Pipeline/Mapping.pm line 3328.
INFO: 2014/08/13 07:54:17 Mapping.pm (3360): Error log was found /Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/mapHiSeq2500Reads.err
INFO: 2014/08/13 07:54:18 Mapping.pm (3373): Dust Filtered Reads: 582601
readline() on closed filehandle $INF at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Utils.pm line 36.
Use of uninitialized value $value in scalar chomp at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Utils.pm line 39.
Use of uninitialized value in addition (+) at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Pipeline/Mapping.pm line 3402.
INFO: 2014/08/13 07:54:18 Mapping.pm (3413): Adaptor Reads: 0
INFO: 2014/08/13 07:54:18 Mapping.pm (3416): Good Illumina Reads: 49979864
INFO: 2014/08/13 07:54:18 Mapping.pm (3424): 00002_13: JrWen_RNAseq_BC_13
INFO: 2014/08/13 07:54:18 Mapping.pm (3425): Mapped reads: 33334388
INFO: 2014/08/13 07:54:18 Mapping.pm (3426): Uniquely Mapped reads: 30441021
INFO: 2014/08/13 07:54:18 Mapping.pm (3477): Expected 51330230 Total Reads the sum of the parts is 51330230
INFO: 2014/08/13 07:54:18 Mapping.pm (3478): Expected 100% the sum of the percentages is 100.000
Can't locate object method "get_param" via package "https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4.bam/accepted_hits.sorted.bw" (perhaps you forgot to load "https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/Mapping/000854_JrWen_BC13_Details_TEST_WU/JrWen_RNAseq_BC_13/005_bigwig/accepted_hits_filtered_dust_4.bam/accepted_hits.sorted.bw"?) at /Bioinformatics/apps/NGS_pipeline_dev/src/NGSPipeline/Pipeline/Mapping.pm line 4034.
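
In the output above, the lines that matter for debugging are the ones without an INFO:/WARN: prefix: the Perl warnings (uninitialized $bowtie_dir, readline() on a closed filehandle) and the final "Can't locate object method" failure, where get_param is being called on a URL string instead of an object. A rough sketch for pulling those lines out of a saved log. The prefix pattern is inferred from the lines above, and the levels other than INFO and WARN are assumptions.

```python
import re
from typing import Iterable, Iterator, Tuple

# Logger lines look like: "INFO: 2014/08/13 07:54:17 Mapping.pm (3112): ..."
# ERROR, DEBUG and FATAL are assumed levels; only INFO and WARN appear in the notes.
LOG_PREFIX_RE = re.compile(
    r"^(INFO|WARN|ERROR|DEBUG|FATAL): \d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} "
)


def unprefixed_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for non-empty lines lacking a logger prefix."""
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.strip() and not LOG_PREFIX_RE.match(text):
            yield number, text
```

Running this over the captured output lists only the raw Perl messages, each of which names a file and line in Mapping.pm or Utils.pm to look at.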

Monday, August 4, 2014

Issues with the new NGS pipeline

1. 0% mappability
2. allTrack files don't work
3. some wrong adaptor URLs in report.html
4. wrong URL in UCSC browser link


Friday, August 1, 2014

Results folder is here:
Y:\NGS_analyses\automated\RNA-Seq\Mapping\000775_JrWen_BC13_Details


Running folder is here, where you can find the .config file + MetaData:
Y:\Groups\core\hiseq_raw_data\140529_D00361_0050_AH9A9MADXX_5_29_14_JiKa_RRBS_JrWen_BC13

Command:
/share/apps/perl/perl-5.18.1/bin/perl /share/apps/NGS_pipeline/src/mapReads.pl -a /Bioinformatics/NGS_analyses/ad_hoc/Groups/core/hiseq_raw_data/140529_D00361_0050_AH9A9MADXX_5_29_14_JiKa_RRBS_JrWen_BC13/JrWen_BC13.config -m /Bioinformatics/NGS_analyses/ad_hoc/Groups/core/hiseq_raw_data/140529_D00361_0050_AH9A9MADXX_5_29_14_JiKa_RRBS_JrWen_BC13/InputMetaData.csv


/share/apps/perl/perl-5.18.1/bin/perl /Bioinformatics/Users/zfu/Source_Code/NGS_Pipeline/RNAseq/20140801/src/mapReads.pl -a /Bioinformatics/Users/zfu/Source_Code/NGS_Pipeline/RNAseq/20140801/input/JrWen_BC13.config -m /Bioinformatics/Users/zfu/Source_Code/NGS_Pipeline/RNAseq/20140801/input/InputMetaData.csv


"https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/QC/000802_7_22_14_GrSe03_NXT02_VerA/SCT6_TU22_ThSTAR_BC_N502_N701_well1_CTCTCTAC-CTCTCTAT_L001_R1_001_fastqc/fastqc_report.html"

"../../QC/000802_7_22_14_GrSe03_NXT02_VerA/SCT6_TU22_ThSTAR_BC_N502_N701_well1_CTCTCTAC-CTCTCTAT_L001_R1_001_fastqc/fastqc_report.html"

"https://informaticsdata.liai.org/NGS_analyses/automated/RNA-Seq/Mapping/000803_GrSe03_NextFXP02_VerA_Details/SCT6_TU22_ThSTAR_BC_N502_N701_well1/001_premapping_filter/premapping_counts.txt"

"SCT6_TU22_ThSTAR_BC_N502_N701_well1/001_premapping_filter/premapping_counts.txt"

'ANALYSIS_ARCHIVE_QC' and setting its value to '/Bioinformatics/NGS_analyses/automated/RNA-Seq/QC'

'/Bioinformatics/NGS_analyses/automated/RNA-Seq/QC' -> '../../QC'

MASTER_RESULTS_DIR_MAPPING=/Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000803_GrSe03_NextFXP02_VerA_Details

'/Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/000803_GrSe03_NextFXP02_VerA_Details' + "/" -> ''
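
Both rewrites above (the QC archive to '../../QC', and the results directory plus "/" to '') are what you get by taking the target path relative to the directory that holds report.html. A sketch of that idea, assuming report.html sits directly in MASTER_RESULTS_DIR_MAPPING; the function name is made up for this note, and the pipeline itself is Perl, so this only illustrates the path arithmetic.

```python
import posixpath


def relative_link(target_path: str, report_dir: str) -> str:
    """Return target_path expressed relative to report_dir (POSIX paths)."""
    return posixpath.relpath(target_path, start=report_dir)


# Values copied from the notes above.
MASTER_RESULTS_DIR_MAPPING = (
    "/Bioinformatics/NGS_analyses/automated/RNA-Seq/Mapping/"
    "000803_GrSe03_NextFXP02_VerA_Details"
)
ANALYSIS_ARCHIVE_QC = "/Bioinformatics/NGS_analyses/automated/RNA-Seq/QC"
```

For example, relative_link(ANALYSIS_ARCHIVE_QC, MASTER_RESULTS_DIR_MAPPING) gives '../../QC', and a file below the results directory comes back with the directory prefix stripped. Relative links also keep working regardless of which host name serves the report.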