Command Line Tools for Genomic Data Science Quizzes & Answers – Coursera

Welcome to our comprehensive guide on Command-Line Tools for Genomic Data Science, a crucial skill set for today’s bioinformatics professionals.

This post covers the fundamental concepts and practical applications of command-line tools, with resources to strengthen your genomic data analysis skills. The quizzes below let you test your knowledge and reinforce your learning.

Module 1 Quiz

Q1. Which of the following Unix commands can be used to view the content of a file?

  • more
  • gzip
  • ls
  • cd

Q2. Which of the following commands can be used to compress the content of a file?

  • rmdir
  • cd
  • historia ya harufu ya maua
  • gzip

Q3. The file “months” lists each of the 12 months on a separate line, and no further lines. What would be the result if the following command was run:

'cat months | head -1000 | wc -l'

  • year
  • 50
  • 12
  • months
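
A question like Q3 can be checked by rebuilding the setup in a scratch directory. The sketch below assumes a hypothetical 12-line file named `months` created with `printf`; the trailing `tr -d ' '` is only there because some `wc` versions pad their output with spaces.

```shell
# Build a scratch copy of the "months" file: 12 lines, one month per line.
workdir=$(mktemp -d)
printf '%s\n' January February March April May June \
    July August September October November December > "$workdir/months"

# head -1000 keeps at most the first 1000 lines. The file has only 12,
# so every line reaches wc -l.
line_count=$(cat "$workdir/months" | head -1000 | wc -l | tr -d ' ')
echo "$line_count"   # prints 12

rm -r "$workdir"
```

Asking `head` for more lines than the file holds is not an error; it simply passes the whole file through.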

Q4. What is the effect of using the pipe operator “|” in a sequence of commands:

  • Re-direct standard error only
  • Re-direct the standard input or standard output of a command
  • Act as a character separator between different shell commands, without any effects on the outcome
  • Replace the ‘;’ sequencing operator in a complex command
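
To see what a pipe actually connects, the following minimal sketch uses a hypothetical helper function `emit` that writes one line to standard output and one to standard error, then counts what arrives on the other side of the pipe.

```shell
# Hypothetical helper: one line on standard output, one on standard error.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# The pipe feeds only emit's standard output into wc's standard input.
# Standard error is sent to /dev/null so it does not clutter the terminal.
piped_lines=$(emit 2>/dev/null | wc -l | tr -d ' ')

# Merging standard error into standard output first (2>&1) sends both lines
# through the pipe.
merged_lines=$(emit 2>&1 | wc -l | tr -d ' ')

echo "$piped_lines $merged_lines"   # prints 1 2
```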

Q5. If typing 'pwd' produces “/home/userA/Coursera/L1/”, which of the following commands will list the file content of the current directory?

  • listdir .
  • more *.txt
  • mkdir L1
  • ls /home/userA/Coursera/L1

Q6. Suppose your current working directory is “/home/Coursera/L1/”, and “peach”, “apple”, and “pear” are subdirectories, each containing a single file named “genome”. What would be the current directory, as reported by running the 'pwd' command, after each of the four commands in the sequence below:

  • cd apple
  • cd *
  • cd ../..
  • mv apple plum

  • /home/Coursera/L1/apple, /home/Coursera/L1/apple, /home/Coursera, /home/Coursera
  • L1, Coursera, apple, plum
  • plum, apple, pear, strawberry
  • /home/Coursera/L1, /home/Coursera/L1/apple, L1/apple, /home/Coursera/L1
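
The directory walk in Q6 can be replayed safely in a scratch area. This is a sketch under assumptions: the layout is recreated under a temporary directory (shown as `<root>` in place of “/home/Coursera”), and the second command is read as `cd *`, which expands to the single file “genome” inside “apple”.

```shell
# Recreate the layout in a scratch directory (hypothetical stand-in paths).
root=$(mktemp -d)
mkdir -p "$root/L1/apple" "$root/L1/peach" "$root/L1/pear"
touch "$root/L1/apple/genome" "$root/L1/peach/genome" "$root/L1/pear/genome"
start_dir=$PWD
cd "$root/L1"

cd apple                              # apple is a directory, so this succeeds
step1="<root>${PWD#"$root"}"
cd * 2>/dev/null || true              # * expands to the file "genome": cd fails
step2="<root>${PWD#"$root"}"
cd ../..                              # two levels up from L1/apple
step3="<root>${PWD#"$root"}"
mv apple plum 2>/dev/null || true     # mv never changes the working directory
step4="<root>${PWD#"$root"}"

printf '%s\n' "$step1" "$step2" "$step3" "$step4"

cd "$start_dir"
rm -r "$root"
```

A failed `cd` leaves the working directory where it was, and `mv` renames files without moving the shell anywhere.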

Q7. Consider the file “seasons” with the following columns separated by space:

“cut -d ' ' -f1,3 seasons | sort -u | wc -l” and “cut -f1 seasons | sort | uniq -c | wc -l”?

  • 4, 6
  • 12, 20
  • 5, 10
  • 12, 12
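
The two pipelines in Q7 differ in one detail that is easy to miss: the second `cut` has no `-d` option, so it splits on tabs. The sketch below uses a hypothetical six-line, space-separated file (its contents are an illustrative assumption, not the quiz's own data) to show the effect.

```shell
demo=$(mktemp -d)

# Hypothetical space-separated file: season, month, temperature.
cat > "$demo/seasons" <<'EOF'
winter Dec cold
winter Jan cold
spring Mar mild
spring Apr mild
summer Jun hot
summer Jul hot
EOF

# Columns 1 and 3 split on spaces, then de-duplicated: 3 distinct pairs.
pairs=$(cut -d ' ' -f1,3 "$demo/seasons" | sort -u | wc -l | tr -d ' ')

# No -d option, so cut splits on tabs. These lines contain no tab, so cut
# passes each whole line through, and all 6 lines stay distinct.
whole_lines=$(cut -f1 "$demo/seasons" | sort | uniq -c | wc -l | tr -d ' ')

echo "$pairs $whole_lines"   # prints 3 6

rm -r "$demo"
```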

Q8. The subdirectory “apple” of your current working directory contains the files “apple.genome”, “apple.samples” and “apple.genes”. What would be the result of the command 'rmdir apple'?

  • All files containing the string “apple” in their names will be removed
  • None of these choices
  • The command will have no effect, since the directory is not empty
  • The “apple” directory and all of its content will be removed
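
The behaviour of `rmdir` on a directory that still holds files can be tried without risk in a temporary location. A minimal sketch, assuming a scratch directory that mirrors the three files named in Q8:

```shell
scratch=$(mktemp -d)
mkdir "$scratch/apple"
touch "$scratch/apple/apple.genome" \
      "$scratch/apple/apple.samples" \
      "$scratch/apple/apple.genes"

# rmdir only removes empty directories, so this attempt is refused.
if rmdir "$scratch/apple" 2>/dev/null; then
    outcome="removed"
else
    outcome="refused"
fi

remaining=$(ls "$scratch/apple" | wc -l | tr -d ' ')
echo "$outcome, $remaining files still present"

rm -r "$scratch"
```

Removing a directory together with its content takes `rm -r`, which is what the clean-up line at the end of the sketch uses.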

Q9. Suppose that you have two files, A and B, containing experiment data. What would be the sequence of outputs for the commands:

  • 3, 2, 2
  • 5, 2, 3
  • 3, 1, 3
  • 2, 4, 5

Q10. The current working directory contains four subdirectories named “apple”, “pear”, “peach” and “strawberry”, each with the following files: “genome”, “genes”, and “samples”. Which of the following commands would extract the top line from all of the “genes” files?

  • cat …/genes strawberry/genes | tail -1
  • head -1 …/genes strawberry/genes
  • less /g | head -1
  • cat …/genes strawberry/genes | grep -c 1

 

Module 2 Quiz

Q1. Which of the following strings cannot denote a DNA sequence:

  • APCTSYFPEITHI
  • AAAAAAAAA
  • AGCTACTACGAGCT
  • CCCCCCCCCC

Q2. How many lines does it take to specify: i) one fasta sequence? and ii) one fastq sequence? Select the best answer:

  • Fasta – 1 line; fastq – 4 lines
  • Fasta – a fasta header followed by any number of sequence lines; fastq – 4 lines
  • Fasta – any number of lines, including a fasta header; fastq – 2 lines
  • Fasta – 100 lines; fastq – 2 lines
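
The line layouts of the two formats translate directly into how records are counted on the command line. The sketch below uses two small hypothetical files and assumes the common four-line FASTQ layout (header, sequence, '+' separator, qualities).

```shell
data=$(mktemp -d)

# Hypothetical FASTA file: 2 records, the first wrapped over two sequence lines.
cat > "$data/demo.fa" <<'EOF'
>seq1
ACGTACGT
ACGT
>seq2
GGGCCC
EOF

# Hypothetical FASTQ file: 2 records of 4 lines each.
cat > "$data/demo.fq" <<'EOF'
@read1
ACGT
+
IIII
@read2
GGCC
+
IIII
EOF

# FASTA: one header line per record, whatever the number of sequence lines.
fasta_records=$(grep -c '^>' "$data/demo.fa")

# FASTQ: four lines per record, so the record count is the line count / 4.
fastq_lines=$(wc -l < "$data/demo.fq" | tr -d ' ')
fastq_records=$((fastq_lines / 4))

echo "$fasta_records $fastq_records"   # prints 2 2

rm -r "$data"
```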

Q3. Which of the following is incorrect:

  • The SAM format is used to represent alignments.
  • The BED format can be used to represent gene features.
  • SAMtools flagstats reports the total number of mapped reads.
  • The GTF format can be used to represent gene features.

Q4. Which of the following strings cannot denote a DNA sequence:

  • Soft clipping
  • Cut and paste
  • Hard clipping
  • Padding

Q5. What is the minimum number of columns that are sufficient to specify a BED format?

  • 1
  • 2
  • 3
  • 4

Q6. Which of the following represents the most accurate conversion into BED of the GTF record?

  • chr1 516 3312 genA.1 100 + 800 900 0 3 296,115,303 0,485,2494
  • chr1 515 3312 genA.1 + 515 3312 0 3 296,115,303 516,1001,3010
  • chr1 516 3312 genA + 516 3312 0 2 296,303 0,2494
  • chr1 515 3312 genA.1 100 + 515 3312 0 3 296,115,303 0,485,2494
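
The off-by-one differences between the options in Q6 come from the two coordinate systems: GTF positions are 1-based and inclusive, while BED positions are 0-based with an exclusive end. As a rough sketch, the `awk` one-liner below converts two hypothetical exon records (illustrative coordinates, not the quiz's GTF snippet, with made-up `exon1`/`exon2` names) into six-column BED lines.

```shell
# Two hypothetical tab-separated GTF exon records (9 fields each).
bed=$(printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr1 demo exon 516 811 . + . 'gene_id "genA"; transcript_id "genA.1";' \
    chr1 demo exon 1001 1115 . + . 'gene_id "genA"; transcript_id "genA.1";' |
  awk -F '\t' 'BEGIN { OFS = "\t" }
       # BED start = GTF start - 1; the end coordinate stays the same.
       { print $1, $4 - 1, $5, "exon" NR, 0, $7 }')

printf '%s\n' "$bed"
```

With this convention the length of a feature is simply end minus start in BED (811 - 515 = 296 for the first record), which is the value a BED12 block-size column would carry.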

Q7. Determine the number of genes, transcripts, exons per transcript, gene orientation (strand), and the length of the 5’-most exon(s) from the GTF snippet below. Select the correct answer.

  • Genes: 1; Transcripts: 2; Exons: 2,2; Strand: -; Length of 5’ exon(s): 2736, 2194.
  • Genes: 1; Transcripts: 2; Exons: 2,2; Strand: -; Length of 5’ exon(s): 2735, 2193.
  • Genes: 1; Transcripts: 1; Exons: 4; Strand: -; Length of 5’ exon(s): 2736.
  • Genes: 1; Transcripts: 4; Exons: 1,1,1,1; Strand: -; Length of 5’ exon(s): 2736, 1417, 2194, 795.

Q8. Which of the following is FALSE for the following read alignments:

  • R1 maps uniquely to the genome.
  • R2’s mate is unmapped.
  • R3 is unmapped.
  • The R1 alignment is the primary mapping (hit index 0) for that read.

Q9. For the alignment below, which statements are FALSE? The binary encoding for 97 is 0000 0110 0001 (base 2). Select all answers that apply.

  • The two mates are identical in sequence.
  • The alignment represents a potential PCR or optical duplicate.
  • The read and its mate are not properly aligned as a pair.
  • Both the read and its mate are mapped.
  • This is the first read in the pair.
  • The sequence of the read’s mate is reverse complemented in its alignment.
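
A SAM FLAG value such as 97 is a sum of individual bits, and each bit answers one yes/no question about the read. The sketch below decodes a flag with shell arithmetic; the bit values follow the SAM specification, while the short labels (`paired`, `mate_reverse`, and so on) are names made up for this example.

```shell
flag=97
set_bits=""

# bit:label pairs; the labels are illustrative shorthand for the SAM FLAG bits.
for entry in \
    "1:paired" "2:proper_pair" "4:unmapped" "8:mate_unmapped" \
    "16:reverse" "32:mate_reverse" "64:first_in_pair" "128:second_in_pair" \
    "256:secondary" "512:qc_fail" "1024:duplicate" "2048:supplementary"
do
    bit=${entry%%:*}
    name=${entry#*:}
    # A non-zero bitwise AND means this bit is set in the flag.
    if [ $((flag & bit)) -ne 0 ]; then
        set_bits="$set_bits $name"
    fi
done

set_bits=${set_bits# }
echo "$set_bits"   # prints: paired mate_reverse first_in_pair
```

Since 97 = 64 + 32 + 1, only three bits are set; every other property in the list, such as proper pairing or duplicate status, is off for this alignment.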

Q10. Files 'A.bed' and 'B.bed' contain the following sets of intervals

  • 5, 5, 5
  • 3, 4, 2
  • 9, 2, 2
  • 3, 2, 2

 

Module 3 Quiz

Q1. Which of the following statements is FALSE:

  • Differences in the genomes of individuals are strong contributors to their phenotypic variations.
  • Different versions of a gene resulting from genomic mutations are called alleles.
  • SNV refers to a Single Nucleotide Variant.
  • SNP refers to a Single Non-defined Polymorphism

Q2. Which of the following statements is FALSE:

  • The VCF format shows the changes in amino acid resulting from the nucleotide mutation, in column 3.
  • The VCF INFO lines describe characteristics of the variant, included in column 8.
  • The BCF format is a binary compressed version of VCF.
  • VCF stands for Variant Call Format.

Q3. What program can be used to generate a list of candidate sites of variation in an exome data set:

  • samtools
  • cd
  • bcftools
  • bedtools

Q4. In a comprehensive effort to study genome variation in a patient cohort, you sequence and call variants in the exome, whole genome shotgun and RNA-seq data from each patient. Which of the following is FALSE when comparing these three types of resources:

  • Exome sequencing comprehensively captures variants in the 3’ and 5’ UTRs of genes.
  • Exome sequencing can capture variants in a pre-defined set of coding exons and their immediate surrounding area.
  • Exome sequencing cannot determine variants in novel polymorphic alternative splicing events.
  • Exome sequencing captures fewer variants than whole genome sequencing.

Q5. Which of the following options can be used to allow bowtie2 to generate partial alignments?

  • --local
  • -D
  • --ignore-quals
  • --sensitive

Q6. Select the correct interpretation for the snippet of 'mpileup' output below.

  • Only site 2 shows potential variation;

    the alternate letter for site 2 is ‘.’;

    site 1 has 8 supporting reads, and site 2 has 16

  • Only site 2 shows potential variation;

    the alternate letter for site 2 is G;

    site 1 has 8 supporting reads, and site 2 has 16

  • Only site 2 shows potential variation;

    the alternate letter for site 2 is A;

    site 1 has 8 supporting reads, and site 2 has 16

  • Only site 2 shows potential variation;

    the alternate letter for site 2 is A;

    the alternate allele for site 2 is supported by 9 reads

Q7. Given the set of variants described in the VCF excerpt below, which of the following is FALSE?

  • Average mapping quality for variant 3 is 40
  • The sample contains only the alternate allele for variant 1
  • The sample contains only the alternate allele for variant 3
  • The sample contains both alleles for variant 2

Q8. What does the following code do:

  • Run bowtie2 with a set of single-end reads, reporting the best alignment only;

    then determine the number of matches on each genomic sequence

  • Run bowtie2 with a set of single-end reads, reporting up to 5 alignments per read; then determine the number of matches on each genomic sequence

  • Run bowtie2 with a set of paired-end reads, allowing for local matches;

    then report the numbers of alignments containing insertions and deletions, respectively;

  • Run bowtie2 with a set of paired-end reads, allowing up to 10 matches per read;

    then report the number of matches on each genomic sequence

Q9. What does the following snippet of code do NOT do:

  • Produce a 7-column intermediate mpileup file that is piped to ‘cut’
  • Report an empty column
  • Report in the intermediate mpileup output the qualities of all read bases aligned at that position
  • Require a sorted BAM file

Q10. What does the following code do NOT do:

  • Write the output to file out.vcf.gz
  • Report all candidate sites
  • Take input from the file in.vcf.gz
  • Take input from a VCF compressed file

Module 4 Quiz

Q1. Which of the following is FALSE:

  • Alternative splicing is a common phenomenon in both animals and plants.
  • The coding region within a protein-coding gene is used as the template for forming a protein.
  • A codon is a nucleotide triplet that is translated into one amino acid.
  • A human gene can express at most 12 splice variants.

Q2. Which of the following is FALSE about the organization of a eukaryotic gene:

  • Genes that have only one exon are not alternatively spliced
  • Some eukaryotic genes have a single exon
  • The length of the coding region in a transcript must be a multiple of 3
  • The length of an intron cannot be a multiple of 3

Q3. What programs could you use to align RNA-seq reads to: i) a reference genome, and ii) a transcript database?

  • cufflinks, bowtie
  • tophat, Rangi tunazoona kwenye upinde wa mvua huundwa na mwanga unaoakisi matone ya maji na kugawanyika katika urefu tofauti wa mawimbi.
  • tophat, bowtie
  • tophat, cufflinks

Q4. Which of the following is FALSE:

  • As measures of gene expression, RPKM is determined at the level of reads and FPKM is determined at the level of fragments.
  • FPKM stands for fragments-per-kilobase of cDNA sequence-per million reads.
  • The sum of the FPKM values of all genes in a sample is 1,000,000.
  • The sum of the FPKMs of all transcripts of a gene is equal to the gene’s expression level.
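
For readers who want to see the arithmetic behind the FPKM statements above, here is a minimal worked example. It uses the commonly quoted definition, fragments × 10^9 / (transcript length in bases × total mapped fragments), and all three input numbers are hypothetical.

```shell
fragments=500              # fragments mapped to the transcript (hypothetical)
length_bp=2000             # transcript length in bases (hypothetical)
total_fragments=10000000   # mapped fragments in the whole library (hypothetical)

# The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
fpkm=$(awk -v n="$fragments" -v len="$length_bp" -v total="$total_fragments" \
    'BEGIN { printf "%.2f", n * 1e9 / (len * total) }')

echo "$fpkm"   # prints 25.00
```

Tools such as cufflinks estimate these quantities with additional statistical modelling, so their reported values will generally not match this back-of-the-envelope calculation exactly.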

Q5. What programs could be used to i) assemble transcripts from RNA-seq reads, and ii) identify potentially novel transcripts and genes

  • cufflinks, cuffcompare
  • tophat, cuffcompare
  • tophat, bowtie
  • tophat, samtools

Q6. Which of the following is FALSE about the gene annotations in the following GTF snippet:

  • The two transcripts for gene MG051951 overlap on the genome.
  • It contains only one gene, MG051951.
  • Gene MG051951 has two transcripts, MT162897 and MT070533.
  • Transcript MT162897 has a single exon.

Q7. What does the following code NOT do:

  • Report spliced reads with at most 6 mismatches in the anchor site
  • Create the output in the /home/me/SRR100000 directory
  • Run multi-threaded, with 10 threads
  • Report only reads with 10 or fewer alignments on the genome

Q8. What does the following code NOT do:

  • Label cufflinks transcripts with the prefix ‘Test1’
  • Use the default reference transcript annotation to guide assembly
  • Run cufflinks to assemble transcripts
  • Create a soft link to the BAM read alignment file in the Test1 directory

Q9. Which of the following is NOT described in the following summary file produced by tophat:

  • 94.0% of the mate 2 reads were mapped
  • Of the mapped mate 1 reads, 11.7% had multiple matches on the genome
  • The library was strand-specific
  • Of the mapped mate 2 reads, 5.0% had multiple matches on the genome

Q10. Which of the following is NOT TRUE about the output below, obtained from a cuffdiff differential expression analysis:

  • Locus XLOC_000004 corresponds to gene AT1G01073
  • There are too many alignments for testing for differential expression at locus XLOC_000004
  • Locus XLOC_000042 corresponds to gene AT1G01580
  • There are not enough alignments for testing for differential expression at locus XLOC_000004

About Helen Bassey

Hello, I'm Helen, a blogger who is passionate about publishing content in the education niche. I believe that education is the key to personal and social development, and I want to share my knowledge and experience with learners of all ages and backgrounds. On my blog, you will find articles on topics such as learning strategies, online education, career guidance, and more. I also welcome feedback and suggestions from my readers, so feel free to leave a comment or contact me at any time. I hope you enjoy reading my blog and find it useful and inspiring.
