John158
Thread starter, 4 months ago
#1
Hello can anyone help with this? I don't really understand coding but just need to get through this month so I can be done with it.

Write a Python function named 'filter_seq' in the following code cell that takes a list of DNA sequences as an argument and returns a list containing only those sequences that pass the following two criteria:

The sequence contains only the nucleotide letters A, C, G or T, or their lowercase equivalents, and no ambiguous nucleotides (N or n).
The sequence must be exactly 72 nucleotides long.

In addition:

Your function must accept DNA sequences in the argument to be in lowercase, UPPERCASE or a mixture of both. All sequences that meet the criteria must be returned in UPPERCASE.
Your function must have a valid function docstring (any text is acceptable).


#my attempt. where it says indent there is one and the word is not in the code.

def filter_seq(dna_seqs):
(indent) for dna in dna_seqs: if len(dna) == 72 and 'N, n' not in dna:
(indent) print(filter_seq(test_seqs_N))
Last edited by John158; 4 months ago
0le
4 months ago
#2
(Original post by John158)
Code:
def filter_seq(dna_seqs):
    for dna in dna_seqs:
        if len(dna) == 72 and 'N, n' not in dna:
            print(filter_seq(test_seqs_N))
So I am guessing dna_seqs is a list, and you use a for loop to go through it. This is a good start. The question has four requirements:
(1) The sequence contains only the nucleotide letters A, C, G or T, (2) or their lowercase equivalents, (3) and no ambiguous nucleotides (N or n). (4) The sequence must be exactly 72 nucleotides long.
You've attempted requirements 3 and 4 but have not included requirement 1 or 2, so you need another two if statements (or one that uses the and operator) for those. Note also that 'N, n' not in dna checks for the literal substring 'N, n' (comma and space included), which a DNA string will never contain, so that test always passes. Check for 'N' and 'n' separately, or convert to upper case first and check only for 'N'.
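
As a small illustration of how the N test behaves (the example string is made up, not from the assignment): the in operator on strings looks for a substring, so 'N, n' not in dna looks for the literal text N, n including the comma and space. A sketch of the difference:

```python
# Hypothetical example sequence containing an ambiguous base.
dna = "ACGTNACGT"

# Looks for the literal substring "N, n" (comma and space included),
# which is not in the string, so this test passes even though N is present.
passes_original_test = 'N, n' not in dna

# Checking each letter separately catches the N.
passes_separate_test = 'N' not in dna and 'n' not in dna

print(passes_original_test)   # True
print(passes_separate_test)   # False
```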

Inside the final if block, you need to convert the dna strand to upper case. Assuming dna is just a string, use the .upper() method (the link below describes the .lower() method, but the idea is the same):
https://stackoverflow.com/questions/...ring-in-python
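
A quick illustration of .upper(), using a made-up mixed-case string. It returns a new string and leaves the original unchanged, so the result has to be stored or used:

```python
dna = "acGTtgCA"          # made-up example, not from the assignment
dna_upper = dna.upper()   # returns a new string; dna itself is unchanged

print(dna_upper)  # ACGTTGCA
print(dna)        # acGTtgCA
```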

Now we need a way to store the dna we want. Create an empty list in Python. This is done as follows:
Code:
myList = []
Put this line before the start of the for loop. Then, inside the last nested if statement, use the .append() method to add each matching dna item to the list, which we have called myList here: myList.append(dna). In words, this says to add dna to myList.

Create a docstring as described here:
https://www.datacamp.com/community/tutorials/docstrings-python
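
Putting the steps in this post together, one possible sketch of the whole function looks like this. It is only one way to do it: the name kept and the test data are made up, and the set comparison for the allowed letters is an assumption about how to check them, not something the assignment requires.

```python
def filter_seq(dna_seqs):
    """Return the valid 72-nucleotide sequences from dna_seqs, in upper case."""
    kept = []  # created once, before the loop
    for dna in dna_seqs:
        dna_upper = dna.upper()
        # Exactly 72 long, and every letter is one of A, C, G, T
        # (this also rules out N, since N is not in the allowed set).
        if len(dna_upper) == 72 and set(dna_upper) <= set("ACGT"):
            kept.append(dna_upper)
    return kept


# Made-up test data: one valid lowercase sequence, one with an N, one too short.
test_seqs = ["acgt" * 18, "ACGN" * 18, "ACGT"]
print(filter_seq(test_seqs))
```

Only the first test sequence should come back, converted to upper case.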
Last edited by 0le; 4 months ago
John158
Thread starter, 4 months ago
#3
Code:
def filter_seq(dna_seqs):
  for dna in dna_seqs: 
     if len(dna) == 72 and 'N, n' not in dna: 
      if dna.upper and dna.lower in dna:
           my_list = [] 
           my_list.append(dna) 
        print(filter_seq(dna))
Okay I was hoping it would be something like this but I think the formatting is probably wrong or I've done it wrong altogether.
0le
4 months ago
#4
Well, my_list is in the wrong place for starters. Think about what that is doing. Every time you enter the if statement, you assign the name my_list to a new empty list and then add one item to it. This repeats on every pass, so the list can never hold more than one entry. You need to create the empty list before the for loop.
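
To see the difference, here is a small sketch with made-up strings (items, inside and outside are hypothetical names). Creating the list inside the loop resets it on every pass; creating it before the loop lets it accumulate:

```python
items = ["AAA", "CCC", "GGG"]  # made-up data

# List created inside the loop: it is reset on every pass.
for item in items:
    inside = []
    inside.append(item)

# List created once, before the loop: it accumulates.
outside = []
for item in items:
    outside.append(item)

print(inside)   # ['GGG']
print(outside)  # ['AAA', 'CCC', 'GGG']
```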

Secondly, your second if statement is incorrect. First, you missed the brackets, (), after each method. dna.upper() creates a string, let's call it dnaUpper, which is the same as dna but with all characters in upper case. dna.lower() creates a string, let's call it dnaLower, which is the same as dna but with all characters in lower case.

Without the brackets, dna.upper and dna.lower are the methods themselves rather than strings, and dna.lower in dna raises a TypeError. With the brackets, Python reads the condition as dnaUpper and (dnaLower in dna): dnaUpper is true for any non-empty string, so the condition ends up true only when dna is already entirely lower case. Either way, it does not check what the question asks for.
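
It can help to run the condition on small made-up strings and see what Python actually does with it. Without the brackets, dna.lower is a method object rather than a string, so using it with in raises a TypeError. With the brackets, Python reads the condition as dna.upper() and (dna.lower() in dna):

```python
dna = "acGT"  # made-up mixed-case example

# Without brackets: the left operand of `in` is a method, not a string.
try:
    dna.lower in dna
    raised = False
except TypeError:
    raised = True

# With brackets: parsed as dna.upper() and (dna.lower() in dna).
mixed_result = bool(dna.upper() and dna.lower() in dna)
lower_result = bool("acgt".upper() and "acgt".lower() in "acgt")

print(raised)        # True
print(mixed_result)  # False
print(lower_result)  # True
```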

You only need to use the .upper() method. It does not need to be in the if condition. It needs to be inside the body of the if statement, that is, the indented block below it:

Code:
if a > b:
    # This is the body of the if statement.
    # If the condition is true, the lines here are run next.
    c = c + 1
You are still not checking whether dna contains only A, C, G or T (or a, c, g, t). Maybe that is already implicit in your particular problem once you confirm dna does not contain N or n?
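
On that last question: excluding N on its own would still let through any other stray character. A minimal sketch of checking the allowed letters directly, using made-up strings (only_acgt is a hypothetical helper name, not part of the assignment):

```python
def only_acgt(dna):
    """Return True if dna contains only A, C, G, T in either case."""
    return set(dna.upper()) <= set("ACGT")


print(only_acgt("acGT"))   # True
print(only_acgt("ACGN"))   # False
print(only_acgt("ACGX"))   # False, although it contains no N
```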

I am not convinced that your print statement is correct either. As written, it calls filter_seq again from inside the function itself. What you want is either to print my_list or to use a return statement, so that when someone calls the function, my_list is returned.

Do you want to 1) print the list, 2) print out each valid dna strand within the list, or 3) just return a list to the user and let them decide? The question in your initial post says the function returns a list, which points to option 3.
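
A common pattern is to have the function return the list and leave any printing to the caller, which keeps all three options open. A sketch with a made-up stand-in function (keep_short is hypothetical and unrelated to the assignment's criteria):

```python
def keep_short(words):
    """Return the words that have at most three letters."""
    result = []
    for word in words:
        if len(word) <= 3:
            result.append(word)
    return result  # hand the list back instead of printing inside the function


short = keep_short(["cat", "horse", "ox"])
print(short)          # option 1: print the whole list
for word in short:    # option 2: print each item
    print(word)
```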
Last edited by 0le; 4 months ago